git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCH 0/2] fallback to interpreter if JIT fails with pcre
@ 2018-12-09 23:00 Carlo Marcelo Arenas Belón
  2018-12-09 23:00 ` [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1 Carlo Marcelo Arenas Belón
  2018-12-09 23:00 ` [RFC PATCH 2/2] grep: fallback to interpreter if JIT fails with pcre2 Carlo Marcelo Arenas Belón
  0 siblings, 2 replies; 13+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2018-12-09 23:00 UTC (permalink / raw)
  To: git; +Cc: avarab

while testing in NetBSD 8, was surprised to find that most test cases
using PCRE2 where failing with some cryptic error from git :

  fatal: Couldn't JIT the PCRE2 pattern '$PATTERN', got '-48'

interestingly enough, using a JIT enabled PCRE1 library (not the default)
will show a similar error but a different error code.

the underlying problem is the same though; NetBSD includes PAX support
which restricts the use of memory that is both writeable and executable
and that prevents the JIT to create a compiled expression to jump into,
and while the "fix" for NetBSD is simple it would seem the user experience
could be improved if instead of aborting, git will instead return the matches
using the slower interpreter (which seem to be also the recomendation from
the library developers)

it is important to note that the problem is not unique to NetBSD and had
reproduced it in OpenBSD where working around the issue is more complicated
as WˆX exceptions require a filesystem mount option, and I can see it being
problematic with linux (as shown by the open bug[1] and the development of
an alternative allocator to workaround the issue with seLinux) and with
macOS (specially versions older than 10.14) where there are restrictions to
the number of maps allowed with those flags.

I am also curious if expanding NO_LIBPCRE1_JIT as an option to disable JIT
with PCRE2 (with a different name) might be worth pursuing? as well as some
ways to narrow the failures that will trigger the fallback, but the later
is likely to need library changes which might not be possible with the old
version anyway.

[1] https://bugs.exim.org/show_bug.cgi?id=1749

Carlo Marcelo Arenas Belón (2):
  grep: fallback to interpreter if JIT fails with pcre1
  grep: fallback to interpreter if JIT fails with pcre2

 Makefile | 12 ++++++------
 grep.c   | 13 +++++++++++--
 2 files changed, 17 insertions(+), 8 deletions(-)

-- 
2.20.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-09 23:00 [RFC PATCH 0/2] fallback to interpreter if JIT fails with pcre Carlo Marcelo Arenas Belón
@ 2018-12-09 23:00 ` Carlo Marcelo Arenas Belón
  2018-12-09 23:51   ` Ævar Arnfjörð Bjarmason
  2018-12-10  6:48   ` Junio C Hamano
  2018-12-09 23:00 ` [RFC PATCH 2/2] grep: fallback to interpreter if JIT fails with pcre2 Carlo Marcelo Arenas Belón
  1 sibling, 2 replies; 13+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2018-12-09 23:00 UTC (permalink / raw)
  To: git; +Cc: avarab

JIT support was added to 8.20 but the interface we rely on is only
enabled after 8.32 so try to make the message clearer.

in systems where there are restrictions against creating executable
pages programatically (like OpenBSD, NetBSD, macOS or seLinux) JIT
will fail, resulting in a error message to the user.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 Makefile | 12 ++++++------
 grep.c   |  6 ++++++
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index 1a44c811aa..62b0cb6ee6 100644
--- a/Makefile
+++ b/Makefile
@@ -32,14 +32,14 @@ all::
 # USE_LIBPCRE is a synonym for USE_LIBPCRE2, define USE_LIBPCRE1
 # instead if you'd like to use the legacy version 1 of the PCRE
 # library. Support for version 1 will likely be removed in some future
-# release of Git, as upstream has all but abandoned it.
+# release of Git, as upstream is focusing all development for new
+# features in the newer version instead.
 #
 # When using USE_LIBPCRE1, define NO_LIBPCRE1_JIT if the PCRE v1
-# library is compiled without --enable-jit. We will auto-detect
-# whether the version of the PCRE v1 library in use has JIT support at
-# all, but we unfortunately can't auto-detect whether JIT support
-# hasn't been compiled in in an otherwise JIT-supporting version. If
-# you have link-time errors about a missing `pcre_jit_exec` define
+# library is newer than 8.32 but compiled without --enable-jit or
+# you want to disable JIT
+#
+# If you have link-time errors about a missing `pcre_jit_exec` define
 # this, or recompile PCRE v1 with --enable-jit.
 #
 # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are
diff --git a/grep.c b/grep.c
index 4db1510d16..5ccc0421a1 100644
--- a/grep.c
+++ b/grep.c
@@ -405,6 +405,12 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 		die("%s", error);
 
 #ifdef GIT_PCRE1_USE_JIT
+	if (p->pcre1_extra_info &&
+		!(p->pcre1_extra_info->flags & PCRE_EXTRA_EXECUTABLE_JIT)) {
+		/* JIT failed so fallback to the interpreter */
+		p->pcre1_jit_on = 0;
+		return;
+	}
 	pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
 	if (p->pcre1_jit_on == 1) {
 		p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024);
-- 
2.20.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH 2/2] grep: fallback to interpreter if JIT fails with pcre2
  2018-12-09 23:00 [RFC PATCH 0/2] fallback to interpreter if JIT fails with pcre Carlo Marcelo Arenas Belón
  2018-12-09 23:00 ` [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1 Carlo Marcelo Arenas Belón
@ 2018-12-09 23:00 ` Carlo Marcelo Arenas Belón
  2018-12-10  7:00   ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2018-12-09 23:00 UTC (permalink / raw)
  To: git; +Cc: avarab

starting with 10.23, and as a side effect of the work for bug1749[1] (grep
-P crash with seLinux), pcre2grep was modified to ignore any errors from
pcre2_jit_compile so the interpreter could be used as a fallback

[1] https://bugs.exim.org/show_bug.cgi?id=1749

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 grep.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index 5ccc0421a1..c751c8cc74 100644
--- a/grep.c
+++ b/grep.c
@@ -530,8 +530,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
 	if (p->pcre2_jit_on == 1) {
 		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
-		if (jitret)
-			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+		if (jitret) {
+			/* JIT failed so fallback to the interpreter */
+			p->pcre2_jit_on = 0;
+			return;
+		}
 
 		/*
 		 * The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
-- 
2.20.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-09 23:00 ` [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1 Carlo Marcelo Arenas Belón
@ 2018-12-09 23:51   ` Ævar Arnfjörð Bjarmason
  2018-12-10  0:42     ` brian m. carlson
  2018-12-10  6:54     ` Junio C Hamano
  2018-12-10  6:48   ` Junio C Hamano
  1 sibling, 2 replies; 13+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-12-09 23:51 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, pcre-dev


On Sun, Dec 09 2018, Carlo Marcelo Arenas Belón wrote:

[+CC pcre-dev]

> JIT support was added to 8.20 but the interface we rely on is only
> enabled after 8.32 so try to make the message clearer.
>
> in systems where there are restrictions against creating executable
> pages programatically (like OpenBSD, NetBSD, macOS or seLinux) JIT
> will fail, resulting in a error message to the user.
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  Makefile | 12 ++++++------
>  grep.c   |  6 ++++++
>  2 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 1a44c811aa..62b0cb6ee6 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -32,14 +32,14 @@ all::
>  # USE_LIBPCRE is a synonym for USE_LIBPCRE2, define USE_LIBPCRE1
>  # instead if you'd like to use the legacy version 1 of the PCRE
>  # library. Support for version 1 will likely be removed in some future
> -# release of Git, as upstream has all but abandoned it.
> +# release of Git, as upstream is focusing all development for new
> +# features in the newer version instead.

I think whatever we do here it makes sense to split this into its own
patch, since it doesn't have to do with this fallback mechanism.

FWIW I was trying to word this in some way that very briefly described
the v1 v.s. v2 situation. Just saying "new features" doesn't quite
capture it, e.g. some bugs in v1 are closed with some resolution like
"this isn't trivial to fix, use v2 instead".

>  # When using USE_LIBPCRE1, define NO_LIBPCRE1_JIT if the PCRE v1
> -# library is compiled without --enable-jit. We will auto-detect
> -# whether the version of the PCRE v1 library in use has JIT support at
> -# all, but we unfortunately can't auto-detect whether JIT support
> -# hasn't been compiled in in an otherwise JIT-supporting version. If
> -# you have link-time errors about a missing `pcre_jit_exec` define
> +# library is newer than 8.32 but compiled without --enable-jit or
> +# you want to disable JIT
> +#
> +# If you have link-time errors about a missing `pcre_jit_exec` define
>  # this, or recompile PCRE v1 with --enable-jit.
>  #
>  # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are
> diff --git a/grep.c b/grep.c
> index 4db1510d16..5ccc0421a1 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -405,6 +405,12 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
>  		die("%s", error);
>
>  #ifdef GIT_PCRE1_USE_JIT
> +	if (p->pcre1_extra_info &&
> +		!(p->pcre1_extra_info->flags & PCRE_EXTRA_EXECUTABLE_JIT)) {
> +		/* JIT failed so fallback to the interpreter */
> +		p->pcre1_jit_on = 0;
> +		return;
> +	}

Obviously this & what you have in 2/2 needs to be fixed in some way.

Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
the like work on those setup, or not? I.e. is this something upstream
can/is likely to fix eventually?

Are there cases where we can JIT, but fail for some entirely unrelated
reason, and are now hiding the error?

Are we mixing a condition where one some OS's or OS versions this just
won't work at all, and thus maybe should be something turned on in
config.mak.uname, v.s. e.g. SELinux where presumably it'll dynamically
change.

I'm inclined to suggest that we should have another ifdef here for "if
JIT fails I'd like it to die", so that e.g. packages I build (for
internal use) don't silently slow down in the future, only for me to
find some months later that someone enabled an overzealous SELinux
policy and we swept this under the rug.

But maybe that's just dumb for some reason and we always need to do this
dynamically...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-09 23:51   ` Ævar Arnfjörð Bjarmason
@ 2018-12-10  0:42     ` brian m. carlson
  2018-12-10  1:25       ` Carlo Arenas
                         ` (2 more replies)
  2018-12-10  6:54     ` Junio C Hamano
  1 sibling, 3 replies; 13+ messages in thread
From: brian m. carlson @ 2018-12-10  0:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Carlo Marcelo Arenas Belón, git, pcre-dev

[-- Attachment #1: Type: text/plain, Size: 2142 bytes --]

On Mon, Dec 10, 2018 at 12:51:01AM +0100, Ævar Arnfjörð Bjarmason wrote:
> Obviously this & what you have in 2/2 needs to be fixed in some way.
> 
> Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
> the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
> the like work on those setup, or not? I.e. is this something upstream
> can/is likely to fix eventually?

From the cover letter (but without testing), it seems like it would
probably be fine to first map the pages read-write to write the code and
then, once that's done, to map them read-executable. I know JIT
compilation does work on the BSDs, so presumably that's the technique to
make it do so.

Both versions of PCRE map pages both write and executable at the same
time, which is presumably where things go wrong. I assume it can be
fixed, but whether that's easy in the context of PCRE, I wouldn't know.

> Are we mixing a condition where one some OS's or OS versions this just
> won't work at all, and thus maybe should be something turned on in
> config.mak.uname, v.s. e.g. SELinux where presumably it'll dynamically
> change.

Considering that some Linux users use PaX kernels with standard
distributions and that most BSD kernels can be custom-compiled with a
variety of options enabled or disabled, I think this is something we
should detect dynamically.

> I'm inclined to suggest that we should have another ifdef here for "if
> JIT fails I'd like it to die", so that e.g. packages I build (for
> internal use) don't silently slow down in the future, only for me to
> find some months later that someone enabled an overzealous SELinux
> policy and we swept this under the rug.

My view is that JIT is a nice performance optimization, but it's
optional. I honestly don't think it should even be exposed through the
API: if it works, then things are faster, and if it doesn't, then
they're not. I don't see the value in an option for causing things to be
broken if someone improves the security of the system.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-10  0:42     ` brian m. carlson
@ 2018-12-10  1:25       ` Carlo Arenas
  2018-12-10  6:42       ` Junio C Hamano
  2018-12-10  8:24       ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 13+ messages in thread
From: Carlo Arenas @ 2018-12-10  1:25 UTC (permalink / raw)
  To: sandals, avarab, git, pcre-dev

On Sun, Dec 9, 2018 at 4:42 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On Mon, Dec 10, 2018 at 12:51:01AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > Obviously this & what you have in 2/2 needs to be fixed in some way.
> >
> > Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
> > the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
> > the like work on those setup, or not? I.e. is this something upstream
> > can/is likely to fix eventually?
>
> From the cover letter (but without testing), it seems like it would
> probably be fine to first map the pages read-write to write the code and
> then, once that's done, to map them read-executable. I know JIT
> compilation does work on the BSDs, so presumably that's the technique to
> make it do so.

and that has been implemented (sljitProtExecAllocator.c) as part of the work
triggered by the bug I linked about [1], deep inside sljit (which is
what pcre uses for JIT)

the code AS-IS wouldn't compile for the BSD[2] but that is easy to fix
and sure works as expected but I am under the impression that is not
something that can be considered as a solution as explained by the open issues
described with crashes after a fork()

note that changing the map from read-write to executable will be
prevented by the
same policy so you have to create 2 maps (and therefore a backing file) and I
don't think there is a way to solve that in a foolproof way in a
library which is why
I mentioned more work might be needed to define the right interfaces
so it can be
solved by the application, and that is unlikely to happen with the old library.

Carlo

[1] https://bugs.exim.org/show_bug.cgi?id=1749
[2] https://bugs.exim.org/show_bug.cgi?id=2155

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-10  0:42     ` brian m. carlson
  2018-12-10  1:25       ` Carlo Arenas
@ 2018-12-10  6:42       ` Junio C Hamano
  2018-12-10  8:24       ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-12-10  6:42 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Ævar Arnfjörð Bjarmason,
	Carlo Marcelo Arenas Belón, git, pcre-dev

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> Considering that some Linux users use PaX kernels with standard
> distributions and that most BSD kernels can be custom-compiled with a
> variety of options enabled or disabled, I think this is something we
> should detect dynamically.
> ...
> My view is that JIT is a nice performance optimization, but it's
> optional. I honestly don't think it should even be exposed through the
> API: if it works, then things are faster, and if it doesn't, then
> they're not. I don't see the value in an option for causing things to be
> broken if someone improves the security of the system.

I agree to both of these two points.  Thanks for a thoughtful
discussion, all.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-09 23:00 ` [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1 Carlo Marcelo Arenas Belón
  2018-12-09 23:51   ` Ævar Arnfjörð Bjarmason
@ 2018-12-10  6:48   ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-12-10  6:48 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, avarab

Carlo Marcelo Arenas Belón  <carenas@gmail.com> writes:

> JIT support was added to 8.20 but the interface we rely on is only
> enabled after 8.32 so try to make the message clearer.

Could you add some word before 8.20 and 8.32 (e.g. "pcre library
version 8.20", if that is what you are referring to, and if 8.32 is
also taken from the same context, adding such phrase to clarify the
context only to 8.20 is sufficient)?

> in systems where there are restrictions against creating executable
> pages programatically (like OpenBSD, NetBSD, macOS or seLinux) JIT
> will fail, resulting in a error message to the user.
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---

The change to the code looked sensible.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-09 23:51   ` Ævar Arnfjörð Bjarmason
  2018-12-10  0:42     ` brian m. carlson
@ 2018-12-10  6:54     ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-12-10  6:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Carlo Marcelo Arenas Belón, git, pcre-dev

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> @@ -32,14 +32,14 @@ all::
>>  # USE_LIBPCRE is a synonym for USE_LIBPCRE2, define USE_LIBPCRE1
>>  # instead if you'd like to use the legacy version 1 of the PCRE
>>  # library. Support for version 1 will likely be removed in some future
>> -# release of Git, as upstream has all but abandoned it.
>> +# release of Git, as upstream is focusing all development for new
>> +# features in the newer version instead.
>
> I think whatever we do here it makes sense to split this into its own
> patch, since it doesn't have to do with this fallback mechanism.
>
> FWIW I was trying to word this in some way that very briefly described
> the v1 v.s. v2 situation. Just saying "new features" doesn't quite
> capture it, e.g. some bugs in v1 are closed with some resolution like
> "this isn't trivial to fix, use v2 instead".

I actually think we should remove the paragraph that says "Support
for version 1 will likely to be removed...", as I do not see how it
helps the users.  If they have both available, on the day they hear
that we are planning to remove pcre1 support and realize that they
need to plan to upgrade to the first version of Git that drops pcre1
support, they can switch to pcre2 just fine, so telling them about
our future that we do not even know definite plan yet does not help
them.  It is quite unlikely, given that "upstream has all but
abandoned it", that there only is pcre1 support without pcre2 for an
obscure platform, and even if there are such platforms, the users on
them won't have much choice.  Discouraging them from using pcre1
would not help them make better choices anyway.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 2/2] grep: fallback to interpreter if JIT fails with pcre2
  2018-12-09 23:00 ` [RFC PATCH 2/2] grep: fallback to interpreter if JIT fails with pcre2 Carlo Marcelo Arenas Belón
@ 2018-12-10  7:00   ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-12-10  7:00 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, avarab

Carlo Marcelo Arenas Belón  <carenas@gmail.com> writes:

> starting with 10.23, and as a side effect of the work for bug1749[1] (grep
> -P crash with seLinux), pcre2grep was modified to ignore any errors from
> pcre2_jit_compile so the interpreter could be used as a fallback

That may (or may not---I do not know and I do not particularly feel
the need to know or care about pcre2grep that is somebody else's
project) describe what happened in the pcre2grep project, but it
solicits a "so what?" response without the rest of the sentence you
omitted.

I am guessing that the missing end of the sentence is "we should
follow suit because ...", i.e. something like

    Starting from 10.23 [of what??? pcre2grep's version, libpcre's
    version, or something else???], pcre2grep falls back to
    interpreted pcre when JIT compilation fails.  We should follow
    suit in "git grep", because ...




> [1] https://bugs.exim.org/show_bug.cgi?id=1749
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  grep.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index 5ccc0421a1..c751c8cc74 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -530,8 +530,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  	pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
>  	if (p->pcre2_jit_on == 1) {
>  		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
> -		if (jitret)
> -			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
> +		if (jitret) {
> +			/* JIT failed so fallback to the interpreter */
> +			p->pcre2_jit_on = 0;
> +			return;
> +		}
>  
>  		/*
>  		 * The pcre2_config(PCRE2_CONFIG_JIT, ...) call just

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-10  0:42     ` brian m. carlson
  2018-12-10  1:25       ` Carlo Arenas
  2018-12-10  6:42       ` Junio C Hamano
@ 2018-12-10  8:24       ` Ævar Arnfjörð Bjarmason
  2018-12-11 20:11         ` Carlo Arenas
  2 siblings, 1 reply; 13+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-12-10  8:24 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Carlo Marcelo Arenas Belón, git, pcre-dev


On Mon, Dec 10 2018, brian m. carlson wrote:

> On Mon, Dec 10, 2018 at 12:51:01AM +0100, Ævar Arnfjörð Bjarmason wrote:
>> Obviously this & what you have in 2/2 needs to be fixed in some way.
>>
>> Is the issue on SELinux, OpenBSD, NetBSD etc. *how* PCRE is creating the
>> the JIT'd code? I.e. presumably Google Chrome's JIT engine, Java JIT and
>> the like work on those setup, or not? I.e. is this something upstream
>> can/is likely to fix eventually?
>
> From the cover letter (but without testing), it seems like it would
> probably be fine to first map the pages read-write to write the code and
> then, once that's done, to map them read-executable. I know JIT
> compilation does work on the BSDs, so presumably that's the technique to
> make it do so.
>
> Both versions of PCRE map pages both write and executable at the same
> time, which is presumably where things go wrong. I assume it can be
> fixed, but whether that's easy in the context of PCRE, I wouldn't know.
>
>> Are we mixing a condition where one some OS's or OS versions this just
>> won't work at all, and thus maybe should be something turned on in
>> config.mak.uname, v.s. e.g. SELinux where presumably it'll dynamically
>> change.
>
> Considering that some Linux users use PaX kernels with standard
> distributions and that most BSD kernels can be custom-compiled with a
> variety of options enabled or disabled, I think this is something we
> should detect dynamically.

Right. I'm asking whether we're mixing up cases where it can always be
detected at compile-time on some systems v.s. cases where it'll
potentially change at runtime.

>> I'm inclined to suggest that we should have another ifdef here for "if
>> JIT fails I'd like it to die", so that e.g. packages I build (for
>> internal use) don't silently slow down in the future, only for me to
>> find some months later that someone enabled an overzealous SELinux
>> policy and we swept this under the rug.
>
> My view is that JIT is a nice performance optimization, but it's
> optional. I honestly don't think it should even be exposed through the
> API: if it works, then things are faster, and if it doesn't, then
> they're not. I don't see the value in an option for causing things to be
> broken if someone improves the security of the system.

For many users that's definitely the case, but for others that's like
saying a RDBMS is still going to be functional if the "ORDER BY"
function degrades to bubblesort. The JIT improves performance my
multi-hundred percents sometimes, so some users (e.g. me) rely on that
not being silently degraded.

So I'm wondering if we can have something like:

    if (!jit)
        if (must_have_jit)
            BUG(...); // Like currently
        else
            fallback(); // new behavior

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-10  8:24       ` Ævar Arnfjörð Bjarmason
@ 2018-12-11 20:11         ` Carlo Arenas
  2018-12-11 20:51           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 13+ messages in thread
From: Carlo Arenas @ 2018-12-11 20:11 UTC (permalink / raw)
  To: avarab; +Cc: sandals, git, pcre-dev

On Mon, Dec 10, 2018 at 12:24 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Mon, Dec 10 2018, brian m. carlson wrote:
> > Considering that some Linux users use PaX kernels with standard
> > distributions and that most BSD kernels can be custom-compiled with a
> > variety of options enabled or disabled, I think this is something we
> > should detect dynamically.
>
> Right. I'm asking whether we're mixing up cases where it can always be
> detected at compile-time on some systems v.s. cases where it'll
> potentially change at runtime.

the closer we come to a system specific issues is with macOS where the
compiler (in some newer versions) is allocating the memory using the
MAP_JIT flag, which seems was originally meant to be only used in iOS
and has the strange characteristic of failing the mmap for versions
older than 10.14 if it was called more than once.

IMHO as brian pointed out, this is better done at runtime.

> >> I'm inclined to suggest that we should have another ifdef here for "if
> >> JIT fails I'd like it to die", so that e.g. packages I build (for
> >> internal use) don't silently slow down in the future, only for me to
> >> find some months later that someone enabled an overzealous SELinux
> >> policy and we swept this under the rug.
> >
> > My view is that JIT is a nice performance optimization, but it's
> > optional. I honestly don't think it should even be exposed through the
> > API: if it works, then things are faster, and if it doesn't, then
> > they're not. I don't see the value in an option for causing things to be
> > broken if someone improves the security of the system.
>
> For many users that's definitely the case, but for others that's like
> saying a RDBMS is still going to be functional if the "ORDER BY"
> function degrades to bubblesort. The JIT improves performance my
> multi-hundred percents sometimes, so some users (e.g. me) rely on that
> not being silently degraded.

the opposite is also true, specially considering that some old
versions of pcre result in a segfault instead of an error message and
therefore since there is no way to disable JIT, the only option left
is not to use `git grep -P` (or the equivalent git log call)

> So I'm wondering if we can have something like:
>
>     if (!jit)
>         if (must_have_jit)
>             BUG(...); // Like currently
>         else
>             fallback(); // new behavior

I am wondering if something like a `git doctor` command might be an
interesting alternative to this.

This way we could (for ex: in NetBSD) give the user a hint of what to
do to make their git grep -P faster when we detect we are running the
fallback, and might be useful as well to provide hints for
optimizations that could be used in other cases (probably even
depending on the size of the git repository)

For your use case, you just need to add a crontab that will trigger an
alarm if this command ever mentions PCRE

Carlo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
  2018-12-11 20:11         ` Carlo Arenas
@ 2018-12-11 20:51           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 13+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-12-11 20:51 UTC (permalink / raw)
  To: Carlo Arenas; +Cc: sandals, git, pcre-dev


On Tue, Dec 11 2018, Carlo Arenas wrote:

> On Mon, Dec 10, 2018 at 12:24 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> On Mon, Dec 10 2018, brian m. carlson wrote:
>> > Considering that some Linux users use PaX kernels with standard
>> > distributions and that most BSD kernels can be custom-compiled with a
>> > variety of options enabled or disabled, I think this is something we
>> > should detect dynamically.
>>
>> Right. I'm asking whether we're mixing up cases where it can always be
>> detected at compile-time on some systems v.s. cases where it'll
>> potentially change at runtime.
>
> the closer we come to a system specific issues is with macOS where the
> compiler (in some newer versions) is allocating the memory using the
> MAP_JIT flag, which seems was originally meant to be only used in iOS
> and has the strange characteristic of failing the mmap for versions
> older than 10.14 if it was called more than once.
>
> IMHO as brian pointed out, this is better done at runtime.

Sure. Just something I was wondering since it wasn't clear from the
patch. Makes sense, if it's always runtime (or not worth the effort to
divide the two) let's do that.

>> >> I'm inclined to suggest that we should have another ifdef here for "if
>> >> JIT fails I'd like it to die", so that e.g. packages I build (for
>> >> internal use) don't silently slow down in the future, only for me to
>> >> find some months later that someone enabled an overzealous SELinux
>> >> policy and we swept this under the rug.
>> >
>> > My view is that JIT is a nice performance optimization, but it's
>> > optional. I honestly don't think it should even be exposed through the
>> > API: if it works, then things are faster, and if it doesn't, then
>> > they're not. I don't see the value in an option for causing things to be
>> > broken if someone improves the security of the system.
>>
>> For many users that's definitely the case, but for others that's like
>> saying a RDBMS is still going to be functional if the "ORDER BY"
>> function degrades to bubblesort. The JIT improves performance my
>> multi-hundred percents sometimes, so some users (e.g. me) rely on that
>> not being silently degraded.
>
> the opposite is also true, specially considering that some old
> versions of pcre result in a segfault instead of an error message and
> therefore since there is no way to disable JIT, the only option left
> is not to use `git grep -P` (or the equivalent git log call)

Right, of course it segfaulting is a bug...

>> So I'm wondering if we can have something like:
>>
>>     if (!jit)
>>         if (must_have_jit)
>>             BUG(...); // Like currently
>>         else
>>             fallback(); // new behavior
>
> I am wondering if something like a `git doctor` command might be an
> interesting alternative to this.
>
> This way we could (for ex: in NetBSD) give the user a hint of what to
> do to make their git grep -P faster when we detect we are running the
> fallback, and might be useful as well to provide hints for
> optimizations that could be used in other cases (probably even
> depending on the size of the git repository)

Such a command has been discussed before on-list. I think it's a good
idea for the more fuzzy things like optimization suggests, but for the
case of expecting something at compile-time where not having that at
runtime is a boolean state it's nicer to just die with a BUG(...).

> For your use case, you just need to add a crontab that will trigger an
> alarm if this command ever mentions PCRE

...The reason I'd like it to die is because it neatly and naturally
integrates with all existing test infrastructure I have. I.e. build
package, run stress tests on all sorts of machines, see that it passes,
and SELinux isn't ruining it or whatever.

I know this works now (I always get PCRE v2 JIT) because it doesn't die
or segfault. I'd like not to have it regress to having worse
performance.

Having a cronjob to test for "does PCRE v2 JIT still work?" is not as
easy & isn't a drop-in solution.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-12-11 20:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-09 23:00 [RFC PATCH 0/2] fallback to interpreter if JIT fails with pcre Carlo Marcelo Arenas Belón
2018-12-09 23:00 ` [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1 Carlo Marcelo Arenas Belón
2018-12-09 23:51   ` Ævar Arnfjörð Bjarmason
2018-12-10  0:42     ` brian m. carlson
2018-12-10  1:25       ` Carlo Arenas
2018-12-10  6:42       ` Junio C Hamano
2018-12-10  8:24       ` Ævar Arnfjörð Bjarmason
2018-12-11 20:11         ` Carlo Arenas
2018-12-11 20:51           ` Ævar Arnfjörð Bjarmason
2018-12-10  6:54     ` Junio C Hamano
2018-12-10  6:48   ` Junio C Hamano
2018-12-09 23:00 ` [RFC PATCH 2/2] grep: fallback to interpreter if JIT fails with pcre2 Carlo Marcelo Arenas Belón
2018-12-10  7:00   ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).