git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Jeff King" <peff@peff.net>,
	"Carlo Marcelo Arenas Belón" <carenas@gmail.com>
Subject: Re: [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator
Date: Wed, 10 Feb 2021 13:38:37 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.2102101338220.29765@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <20210204210556.25242-9-avarab@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4559 bytes --]

Hi Ævar,

ACK!

And thank you for this patch,
Dscho

On Thu, 4 Feb 2021, Ævar Arnfjörð Bjarmason wrote:

> Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
> allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
> functions for allocation. We'll now use it for all PCREv2 allocations.
>
> The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
> issue is because it managed to target pretty much the allocation freed
> via free(), as opposed to by a pcre2_*free() function. I.e. the
> pcre2_maketables() and pcre2_maketables_free() pair. For most of the
> rest we continued allocating with stock malloc() inside PCREv2 itself,
> but didn't segfault because we'd use its corresponding free().
>
> In a preceding commit of mine I changed the free() to
> pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
> far as fixing the segfault goes we could revert 513f2b0bbd4. But then
> we wouldn't use the desired allocator, let's just use it instead.
>
> Before this patch we'd on e.g.:
>
>     grep --threads=1 -iP æ.*var.*xyz
>
> Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
> corresponding free() call. Now it's 12 calls to each. This can be
> observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.
>
> Reading the history of how this bug got introduced it wasn't present
> in Johannes's original patch[1] to fix the issue.
>
> My reading of that thread is that the approach the follow-up patches
> to Johannes's original pursued were based on misunderstanding of how
> the PCREv2 API works. In particular this part of [2]:
>
>     "most of the time (like when using UTF-8) the chartable (and
>     therefore the global context) is not needed (even when using
>     alternate allocators)"
>
> That's simply not how PCREv2 memory allocation works. It's easy to see
> how the misunderstanding came about. It's because (as noted above) the
> issue was noticed because of our use of free() in our own grep.c for
> freeing the memory allocated by pcre2_maketables().
>
> Thus the misunderstanding that PCREv2's compile context is something
> only needed for pcre2_maketables(), and e.g. an aborted earlier
> attempt[3] to only set it up when we ourselves called
> pcre2_maketables().
>
> That's not what PCREv2's compile context is. To quote PCREv2's
> documentation:
>
>     "This context just contains pointers to (and data for) external
>     memory management functions that are called from several places in
>     the PCRE2 library."
>
> Thus the failed attempts to go down the route of only creating the
> general context in cases where we ourselves call pcre2_maketables(),
> before finally settling on the approach 513f2b0bbd4 took of always
> creating it.
>
> Instead we should always create it, and then pass the general context
> to those functions that accept it, so that they'll consistently use
> our preferred memory allocation functions.
>
> 1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
> 2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
> 3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index c63dbff4b2..0116ff5f09 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -390,7 +390,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  			if (!pcre2_global_context)
>  				BUG("pcre2_global_context uninitialized");
>  			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
> -			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
> +			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
>  			pcre2_set_character_tables(p->pcre2_compile_context,
>  							p->pcre2_tables);
>  		}
> @@ -411,7 +411,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  					 p->pcre2_compile_context);
>
>  	if (p->pcre2_pattern) {
> -		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, NULL);
> +		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
>  		if (!p->pcre2_match_data)
>  			die("Couldn't allocate PCRE2 match data");
>  	} else {
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

  reply	other threads:[~2021-02-10 12:44 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-16 12:06 [PATCH 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
2019-10-16 12:06 ` [PATCH 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-16 12:06 ` [PATCH 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-16 12:06 ` [PATCH 3/3] grep: avoid leak of chartables in PCRE2 Johannes Schindelin via GitGitGadget
2019-10-16 12:10 ` [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
2019-10-16 12:10   ` [PATCH v2 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-18  1:38     ` Junio C Hamano
2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
2021-02-10 21:34       ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 " Ævar Arnfjörð Bjarmason
2021-03-04  0:34         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
2021-03-04  0:14         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
2021-03-04  0:24         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
2021-03-04  0:27         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
2021-03-04  0:28         ` Junio C Hamano
2021-02-04 21:05     ` [PATCH 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
2021-02-10 10:38       ` Johannes Schindelin
2021-02-04 21:05     ` [PATCH 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
2021-02-10 10:43       ` Johannes Schindelin
2021-03-04  0:16       ` Junio C Hamano
2021-02-04 21:05     ` [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
2021-02-10 12:38       ` Johannes Schindelin [this message]
2021-02-04 21:05     ` [PATCH 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
2021-02-10 12:40       ` Johannes Schindelin
2019-10-16 12:10   ` [PATCH v2 3/3] grep: avoid leak of chartables in PCRE2 Carlo Marcelo Arenas Belón via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.2102101338220.29765@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=carenas@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).