git@vger.kernel.org mailing list mirror (one of many)
From: "René Scharfe" <l.s.r@web.de>
To: Carlo Arenas <carenas@gmail.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 3/3] grep: plug leak of pcre chartables in PCRE2
Date: Tue, 30 Jul 2019 18:52:41 +0200
Message-ID: <dff4a832-e8a8-789f-04cd-b11fd1671c52@web.de>
In-Reply-To: <CAPUEspgRUymY0Tm0zvfoSsYm0dKh8fy=fW-jN4Roid6oZVY86Q@mail.gmail.com>

On 30.07.19 at 02:08, Carlo Arenas wrote:
> On Mon, Jul 29, 2019 at 1:35 PM René Scharfe <l.s.r@web.de> wrote:
>>
>> On 28.07.19 at 03:41, Carlo Arenas wrote:
>>> On Sat, Jul 27, 2019 at 4:48 PM Ævar Arnfjörð Bjarmason
>>> <avarab@gmail.com> wrote:
>>>>> +     free((void *)p->pcre_tables);
>>>>
>>>> Is the cast really needed? I'm rusty on the rules, removing it from the
>>>> pcre_free() you might have copied this from produces a warning for me,
>>>> but not for free() itself. This is on GCC 8.3.0. How about for you &
>>>> what compiler(s)?
>>>
>>> both will trigger warnings for the same reason
>>> (-Wincompatible-pointer-types-discards-qualifiers)
>>> with Apple LLVM version 10.0.1 (clang-1001.0.46.4)
>>>
>>> gcc-9 in macOS triggers 2 "warnings"; one for discarding the const
>>> qualifier (-Wdiscarded-qualifiers)
>>> and another for mismatching parameter to free():
>>>
>>> note: expected 'void *' but argument is of type 'const uint8_t *' {aka
>>> 'const unsigned char *'}
>>
>> Right: pcre_maketables() returns a const pointer, and you have to cast
>> away this const'ness at some point if you want to free(3) that memory.
>> Returning a non-const pointer would be more fitting, but I guess the
>> idea is that users of the library are not supposed to change the
>> contents of the table.
>
> note that this pointer was generated by pcre2_maketables() instead
> which is actually "const uint8_t *", but yes these tables are meant to
> be immutable and that is why they are "const".

Only the const'ness matters with regard to the need to cast a pointer
before feeding it to free(3).

Doing the cast in a library function or using an opaque pointer type
instead of a fake const pointer would be nicer ways on the part of PCRE2
to keep callers from messing with the table.  Forcing users to cast
const away or leak is not very nice.  Nothing we can do about it,
except perhaps adding a wish list item for PCRE3, I guess.
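
The cast discussed above can be sketched as follows.  This is only an
illustration, not the code from the patch: make_tables() is a
hypothetical stand-in for pcre2_maketables() that returns malloc(3)ed
data through a pointer to const, and free_tables() is a hypothetical
helper showing how a library function could hide the cast from callers.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for pcre2_maketables(): hands out malloc(3)ed
 * data through a pointer to const, so callers cannot modify it.
 */
static const uint8_t *make_tables(void)
{
	uint8_t *t = malloc(256);
	int i;

	if (!t)
		return NULL;
	for (i = 0; i < 256; i++)
		t[i] = (uint8_t)i;
	return t;
}

/*
 * Hypothetical helper: free(3) takes "void *", so the const qualifier
 * has to be cast away somewhere.  Doing it here keeps the cast in one
 * place instead of repeating it at every call site.
 */
static void free_tables(const uint8_t *tables)
{
	free((void *)tables);
}
```

A caller would then pair make_tables() with free_tables() and never
see the cast; without such a helper, free((void *)tables) at the call
site is what silences the qualifier warnings quoted above.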

https://pcre.org/current/doc/html/pcre2_maketables.html says that
pcre2_maketables() returns const unsigned char *, by the way.  I don't
get the PCRE2_SUFFIX business in pcre2.h ("Define macros that generate
width-specific names from generic versions."), though.
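
For what it's worth, the general technique that comment describes is
token pasting: a generic name is glued to a code unit width to select
a width-specific function.  The sketch below only illustrates that
technique; every DEMO_ and demo_ name is made up for this example and
none of it is copied from pcre2.h.

```c
#include <stdint.h>

/* Hypothetical width selector, set by the user of the header. */
#define DEMO_CODE_UNIT_WIDTH 8

/*
 * Two macro levels are needed: DEMO_GLUE expands its arguments first
 * (turning DEMO_CODE_UNIT_WIDTH into 8), then DEMO_JOIN pastes them.
 */
#define DEMO_JOIN(a, b) a ## b
#define DEMO_GLUE(a, b) DEMO_JOIN(a, b)
#define DEMO_SUFFIX(a) DEMO_GLUE(a, DEMO_CODE_UNIT_WIDTH)

/* Width-specific functions carry the width as a name suffix. */
static int demo_unit_size_8(void)
{
	return (int)sizeof(uint8_t);
}

static int demo_unit_size_16(void)
{
	return (int)sizeof(uint16_t);
}

/* The generic name expands to demo_unit_size_8 with the width above. */
#define demo_unit_size DEMO_SUFFIX(demo_unit_size_)
```

With this in place a call to demo_unit_size() compiles into a call to
demo_unit_size_8(), and changing DEMO_CODE_UNIT_WIDTH to 16 would
redirect it to demo_unit_size_16() without touching the callers.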

>> But wouldn't it be more correct to use pcre_free()?  As long as we keep
>> pcre_malloc() and pcre_free() at their default values it doesn't matter
>> in practice, but using free(3) directly is a layering violation, no?
>
> yes, but that is the only option PCRE2 gives when not using a global
> context which is what the comment in the commit refers to.
>
> FWIW pcre_free() doesn't exist anymore in PCRE2.

OK, and while https://pcre.org/original/doc/html/pcreapi.html#TOC1
says that pcre_malloc() is used by pcre_maketables() (and thus the
result should be passed to pcre_free() after use),
https://pcre.org/current/doc/html/pcre2_maketables.html says
pcre2_maketables() uses malloc(3), so that pointer needs to go to
free(3) at the end.  Missed the second part in my earlier reply.

>> Perhaps just UNLEAK that thing?  There is only a single way to build it
>> and we can reuse it throughout the lifetime of the program, so there is
>> no real need to clean it up before the OS does.
>
> That would be a better fit if it would be created once in cmd_grep and
> then shared with all worker threads (which I thought would be nice to
> do in the future anyway), but this change was trying to be conservative
> and just do the minimum to close the leak.

Sure.  I wonder how sharing between threads would influence performance...

René
