unofficial mirror of libc-alpha@sourceware.org
From: Paul Eggert <eggert@cs.ucla.edu>
To: "Tim Rühsen" <tim.ruehsen@gmx.de>, liqingqing <liqingqing3@huawei.com>
Cc: libc-alpha@sourceware.org, Florian Weimer <fweimer@redhat.com>,
	Carlos O'Donell <carlos@redhat.com>,
	Hushiyuan <hushiyuan@huawei.com>, Liusirui <liusirui@huawei.com>
Subject: Re: question about regex
Date: Thu, 2 Jan 2020 14:14:27 -0800	[thread overview]
Message-ID: <05dfac46-6cf3-1f5d-14e7-3c8b07b2093e@cs.ucla.edu> (raw)
In-Reply-To: <fe84ed3d-6bbc-ea87-4e50-93e11736b005@gmx.de>

On 1/2/20 8:16 AM, Tim Rühsen wrote:
> Meanwhile grep (or libc) seems to exit gracefully:

Yes, there's no core dump if the operating system supports stack 
overflow detection that grep can use. The problem occurs only on OSes 
that don't support that, or in apps that don't try to detect stack 
overflow and simply dump core (or worse).

On 1/2/20 2:54 AM, liqingqing wrote:

> do we have any plan or good ways to fix up the bug as below

The best way would be to fix bug#24269, i.e., fix the glibc regex code 
so that it doesn't blow the stack. If you could write a patch for this 
bug (something that doesn't hurt performance for ordinary regexps), that 
would be welcome.

For that particular test case, you can use an OS that does proper stack 
overflow checking that grep can use.

PS. The next version of the grep manual is planned to nearly wash its 
hands of the matter. Here's the current draft:

----

Back-references can greatly slow down matching, as they can generate
exponentially many matching possibilities that can consume both time
and memory to explore.  Also, the POSIX specification for
back-references is at times unclear.  Furthermore, many regular
expression implementations have back-reference bugs that can cause
programs to return incorrect answers or even crash, and fixing these
bugs has often been low-priority: for example, as of 2020 the
@url{https://sourceware.org/bugzilla/,GNU C library bug database}
contained back-reference bugs
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=52,,52},
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=10844,,10844},
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=11053,,11053},
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=24269,,24269}
and @url{https://sourceware.org/bugzilla/show_bug.cgi?id=25322,,25322},
with little sign of forthcoming fixes.  Luckily,
back-references are rarely useful and it should be little trouble to
avoid them in practical applications.

