Subject: Re: question about regex
To: Tim Rühsen, liqingqing
Cc: libc-alpha@sourceware.org, Florian Weimer, Carlos O'Donell, Hushiyuan, Liusirui
From: Paul Eggert
Message-ID: <05dfac46-6cf3-1f5d-14e7-3c8b07b2093e@cs.ucla.edu>
Date: Thu, 2 Jan 2020 14:14:27 -0800

On 1/2/20 8:16 AM, Tim Rühsen wrote:
> Meanwhile grep (or libc) seems to exit gracefully:

Yes, there's no core dump if the operating system supports stack
overflow detection that grep can use. The problem occurs only on OSes
that don't do that, or on apps that don't try to detect stack overflow
and simply dump core (or worse).

On 1/2/20 2:54 AM, liqingqing wrote:
> do we have any plan or good ways to fix up the bug as below

The best way would be to fix bug#24269, i.e., fix the glibc regex code
so that it doesn't blow the stack. If you could write a patch for this
bug (something that doesn't hurt performance for ordinary regexps), that
would be welcome.

For that particular test case, you can use an OS that does proper stack
overflow checking that grep can use.

PS. The next version of the grep manual is planned to nearly wash its
hands of the matter. Here's the current draft:

----

Back-references can greatly slow down matching, as they can generate
exponentially many matching possibilities that can consume both time and
memory to explore. Also, the POSIX specification for back-references is
at times unclear.
Furthermore, many regular expression implementations have back-reference
bugs that can cause programs to return incorrect answers or even crash,
and fixing these bugs has often been low-priority: for example, as of
2020 the
@url{https://sourceware.org/bugzilla/,GNU C library bug database}
contained back-reference bugs
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=52,,52},
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=10844,,10844},
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=11053,,11053},
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=24269,,24269} and
@url{https://sourceware.org/bugzilla/show_bug.cgi?id=25322,,25322},
with little sign of forthcoming fixes. Luckily, back-references are
rarely useful and it should be little trouble to avoid them in practical
applications.
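
----

To make the draft's last sentence concrete, here is a minimal sketch of
swapping a back-reference for ordinary code. It is only an illustration:
it uses Python's re module rather than the glibc matcher that grep
relies on, and the function names and sample text are invented for the
example. The task is finding a word that is immediately repeated.

```python
import re

# Hypothetical helpers for illustration; Python's re, not glibc regex.
WORD = re.compile(r"[A-Za-z]+")

def doubled_words_backref(text):
    """Find immediately repeated words using a back-reference."""
    pattern = re.compile(r"\b([A-Za-z]+) \1\b")
    return [m.group(1) for m in pattern.finditer(text)]

def doubled_words_plain(text):
    """Find repeats without a back-reference: tokenize, compare neighbors."""
    words = WORD.findall(text)
    return [a for a, b in zip(words, words[1:]) if a == b]

sample = "it was the the best of of times"
print(doubled_words_backref(sample))
print(doubled_words_plain(sample))
```

On this sample both functions report the same repeated words. They are
not equivalent in general (a word repeated three times in a row is
counted once by the first and twice by the second), so treat the second
as a starting point rather than a drop-in replacement.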