From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN: AS4713 221.184.0.0/13
X-Spam-Status: No, score=-4.1 required=3.0 tests=AWL,BAYES_00,
  HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI,
  SPF_HELO_NONE,SPF_PASS shortcircuit=no autolearn=ham
  autolearn_force=no version=3.4.2
Received: from neon.ruby-lang.org (neon.ruby-lang.org [221.186.184.75])
  by dcvr.yhbt.net (Postfix) with ESMTP id 914F21F5AE
  for ; Wed, 5 May 2021 05:28:29 +0000 (UTC)
Received: from neon.ruby-lang.org (localhost [IPv6:::1])
  by neon.ruby-lang.org (Postfix) with ESMTP id 50792120F66;
  Wed, 5 May 2021 14:27:22 +0900 (JST)
Received: from dcvr.yhbt.net (dcvr.yhbt.net [64.71.152.64])
  by neon.ruby-lang.org (Postfix) with ESMTPS id AEC3C120EE4
  for ; Wed, 5 May 2021 14:27:15 +0900 (JST)
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
  by dcvr.yhbt.net (Postfix) with ESMTP id 591261F5AE;
  Wed, 5 May 2021 05:28:14 +0000 (UTC)
Date: Wed, 5 May 2021 05:28:14 +0000
From: Eric Wong
To: ruby-core@ruby-lang.org
Message-ID: <20210505052814.GA17488@dcvr>
References:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
X-ML-Name: ruby-core
X-Mail-Count: 103730
Subject: [ruby-core:103730] Re: [Ruby master Feature#17837] Add support for Regexp timeouts
X-BeenThere: ruby-core@ruby-lang.org
X-Mailman-Version: 2.1.15
Precedence: list
Reply-To: Ruby developers
List-Id: Ruby developers
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ruby-core-bounces@ruby-lang.org
Sender: "ruby-core"

sam.saffron@gmail.com wrote:
> Feature #17837: Add support for Regexp timeouts
> https://bugs.ruby-lang.org/issues/17837

> I recommend against a "per Regexp" API as this decision is at
> the application level. You want to apply it to all regular
> expressions in all the gems you are consuming.

The syscall costs are higher nowadays and this will penalize
good regexps.

IME with unicorn, global timeouts of this type mean problems go
unfixed for too long and fester into worse problems.

Ultimately many Ruby problems come from tolerating excessively
deep/complex dependency stacks(*) and developers having too much
crap to manage.

Anecdotally, my experience with Perl5 RE is better than with
Onig*. I know Perl5 has the same underlying problems as Onig*,
however Perl5 RE seems less bad in practice.

Again, Perl5 RE does have underlying problems, but they don't
manifest nearly as much as they do with Ruby (I have as much or
more experience with Perl than with Ruby).

One example I remember off the top of my head is
[ruby-core:74030]. I just tested that again after 5 years:
Ruby still loops forever; Perl still terminates as it should.

Your example translated to Perl5 also stops fine for me:

    ("A" . "C" x 100 . "X") =~ /A(B|C+)+D/;

Onig* might be able to learn a thing or three from Perl5 when it
comes to common real-world cases. Again, I know Perl5 RE has
underlying problems just like Onig*; they just do not manifest
as easily.

(*) and I apologize for letting crap like unicorn become too
    popular and perpetuating the existence of buggy/broken
    code; I'll try to find more time to scare users away from it.

> I recommend against a move to RE2 at the moment as way too
> much would break

RE2 could be done gradually, like frozen strings:

    %re2[foo]

Or a magic comment: "# regexp-engine: re2"

Perl has supported pluggable re::engine since 2007, so there are
more things Ruby can learn from Perl :>
```
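
For anyone wanting to reproduce the comparison on the Ruby side, here is a rough sketch of the Perl one-liner above translated back to Ruby. The pattern and the subject string shape come from the example in this thread; the helper names `build_subject` and `timed_match` are made up for illustration, and the sizes are kept small on purpose so the script finishes quickly even where matching time grows steeply with the number of "C" characters. Timings will depend on the Ruby version and regexp engine in use.

```ruby
require "benchmark"

# Pattern taken from the example quoted in this thread.
PATTERN = /A(B|C+)+D/

# Hypothetical helper: build "A" + n copies of "C" + "X".
# The trailing "X" means the pattern can never match, so a
# backtracking engine has to try the alternatives before giving up.
def build_subject(n)
  "A" + "C" * n + "X"
end

# Hypothetical helper: return [matched, seconds] for a subject
# containing n "C" characters.
def timed_match(n)
  matched = nil
  seconds = Benchmark.realtime { matched = PATTERN.match?(build_subject(n)) }
  [matched, seconds]
end

if __FILE__ == $PROGRAM_NAME
  # Small sizes only; the Perl example above uses 100.
  [8, 12, 16].each do |n|
    matched, seconds = timed_match(n)
    puts format("n=%-3d matched=%-5s %.4fs", n, matched, seconds)
  end
end
```

Raising the sizes gradually and watching how the elapsed time grows is one way to see whether a given Ruby build is affected.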