From: mame@ruby-lang.org
Date: Fri, 07 May 2021 07:57:27 +0000 (UTC)
To: ruby-core@ruby-lang.org
Subject: [ruby-core:103770] [Ruby master Feature#17837] Add support for Regexp timeouts

Issue #17837 has been updated by mame (Yusuke Endoh).

I've created a simple prototype of `Regexp.timeout=` using a polling approach.

Conclusion first: unfortunately, it adds about 5% overhead in a micro benchmark. I guess the overhead is unlikely to be significant in a real application, but it is not good anyway.

---

The following is about my patch, just for the record.

https://github.com/ruby/ruby/compare/master...mame:regexp-timeout-prototype

Implementation approach:

1. When regexp matching starts, the current time is recorded using `clock_gettime(CLOCK_MONOTONIC)`.
2. At `CHECK_INTERRUPT_IN_MATCH_AT`, the elapsed time is calculated, and an exception is raised if the timeout has expired.

Example:

```
Regexp.timeout = 1 # one second
/^(([a-z])+)+$/ =~ "abcdefghijklmnopqrstuvwxyz@"
#=> regexp match timeout (RuntimeError)
```

Benchmark:

The following simple regexp matching becomes 4.8% slower.

```
10000000.times { /(abc)+/ =~ "abcabcabc" }

# The minimum time in 10 executions
# before: 1.962 s
# after:  2.056 s
```

The following complex regexp matching becomes 5.2% slower.
```
/^(([a-z])+)+$/ =~ "abcdefghijklmnopqrstuvwxyz@"

# The minimum time in 10 executions
# before: 2.237 s
# after:  2.353 s
```

----------------------------------------
Feature #17837: Add support for Regexp timeouts
https://bugs.ruby-lang.org/issues/17837#change-91882

* Author: sam.saffron (Sam Saffron)
* Status: Open
* Priority: Normal
----------------------------------------
### Background

ReDoS is a very common security issue. At Discourse we have seen a few over the years.

https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS

In a nutshell, there are hundreds of ways this can happen in production apps. The key is for an attacker (or possibly an innocent person) to supply either a problematic Regexp or a bad string to test it with.

```
/A(B|C+)+D/ =~ "A" + "C" * 100 + "X"
```

Having a problematic Regexp somewhere in a large app is a universal constant; it will happen as long as you are using Regexps.

Currently the only feasible way of providing a consistent safeguard is to use `Thread.raise` and manage all execution. This kind of pattern requires a third-party implementation. There may also be issues with JRuby and TruffleRuby when taking approaches like this.

### Prior art

.NET provides a `MatchTimeout` property:

https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matchtimeout?view=net-5.0

Java has nothing built in as far as I can tell:

https://stackoverflow.com/questions/910740/cancelling-a-long-running-regex-match

Node has nothing built in as far as I can tell:

https://stackoverflow.com/questions/38859506/cancel-regex-match-if-timeout

Golang and Rust use RE2-style engines, which are not vulnerable to DoS because they limit features (RE2 is available in Ruby via the RE2 gem).

```
irb(main):003:0> r = RE2::Regexp.new('A(B|C+)+D')
=> #
irb(main):004:0> r.match("A" + "C" * 100 + "X")
=> nil
```

### Proposal

Implement `Regexp.timeout`, which allows us to specify a global timeout for all Regexp operations in Ruby.
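
To make the intended semantics concrete, here is a minimal sketch that emulates a single global limit in plain Ruby on top of the stdlib `Timeout` module (which interrupts the running thread by raising into it). The names `RegexpGuard`, `limit`, `match`, and `MatchTimeout` are hypothetical and exist only for illustration. This is an emulation under those assumptions, not how the feature would be implemented inside the interpreter.

```ruby
require "timeout"

# Hypothetical helper, not part of Ruby: emulates a process-wide regexp
# time limit using the stdlib Timeout module.
module RegexpGuard
  # Raised when a guarded match exceeds the global limit.
  class MatchTimeout < StandardError; end

  class << self
    # Global limit in seconds; nil disables the guard entirely.
    attr_accessor :limit

    # Run pattern.match(input), aborting if it exceeds the global limit.
    def match(pattern, input)
      # With no limit configured, behave exactly like a plain match.
      return pattern.match(input) if limit.nil?

      Timeout.timeout(limit, MatchTimeout) { pattern.match(input) }
    end
  end
end

RegexpGuard.limit = 1 # one second, applied to every guarded match
m = RegexpGuard.match(/(abc)+/, "abcabcabc")
puts m[0]
```

Unlike a built-in setting, this sketch only protects call sites that go through `RegexpGuard.match`, which is why a global option inside Ruby itself is attractive.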

Per-Regexp timeouts would require massive application changes; almost all web apps would do just fine with a 1 second Regexp timeout.

If `timeout` is set to `nil`, everything would work as it does today. When it is set to a number of seconds, a "monitor" thread would track running regexps and time them out according to the global value.

### Alternatives

I recommend against a "per Regexp" API, as this decision belongs at the application level. You want to apply it to all regular expressions in all the gems you are consuming.

I recommend against a move to RE2 at the moment, as way too much would break.

### See also:

https://people.cs.vt.edu/davisjam/downloads/publications/Davis-Dissertation-2020.pdf
https://levelup.gitconnected.com/the-regular-expression-denial-of-service-redos-cheat-sheet-a78d0ed7d865

-- 
https://bugs.ruby-lang.org/