From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN: AS4713 221.184.0.0/13
X-Spam-Status: No, score=-2.7 required=3.0 tests=AWL,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,
 RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY shortcircuit=no
 autolearn=no autolearn_force=no version=3.4.2
Received: from neon.ruby-lang.org (neon.ruby-lang.org [221.186.184.75])
 by dcvr.yhbt.net (Postfix) with ESMTP id 2DEA41F5AE
 for ; Tue, 4 May 2021 10:31:39 +0000 (UTC)
Received: from neon.ruby-lang.org (localhost [IPv6:::1])
 by neon.ruby-lang.org (Postfix) with ESMTP id 1CB9E120DD7;
 Tue, 4 May 2021 19:30:34 +0900 (JST)
Received: from o1678948x4.outbound-mail.sendgrid.net (o1678948x4.outbound-mail.sendgrid.net [167.89.48.4])
 by neon.ruby-lang.org (Postfix) with ESMTPS id 83494120DD5
 for ; Tue, 4 May 2021 19:30:31 +0900 (JST)
Received: by filterdrecv-5d58f8bcff-w9t2m with SMTP id filterdrecv-5d58f8bcff-w9t2m-1-60912280-22
 2021-05-04 10:31:28.227396751 +0000 UTC m=+479705.718674463
Received: from herokuapp.com (unknown)
 by ismtpd0171p1iad2.sendgrid.net (SG) with ESMTP id 7F27f0TiSC2-JBMIH3p7DQ
 for ; Tue, 04 May 2021 10:31:28.120 +0000 (UTC)
Date: Tue, 04 May 2021 10:31:28 +0000 (UTC)
From: eregontp@gmail.com
Message-ID:
References:
Mime-Version: 1.0
X-Redmine-Project: ruby-master
X-Redmine-Issue-Tracker: Feature
X-Redmine-Issue-Id: 17837
X-Redmine-Issue-Author: sam.saffron
X-Redmine-Sender: Eregon
X-Mailer: Redmine
X-Redmine-Host: bugs.ruby-lang.org
X-Redmine-Site: Ruby Issue Tracking System
X-Auto-Response-Suppress: All
Auto-Submitted: auto-generated
X-Redmine-MailingListIntegration-Message-Ids: 79757
X-SG-EID: =?us-ascii?Q?KippOI8ZHtTweq7XfQzW93937kJ4QNWwSBuHnaMEcr3ApT8kDwXydwV=2FxpNZip?=
 =?us-ascii?Q?a6v1LgFpALLORyGcGZJj5pFQVYkoqQXqZEE4T+u?=
 =?us-ascii?Q?jBHCXISpLvJTPGdoF1urKT2vMPxvE41yplIjEC1?= =?us-ascii?Q?lSy0YR4PVy0B8jpNvWPnYQ1WI2c3vlrYBqgBtUY?= =?us-ascii?Q?WrTLe6pZYZz=2FTocr8BYrJKxJTwtL=2FQyVs2g=3D=3D?=
To: ruby-core@ruby-lang.org
X-Entity-ID: b/2+PoftWZ6GuOu3b0IycA==
X-ML-Name: ruby-core
X-Mail-Count: 103710
Subject: [ruby-core:103710] [Ruby master Feature#17837] Add support for Regexp timeouts
X-BeenThere: ruby-core@ruby-lang.org
X-Mailman-Version: 2.1.15
Precedence: list
Reply-To: Ruby developers
List-Id: Ruby developers
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ruby-core-bounces@ruby-lang.org
Sender: "ruby-core"

Issue #17837 has been updated by Eregon (Benoit Daloze).


sam.saffron (Sam Saffron) wrote in #note-6:
> Sort of, it gets complicated. Unicorn is easy cause it is single threaded. Killing off threads in Puma is much more fraught, in Sidekiq the old pattern of killing off was nuked by Mike cause he saw it as way too risky https://github.com/mperham/sidekiq/commit/7e094567a585578fad0bfd0c8669efb46643f853.

I think fixing Timeout.timeout might be possible.
The main/major issue is it can trigger within `ensure`, right? Is there anything else?
We could automatically mask `Thread#raise` within `ensure` so it only happens after the `ensure` body completes.
And we could still have a larger "hard timeout" if an `ensure` takes way too long (shouldn't happen, but one cannot be sure).
I recall discussing this with @schneems some time ago on Twitter.

----------------------------------------
Feature #17837: Add support for Regexp timeouts
https://bugs.ruby-lang.org/issues/17837#change-91803

* Author: sam.saffron (Sam Saffron)
* Status: Open
* Priority: Normal
----------------------------------------
### Background

ReDoS is a very common security issue. At Discourse we have seen a few through the years.
https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS

In a nutshell, there are hundreds of ways this can happen in production apps. The key is for an attacker (or possibly an innocent person) to supply either a problematic Regexp or a bad string to test it with.

```
/A(B|C+)+D/ =~ "A" + "C" * 100 + "X"
```

Having a problem Regexp somewhere in a large app is a universal constant; it will happen as long as you are using Regexps.

Currently the only feasible way of supplying a consistent safeguard is by using `Thread#raise` and managing all execution. This kind of pattern requires usage of a third-party implementation. There are possibly issues with JRuby and TruffleRuby when taking approaches like this.

### Prior art

.NET provides a `MatchTimeout` property per: https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matchtimeout?view=net-5.0

Java has nothing built in as far as I can tell: https://stackoverflow.com/questions/910740/cancelling-a-long-running-regex-match

Node has nothing built in as far as I can tell: https://stackoverflow.com/questions/38859506/cancel-regex-match-if-timeout

Go and Rust use RE2-style engines, which are not vulnerable to this DoS because they limit features (RE2 is available in Ruby via the re2 gem)

```
irb(main):003:0> r = RE2::Regexp.new('A(B|C+)+D')
=> #
irb(main):004:0> r.match("A" + "C" * 100 + "X")
=> nil
```

### Proposal

Implement `Regexp.timeout`, which allows us to specify a global timeout for all Regexp operations in Ruby.

A per-Regexp timeout would require massive application changes; almost all web apps would do just fine with a 1 second Regexp timeout.

If `timeout` is set to `nil`, everything would work as it does today. When set to a number of seconds, a "monitor" thread would track running regexps and time them out according to the global value.

### Alternatives

I recommend against a "per Regexp" API as this decision is at the application level. You want to apply it to all regular expressions in all the gems you are consuming.
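
For comparison, the status-quo safeguard mentioned under Background (interrupting work from another thread) can be sketched with the stdlib `Timeout` module, whose `Timeout.timeout` relies on `Thread#raise`. This is only an illustrative sketch under that assumption: `with_deadline` is a hypothetical helper name, not part of the proposal or of any gem, and it carries the `ensure` hazard raised in the reply above.

```ruby
require "timeout"

# Hypothetical helper (illustration only): run a block under a deadline and
# report whether it finished. Timeout.timeout interrupts the calling thread
# by raising into it, so the exception can also land inside an `ensure` body.
def with_deadline(seconds)
  [:ok, Timeout.timeout(seconds) { yield }]
rescue Timeout::Error
  [:timeout, nil]
end

# A benign input completes well within the deadline and returns the match index.
with_deadline(1.0) { /A(B|C+)+D/ =~ "xxACCD" } # => [:ok, 2]
```

Every call site has to opt in to a wrapper like this, which is the application-wide burden a single global setting would avoid.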

I recommend against a move to RE2 at the moment, as way too much would break.

### See also:

https://people.cs.vt.edu/davisjam/downloads/publications/Davis-Dissertation-2020.pdf
https://levelup.gitconnected.com/the-regular-expression-denial-of-service-redos-cheat-sheet-a78d0ed7d865

-- 
https://bugs.ruby-lang.org/