From: mame@ruby-lang.org
To: ruby-core@ruby-lang.org
Subject: [ruby-core:103236] [Ruby master Bug#17774] Quantified empty group causes regex to fail
Date: Mon, 05 Apr 2021 06:16:15 +0000 (UTC)
Message-ID: <redmine.journal-91317.20210405061613.51428@ruby-lang.org>
In-Reply-To: redmine.issue-17774.20210404080853.51428@ruby-lang.org
Issue #17774 has been updated by mame (Yusuke Endoh).
Thank you, I can reproduce the issue.
The issue is in the code from [onigmo](https://github.com/k-takata/Onigmo), so it would be helpful if you could report it upstream.
A quick investigation shows that an optimization expands `(){4}` but does not expand `(){5}`, which accounts for the difference in behavior.
Enabling debug output suggests that the bug is caused by the `USE_MONOMANIAC_CHECK_CAPTURES_IN_ENDLESS_REPEAT` option. The `(){5}` case works correctly with the following change, which disables the option, but I am unsure of the performance impact.
```
diff --git a/regint.h b/regint.h
index 0740429688..968ea6cde8 100644
--- a/regint.h
+++ b/regint.h
@@ -71,7 +71,6 @@
 #define USE_PERL_SUBEXP_CALL
 #define USE_CAPITAL_P_NAMED_GROUP
 #define USE_BACKREF_WITH_LEVEL /* \k<name+n>, \k<name-n> */
-#define USE_MONOMANIAC_CHECK_CAPTURES_IN_ENDLESS_REPEAT /* /(?:()|())*\2/ */
 #define USE_NEWLINE_AT_END_OF_STRING_HAS_EMPTY_LINE /* /\n$/ =~ "\n" */
 #define USE_WARNING_REDUNDANT_NESTED_REPEAT_OPERATOR
 /* !!! moved to regenc.h. */ /* #define USE_CRNL_AS_LINE_TERMINATOR */
```
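One way to probe the expansion theory from the Ruby side is to compare the quantified forms with a hand-expanded `()()()()()`. This is only a sketch: `pattern_with` is a hypothetical helper, and writing the group out by hand is an assumption about what the optimizer's expansion resembles, not a description of Onigmo's internals. The output depends on the Ruby build, so none is shown here.

```ruby
# Hypothetical helper: build the reported pattern with a caller-supplied
# fragment in the position of the empty group.
def pattern_with(middle)
  Regexp.new("^((x*)#{middle}(?=\\2$))*x$")
end

subject = "x" * 32

variants = {
  "quantified (){4}"       => pattern_with("(){4}"),
  "quantified (){5}"       => pattern_with("(){5}"),
  "hand-expanded ()()()()()" => pattern_with("()" * 5), # assumption: mimics the expansion
}

variants.each do |label, re|
  puts "#{label}: #{re.match?(subject) ? 'match' : 'no match'}"
end
```

If the hand-expanded variant matches while `(){5}` does not, that would be consistent with the repeat handling, rather than the empty group itself, being responsible.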
----------------------------------------
Bug #17774: Quantified empty group causes regex to fail
https://bugs.ruby-lang.org/issues/17774#change-91317
* Author: Davidebyzero (David Ellsworth)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-msys]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
The regex `^((x*)(?=\2$))*x$` matches powers of 2 in unary, expressed as strings of `x` characters whose length is the number.
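Each iteration of the outer group consumes half of what remains (group 2 must equal the rest of the string), until a single `x` is left. As a minimal sketch, the base pattern can be checked against an arithmetic test; the helper names `unary_power_of_two?` and `power_of_two?` are hypothetical and not part of the report.

```ruby
# Base pattern from the report, without any empty group.
POWER_OF_TWO_UNARY = /^((x*)(?=\2$))*x$/

# Hypothetical helper: true when a string of n "x" characters matches.
def unary_power_of_two?(n)
  POWER_OF_TWO_UNARY.match?("x" * n)
end

# Arithmetic reference: a positive integer is a power of 2
# when exactly one bit is set.
def power_of_two?(n)
  n > 0 && (n & (n - 1)).zero?
end

# Collect the lengths where the regex and the arithmetic test disagree.
mismatches = (0..64).reject { |n| unary_power_of_two?(n) == power_of_two?(n) }
puts "lengths where the regex disagrees with arithmetic: #{mismatches.inspect}"
```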
Adding an empty group `()` in the middle of it should have no effect on its operation, and indeed it does not. `^((x*)()(?=\2$))*x$` still matches powers of 2 just fine.
Quantifying that empty group, `(){4}`, should still have no effect. And indeed, `^((x*)(){4}(?=\2$))*x$` still matches powers of 2. But quantify that to `(){5}`, and suddenly it fails.
The following command line should print `1`, but instead prints nothing:
```
ruby -e 'print 1 if "x"*32 =~ /^((x*)(){5}(?=\2$))*x$/'
```
However, this one does print `1`:
```
ruby -e 'print 1 if "x"*32 =~ /^((x*)(){4}(?=\2$))*x$/'
```
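To see at which repetition count the behavior changes on a given build, the two one-liners can be generalized into a small sweep. This is a sketch only: `quantified_pattern` is a hypothetical helper, the range `1..8` is an arbitrary choice, and the results will differ between Ruby versions.

```ruby
# Hypothetical helper: the reported pattern with the empty group
# quantified as (){count}.
def quantified_pattern(count)
  Regexp.new("^((x*)(){#{count}}(?=\\2$))*x$")
end

subject = "x" * 32 # 32 is a power of 2, so every count should match

# Map each repetition count to whether the pattern matched.
results = (1..8).map { |count| [count, quantified_pattern(count).match?(subject)] }.to_h

results.each do |count, matched|
  puts "(){#{count}}: #{matched ? 'match' : 'no match'}"
end
```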
Bug found to occur on [Try It Online](https://tio.run/): `ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-linux]`
Bug confirmed to happen on my own machine: `ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-msys]`
Solving the challenge [Is that number a Two Bit Number™️?](https://codegolf.stackexchange.com/questions/211840/is-that-number-a-two-bit-number%ef%b8%8f/222792#222792) on Code Golf Stack Exchange is what led me to discover this bug.
--
https://bugs.ruby-lang.org/