From: gotoken@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:88669] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
Date: Mon, 27 Aug 2018 05:44:44 +0000 (UTC)
Message-ID: <redmine.journal-73726.20180827054442.a170f0f95bc40f15@ruby-lang.org>
In-Reply-To: redmine.issue-13671.20170622232858@ruby-lang.org
Issue #13671 has been updated by gotoken (Kentaro Goto).
Thanks znz. The workaround is helpful, and I now understand what happened.
https://github.com/k-takata/Onigmo/issues/92#issuecomment-373981492 shows how some combinations of letters are variable length.
For example, `"ss"` and `"st"` are mapped to `"ß"` (`"\u00DF"`) and `"ﬆ"` (`"\uFB06"`), respectively.
Those combinations are listed in ftp://ftp.unicode.org/Public/UNIDATA/SpecialCasing.txt
By the way, this expansion by the `//i` option looks like overkill to me.
I wish case sensitivity and SpecialCasing mapping were separated...
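To make the "variable length" point concrete, here is a minimal sketch (not taken from the Onigmo discussion) that compares character counts for the two pairs mentioned above and asks the running interpreter whether `//i` treats them as equal. The names `PAIRS` and `length_mismatch?` are hypothetical, and the `//i` result is printed rather than assumed, since it may depend on the Ruby/Onigmo version.

```ruby
# Pairs named in this message: an ASCII sequence and the single
# character it can be equated with under case-insensitive matching.
PAIRS = {
  "ss" => "\u00DF", # LATIN SMALL LETTER SHARP S
  "st" => "\uFB06"  # LATIN SMALL LIGATURE ST
}.freeze

# Hypothetical helper: true when the two strings differ in character
# count, which is what prevents a fixed-length look-behind.
def length_mismatch?(ascii, single)
  ascii.length != single.length
end

PAIRS.each do |ascii, single|
  # Ask this interpreter whether //i equates the pair; the answer is
  # printed, not assumed.
  folded = Regexp.new(ascii, Regexp::IGNORECASE).match?(single)
  puts "#{ascii.inspect} (#{ascii.length} chars) vs " \
       "#{single.inspect} (#{single.length} char): " \
       "length mismatch=#{length_mismatch?(ascii, single)}, //i match=#{folded}"
end
```

If `//i` does equate a pair, a look-behind containing the ASCII form would have to match either one or two characters, which is presumably why the engine rejects it.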
----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-73726
* Author: dschweisguth (Dave Schweisguth)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: 2.4.1
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Here is a test program:
~~~ ruby
def test(description)
  begin
    yield
    puts "#{description} is OK"
  rescue RegexpError
    puts "#{description} raises RegexpError"
  end
end
test("ass, case-insensitive, special") { /(?<!ass)/i =~ '✨' }
test("bss, case-insensitive, special") { /(?<!bss)/i =~ '✨' }
test("as, case-insensitive, special") { /(?<!as)/i =~ '✨' }
test("ss, case-insensitive, special") { /(?<!ss)/i =~ '✨' }
test("ass, case-sensitive, special") { /(?<!ass)/ =~ '✨' }
test("ass, case-insensitive, regular") { /(?<!ass)/i =~ 'x' }
~~~
Running the test program with Ruby 2.4.1 (macOS) gives
~~~
ass, case-insensitive, special raises RegexpError
bss, case-insensitive, special raises RegexpError
as, case-insensitive, special is OK
ss, case-insensitive, special is OK
ass, case-sensitive, special is OK
ass, case-insensitive, regular is OK
~~~
The RegexpError is "invalid pattern in look-behind: /(?<!ass)/i (RegexpError)"
Side note: in the real code in which I found this error I was able to work around the error by using (?i) after the lookbehind instead of //i.
Running the test program with Ruby 2.3.4 does not report any RegexpErrors.
I think this is a regression, although I might be wrong and it might be saving me from an incorrect result with certain strings.
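The side note about placing `(?i)` after the lookbehind could look roughly like the sketch below. The report does not show the real pattern, so the trailing `ert` and the constant name `WORKAROUND` are hypothetical and chosen only for illustration.

```ruby
# Hypothetical pattern: the look-behind stays case-sensitive, and (?i)
# turns on case-insensitivity only for what follows it.
WORKAROUND = /(?<!ass)(?i)ert/

# "\u2728" is the sparkles character used in the test program.
["\u2728 ERT", "assert", "ASSERT"].each do |s|
  puts "#{s.inspect}: #{(WORKAROUND =~ s).inspect}"
end
```

Note the trade-off: because the look-behind is now case-sensitive, it no longer rejects an uppercase `ASS` prefix, so this is only a drop-in replacement when that difference does not matter.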
---Files--------------------------------
test.rb (531 Bytes)
--
https://bugs.ruby-lang.org/
Thread overview: 9+ messages
2017-06-22 23:28 ` [ruby-core:81742] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters dave
2017-06-23 8:49 ` [ruby-core:81748] " hanmac
2017-07-14 9:51 ` [ruby-core:82065] " naruse
2018-08-27 2:35 ` [ruby-core:88656] " gotoken
2018-08-27 3:46 ` [ruby-core:88657] " zn
2018-08-27 5:44 ` gotoken [this message]
2018-08-27 6:02 ` [ruby-core:88671] " shyouhei
2018-08-27 6:31 ` [ruby-core:88674] " gotoken
2018-08-29 10:20 ` [ruby-core:88729] " duerst