ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:81742] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
@ 2017-06-22 23:28 ` dave
  2017-06-23  8:49 ` [ruby-core:81748] " hanmac
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: dave @ 2017-06-22 23:28 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been reported by dschweisguth (Dave Schweisguth).

----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671

* Author: dschweisguth (Dave Schweisguth)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 2.4.1
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Here is a test program:

~~~ ruby
def test(description)
  begin
    yield
    puts "#{description} is OK"
  rescue RegexpError
    puts "#{description} raises RegexpError"
  end
end

test("ass, case-insensitive, special") { /(?<!ass)/i =~ '✨' }
test("bss, case-insensitive, special") { /(?<!bss)/i =~ '✨' }
test("as,  case-insensitive, special") { /(?<!as)/i  =~ '✨' }
test("ss,  case-insensitive, special") { /(?<!ss)/i  =~ '✨' }
test("ass, case-sensitive,   special") { /(?<!ass)/  =~ '✨' }
test("ass, case-insensitive, regular") { /(?<!ass)/i =~ 'x' }

~~~

Running the test program with Ruby 2.4.1 (macOS) gives

~~~
ass, case-insensitive, special raises RegexpError
bss, case-insensitive, special raises RegexpError
as,  case-insensitive, special is OK
ss,  case-insensitive, special is OK
ass, case-sensitive,   special is OK
ass, case-insensitive, regular is OK

~~~

The RegexpError is "invalid pattern in look-behind: /(?<!ass)/i (RegexpError)"

Side note: in the real code in which I found this error I was able to work around the error by using (?i) after the lookbehind instead of //i.
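A minimal sketch of the workaround described in the side note above, assuming the real pattern had more text after the lookbehind. The `foo` suffix and the sample strings are hypothetical; the original code is not shown in the report. Note the trade-off: with `(?i)` placed after the lookbehind, the lookbehind itself is matched case-sensitively.

```ruby
# Hypothetical pattern: only the part after (?i) is case-insensitive,
# so the lookbehind contains no case-folded "ss".
pattern = /(?<!ass)(?i)foo/

puts pattern.match?("\u2728 FOO")  # true: no "ass" before FOO
puts pattern.match?("assfoo")      # false: lowercase "ass" precedes foo
puts pattern.match?("ASSfoo")      # true: the lookbehind is case-sensitive here
```

If the lookbehind also has to be case-insensitive, this workaround changes behavior and may not be suitable.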

Running the test program with Ruby 2.3.4 does not report any RegexpErrors.

I think this is a regression, although I might be wrong and it might be saving me from an incorrect result with certain strings.

---Files--------------------------------
test.rb (531 Bytes)


-- 
https://bugs.ruby-lang.org/


* [ruby-core:81748] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
  2017-06-22 23:28 ` [ruby-core:81742] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters dave
@ 2017-06-23  8:49 ` hanmac
  2017-07-14  9:51 ` [ruby-core:82065] " naruse
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: hanmac @ 2017-06-23  8:49 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by Hanmac (Hans Mackowiak).


I did some checks on my Windows system to see how deep the problem goes.
I used "ä" as the test string.

The same problem happens when you use the match method:
`/(?<!ass)/i.match('ä')`
It also happens for
`Regexp.union(/(?<!ass)/i, /ä/)`

But I still don't understand why it fails with ass while ss works.
It might have something to do with how regexps are stored internally.
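A small probe, as a sketch, for checking the three entry points mentioned above on whatever Ruby is at hand. The helper name `regexp_status` is made up for illustration. No expected output is given because the result depends on the interpreter version.

```ruby
# Hypothetical helper: run a block and report whether it raised RegexpError.
def regexp_status
  yield
  :ok
rescue RegexpError
  :regexp_error
end

results = {
  "=~"           => regexp_status { /(?<!ass)/i =~ "ä" },
  "match"        => regexp_status { /(?<!ass)/i.match("ä") },
  "Regexp.union" => regexp_status { Regexp.union(/(?<!ass)/i, /ä/) },
}

results.each { |entry_point, status| puts "#{entry_point}: #{status}" }
```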

----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-65448


* [ruby-core:82065] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
  2017-06-22 23:28 ` [ruby-core:81742] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters dave
  2017-06-23  8:49 ` [ruby-core:81748] " hanmac
@ 2017-07-14  9:51 ` naruse
  2018-08-27  2:35 ` [ruby-core:88656] " gotoken
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: naruse @ 2017-07-14  9:51 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by naruse (Yui NARUSE).


I created a ticket in upstream: https://github.com/k-takata/Onigmo/issues/92

----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-65799


* [ruby-core:88656] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2017-07-14  9:51 ` [ruby-core:82065] " naruse
@ 2018-08-27  2:35 ` gotoken
  2018-08-27  3:46 ` [ruby-core:88657] " zn
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: gotoken @ 2018-08-27  2:35 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by gotoken (Kentaro Goto).


I encountered a case that does not involve `ss`.  Is this the same problem?

```
% ruby -ve '"".match(/(?<=ast)/ui)'
ruby 2.6.0dev (2018-08-27 trunk 64549) [x86_64-linux]
-e:1: invalid pattern in look-behind: /(?<=ast)/i
```

It also reproduces in versions 2.4 and 2.5.
#14838 seems to be a duplicate.

----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-73712


* [ruby-core:88657] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2018-08-27  2:35 ` [ruby-core:88656] " gotoken
@ 2018-08-27  3:46 ` zn
  2018-08-27  5:44 ` [ruby-core:88669] " gotoken
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: zn @ 2018-08-27  3:46 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by znz (Kazuhiro NISHIYAMA).


You can use `(?:s)` instead of `s` as a workaround.

```
$ ruby -ve '/(?<=ast)/iu'
ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-darwin17]
-e:1: invalid pattern in look-behind: /(?<=ast)/i
-e:1: warning: possibly useless use of a literal in void context
$ ruby -ve '/(?<=a(?:s)t)/iu'
ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-darwin17]
-e:1: warning: possibly useless use of a literal in void context
```
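To show that the `(?:s)` form still behaves like the intended case-insensitive lookbehind on plain ASCII input, here is a minimal sketch. The trailing `x` and the sample strings are assumptions added for illustration.

```ruby
# Same lookbehind as above, with the non-capturing group around "s"
# and a hypothetical trailing "x" so there is something to match.
workaround = /(?<=a(?:s)t)x/iu

puts workaround.match?("astx")  # true
puts workaround.match?("ASTX")  # true: /i still applies to every letter
puts workaround.match?("abtx")  # false: "abt" is not "ast"
```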


----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-73713


* [ruby-core:88669] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2018-08-27  3:46 ` [ruby-core:88657] " zn
@ 2018-08-27  5:44 ` gotoken
  2018-08-27  6:02 ` [ruby-core:88671] " shyouhei
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: gotoken @ 2018-08-27  5:44 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by gotoken (Kentaro Goto).


Thanks, znz. The workaround is helpful, and I now understand what happened.
https://github.com/k-takata/Onigmo/issues/92#issuecomment-373981492 shows how some combinations of letters become variable-length.

For example, `"ss"` and `"st"` are mapped to `"ß"` (`"\u00DF"`) and `"ﬆ"` (`"\uFB06"`).
Those combinations are listed in ftp://ftp.unicode.org/Public/UNIDATA/SpecialCasing.txt
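The length mismatch can be seen directly from Ruby, as a rough illustration. This sketch uses `String#downcase(:fold)` (full Unicode case folding, available since Ruby 2.4) rather than the regexp engine itself, so it only shows the mapping, not how Onigmo applies it.

```ruby
# Two-letter sequences and the single characters that fold to them.
pairs = { "ss" => "\u00DF", "st" => "\uFB06" }

pairs.each do |letters, single|
  folded = single.downcase(:fold)  # full Unicode case folding
  puts "#{single.inspect}: #{single.length} character, " \
       "folds to #{folded.inspect} (#{folded.length} characters), " \
       "same as #{letters.inspect}: #{folded == letters}"
end
```

A case-insensitive `ss` or `st` can therefore stand for either one or two characters, which is what makes the lookbehind variable-length.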

By the way, this expansion by the `//i` option looks like overkill to me.
I wish case sensitivity and SpecialCasing mapping were separated...

----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-73726


* [ruby-core:88671] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2018-08-27  5:44 ` [ruby-core:88669] " gotoken
@ 2018-08-27  6:02 ` shyouhei
  2018-08-27  6:31 ` [ruby-core:88674] " gotoken
  2018-08-29 10:20 ` [ruby-core:88729] " duerst
  8 siblings, 0 replies; 9+ messages in thread
From: shyouhei @ 2018-08-27  6:02 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by shyouhei (Shyouhei Urabe).


gotoken (Kentaro Goto) wrote:
> By the way, this expansion by the `//i` option looks like overkill to me.
> I wish case sensitivity and SpecialCasing mapping were separated...

I know how you feel.  Unfortunately, we are just doing what Unicode specifies.

See also http://unicode.org/faq/casemap_charprop.html#11


----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-73728


* [ruby-core:88674] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2018-08-27  6:02 ` [ruby-core:88671] " shyouhei
@ 2018-08-27  6:31 ` gotoken
  2018-08-29 10:20 ` [ruby-core:88729] " duerst
  8 siblings, 0 replies; 9+ messages in thread
From: gotoken @ 2018-08-27  6:31 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by gotoken (Kentaro Goto).


Thanks, shyouhei, for pointing that out.

I imagine another Regexp option, say `//I`, which is almost the same as `//i` except that it never applies the SpecialCasing mapping.
This change does extend Unicode matching but does not introduce incompatibilities, IMHO.
One difficulty is that the implementation lives in the upstream library and CRuby is just a user.
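No `//I` option exists, but something close can be approximated in user code. The sketch below is only an illustration of the idea, not a proposal for the implementation: the helper name `simple_case_insensitive_source` is hypothetical, and it handles only characters whose upper and lower forms are single characters. It builds a case-sensitive pattern from per-character classes, so the engine has no multi-character folding to expand.

```ruby
# Hypothetical helper: turn a literal into regexp source that matches each
# character in either case, without any multi-character case folding.
def simple_case_insensitive_source(literal)
  literal.each_char.map do |ch|
    lower = ch.downcase
    upper = ch.upcase
    if lower == upper || lower.length != 1 || upper.length != 1
      Regexp.escape(ch)  # no simple one-to-one case pair: match it literally
    else
      "[#{Regexp.escape(lower)}#{Regexp.escape(upper)}]"
    end
  end.join
end

source  = simple_case_insensitive_source("ass")  # "[aA][sS][sS]"
pattern = Regexp.new("(?<!#{source})x")

puts pattern.match?("\u2728x")     # true: nothing like "ass" before x
puts pattern.match?("ASSx")        # false: "ASS" matches the character classes
puts pattern.match?("a\u00DFx")    # true: "ß" is not treated as "ss"
```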

----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-73731


* [ruby-core:88729] [Ruby trunk Bug#13671] Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
       [not found] <redmine.issue-13671.20170622232858@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2018-08-27  6:31 ` [ruby-core:88674] " gotoken
@ 2018-08-29 10:20 ` duerst
  8 siblings, 0 replies; 9+ messages in thread
From: duerst @ 2018-08-29 10:20 UTC (permalink / raw)
  To: ruby-core

Issue #13671 has been updated by duerst (Martin Dürst).


gotoken (Kentaro Goto) wrote:

> For example, `"ss"` and `"st"` are mapped to `"ß"` (`"\u00DF"`) and `"ﬆ"` (`"\uFB06"`).
> Those combinations are listed in ftp://ftp.unicode.org/Public/UNIDATA/SpecialCasing.txt
>
> By the way, this expansion by the `//i` option looks like overkill to me.
> I wish case sensitivity and SpecialCasing mapping were separated...

I still have to verify this, but currently I strongly suspect that the problem is NOT in SpecialCasing, but in how Onigmo (/Oniguruma?) implements it.

----------------------------------------
Bug #13671: Regexp with lookbehind and case-insensitivity raises RegexpError only on strings with certain characters
https://bugs.ruby-lang.org/issues/13671#change-73785

