From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 17 Dec 2015 21:13:16 -0700
From: Joseph Jones
To: Ruby developers
Message-ID: <475C5A37-27C6-46BA-BE6F-340E97B4DFC1@gmail.com>
X-Mailer: BoxerFree 6.0.4 (321)
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="567387dc_1fbfe8e0_16c"
X-ML-Name: ruby-core
X-Mail-Count: 72338
Subject: [ruby-core:72338] [Ruby trunk - Bug #10097] Case-insensitive Regexp matching for Windows-1252 not working for ŠšŽžŒœÿŸ
Reply-To: Ruby developers
List-Id: Ruby developers

--567387dc_1fbfe8e0_16c
Content-Type: text/html; charset="UTF-8"
Joseph Jones liked your message with Boxer.


On December 11, 2015 at 01:04:08 MST, duerst@it.aoyama.ac.jp wrote:
Issue #10097 has been updated by Martin Dürst.


Nobuyoshi Nakada wrote:
> Is this correct?
> https://github.com/nobu/ruby/compare/windows-1252

Sorry for the very slow response. Please commit. Thanks!

----------------------------------------
Bug #10097: Case-insensitive Regexp matching for Windows-1252 not working for ŠšŽžŒœÿŸ
https://bugs.ruby-lang.org/issues/10097#change-55458

* Author: Martin Dürst
* Status: Open
* Priority: Normal
* Assignee:
* ruby -v: 1.9.3p545
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
By chance I had a look at enc/iso_8859_1.c and found

~~~C
ENC_REPLICATE("Windows-1252", "ISO-8859-1")
~~~
on line 288. But this does not work for case folding:

~~~ruby
# http://en.wikipedia.org/wiki/Windows-1252
s1 = "\u0160".encode 'windows-1252' # 'Š'
r1 = Regexp.new("\u0161".encode('windows-1252'), Regexp::IGNORECASE) # /š/i
s1 =~ r1
# => nil
s2 = "\u0178".encode 'windows-1252' # 'Ÿ'
r2 = Regexp.new("\u00FF".encode('windows-1252'), Regexp::IGNORECASE) # /ÿ/i
s2 =~ r2
# => nil
s3 = "\u00C0".encode 'windows-1252' # 'À'
r3 = Regexp.new("\u00E0".encode('windows-1252'), Regexp::IGNORECASE) # /à/i
s3 =~ r3
# => 0
~~~

So case-insensitive matching works when both characters are in iso-8859-1, but not when one (ÿŸ) or both (ŠšŽžŒœ) characters are not in iso-8859-1.
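
To check all four affected pairs in one go, something like the following could be used. It is only a sketch: `CASE_PAIRS` and `folds_in_windows_1252?` are names made up for illustration and are not part of Ruby or of the report. The result for each pair depends on the Ruby version, so no expected output is shown.

```ruby
# Upper/lower pairs from the report, given as Unicode code points.
# At least one member of each pair is absent from ISO-8859-1.
CASE_PAIRS = [
  [0x0160, 0x0161], # Š / š
  [0x017D, 0x017E], # Ž / ž
  [0x0152, 0x0153], # Œ / œ
  [0x0178, 0x00FF], # Ÿ / ÿ
].freeze

# Hypothetical helper: returns true when the uppercase character matches a
# case-insensitive Regexp built from the lowercase one, with both sides
# encoded as Windows-1252 (the same steps as in the report).
def folds_in_windows_1252?(upper_cp, lower_cp)
  upper = [upper_cp].pack('U').encode('Windows-1252')
  lower = [lower_cp].pack('U').encode('Windows-1252')
  regexp = Regexp.new(lower, Regexp::IGNORECASE)
  !(upper =~ regexp).nil?
end

CASE_PAIRS.each do |upper_cp, lower_cp|
  result = folds_in_windows_1252?(upper_cp, lower_cp) ? 'matches' : 'no match'
  puts format('U+%04X vs U+%04X: %s', upper_cp, lower_cp, result)
end
```

A pair that prints `no match` corresponds to the `nil` results above.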



--
https://bugs.ruby-lang.org/
--567387dc_1fbfe8e0_16c--