* [ruby-core:20226] \b and \B with Unicode
@ 2008-12-02 23:46 Dave Thomas
2008-12-03 0:06 ` [ruby-core:20227] " Radosław Bułat
0 siblings, 1 reply; 5+ messages in thread
From: Dave Thomas @ 2008-12-02 23:46 UTC
To: ruby-core
#encoding: utf-8
p " ∂og ".match(/\b./u) # => #<MatchData "o">
I was surprised that \b didn't find the boundary between the space and
the Unicode ∂ character. Is that correct?
Dave
* [ruby-core:20227] Re: \b and \B with Unicode
From: Radosław Bułat @ 2008-12-03 0:06 UTC
To: ruby-core
On Wed, Dec 3, 2008 at 12:46 AM, Dave Thomas <dave@pragprog.com> wrote:
> #encoding: utf-8
> p " ∂og ".match(/\b./u) # => #<MatchData "o">
>
>
> I was surprised that \b didn't find the boundary between the space and the
> Unicode ∂ character. Is that correct?
Maybe ∂ isn't treated as a word character?
" ∂og ".match(/\w/) # => #<MatchData "o">
I don't know if it should or not.
--
Pozdrawiam
Radosław Bułat
http://radarek.jogger.pl - mój blog
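
The hypothesis above can be checked position by position, since \b matches wherever exactly one of the two neighbouring characters is a word character. A minimal sketch, assuming a Ruby with String#match? (2.4 or later); word_char? is a hypothetical helper name, not part of Ruby, and "\u2202" is the ∂ from the example:

```ruby
# "\u2202" is the ∂ character used in the original example.
str = " \u2202og "

# Hypothetical helper: true when the single character c matches \w.
# nil stands for "no character" (before the start or after the end).
def word_char?(c)
  !c.nil? && c.match?(/\w/)
end

# A \b boundary sits at index i when exactly one of the characters
# on either side of that position is a word character.
chars = str.chars
boundaries = (0..chars.length).select do |i|
  before = i.zero? ? nil : chars[i - 1]
  after  = chars[i]
  word_char?(before) != word_char?(after)
end

p chars.map { |c| word_char?(c) }  # => [false, false, true, true, false]
p boundaries                       # => [2, 4]
p str.match(/\b./)                 # => #<MatchData "o">
```

The first boundary falls at index 2, between ∂ and "o", which is why /\b./ returns "o" rather than "∂".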
* [ruby-core:20228] Re: \b and \B with Unicode
From: Yukihiro Matsumoto @ 2008-12-03 0:15 UTC
To: ruby-core
Hi,
In message "Re: [ruby-core:20227] Re: \b and \B with Unicode"
on Wed, 3 Dec 2008 09:06:55 +0900, "Radosław Bułat" <radek.bulat@gmail.com> writes:
|
|On Wed, Dec 3, 2008 at 12:46 AM, Dave Thomas <dave@pragprog.com> wrote:
|> #encoding: utf-8
|> p " ∂og ".match(/\b./u) # => #<MatchData "o">
|>
|> I was surprised that \b didn't find the boundary between the space and the
|> Unicode ∂ character. Is that correct?
|
|Maybe ∂ isn't treated as a word character?
|" ∂og ".match(/\w/) # => #<MatchData "o">
|I don't know if it should or not.
I think it should. I suspect it's a bug in Oniguruma. I will inspect
it later.
matz.
* [ruby-core:20229] Re: \b and \B with Unicode
From: Michael Selig @ 2008-12-03 0:30 UTC
To: ruby-core
On Wed, 03 Dec 2008 11:06:55 +1100, Radosław Bułat <radek.bulat@gmail.com>
wrote:
> On Wed, Dec 3, 2008 at 12:46 AM, Dave Thomas <dave@pragprog.com> wrote:
>> #encoding: utf-8
>> p " ∂og ".match(/\b./u) # => #<MatchData "o">
>>
>>
>> I was surprised that \b didn't find the boundary between the space and the
>> Unicode ∂ character. Is that correct?
>
> Maybe ∂ isn't treated as a word character?
> " ∂og ".match(/\w/) # => #<MatchData "o">
> I don't know if it should or not.
>
The character in question is Unicode U+2202 which is "Partial
Differential". Though it looks like it might be a letter, it is NOT
defined as a letter in Unicode (it is part of the "mathematical operators"
block). So I think Ruby is correct here!
Cheers
Mike
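
The classification described above can be confirmed from Ruby itself with Unicode property classes. A small sketch, assuming a Ruby with String#match? (2.4 or later); the δ used for contrast (U+03B4, GREEK SMALL LETTER DELTA) is an assumption added for illustration and does not appear in the thread:

```ruby
partial = "\u2202"  # ∂, PARTIAL DIFFERENTIAL (Mathematical Operators block)
delta   = "\u03B4"  # δ, GREEK SMALL LETTER DELTA, a genuine letter (illustration only)

p format("U+%04X", partial.ord)  # => "U+2202"
p partial.match?(/\p{L}/)        # => false (not a Letter)
p partial.match?(/\p{Sm}/)       # => true  (Symbol, Math)
p partial.match?(/[[:word:]]/)   # => false (not a word character)
p delta.match?(/\p{L}/)          # => true
```

Because ∂ falls in the Math Symbol category rather than Letter, there is no word/non-word transition between it and the preceding space for \b to match.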
* [ruby-core:20234] Re: \b and \B with Unicode
From: Dave Thomas @ 2008-12-03 3:54 UTC
To: ruby-core
On Dec 2, 2008, at 6:30 PM, Michael Selig wrote:
> The character in question is Unicode U+2202 which is "Partial
> Differential". Though it looks like it might be a letter, it is NOT
> defined as a letter in Unicode (it is part of the "mathematical
> operators" block). So I think Ruby is correct here!
Ouch: I just used Option-D on the Mac to generate it.
I'll try again.
Thanks
Dave
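
One way to retry the experiment is with a character that Unicode does classify as a letter. The sketch below is only an illustration: the thread does not say which character was intended, so ð (U+00F0, LATIN SMALL LETTER ETH) is a hypothetical stand-in, and WORD_START is a made-up constant name. It spells out the start-of-word boundary with explicit lookarounds on [[:word:]] so that the result does not depend on how a given Ruby version treats \b and \w for non-ASCII text:

```ruby
# Hypothetical stand-in: ð (U+00F0) is a letter; the thread does not say
# which character was actually intended.
str = " \u00F0og "

# Start-of-word boundary spelled out with lookarounds: not preceded by a
# word character, and followed by one. [[:word:]] covers Unicode letters,
# marks, numbers and connector punctuation.
WORD_START = /(?<![[:word:]])(?=[[:word:]])./

p str.match(WORD_START)[0] == "\u00F0"   # => true
p " \u2202og ".match(WORD_START)[0]      # => "o"
```

With a real letter the match starts right after the space; with ∂ it still starts at "o", consistent with the behaviour reported at the top of the thread.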