Date: Mon, 30 Nov 2015 13:48:56 +0000
From: shugo@ruby-lang.org
To: ruby-core@ruby-lang.org
Subject: [ruby-core:71756] [Ruby trunk - Bug #10891] /[[:punct:]]/ POSIX group broken (with string literals?)

Issue #10891 has been updated by Shugo Maeda.

Yui NARUSE wrote:
> It follows UTR#18's Standard Recommendation.
> http://www.unicode.org/reports/tr18/#punct

In general, it would be a reasonable choice.
However, in Ruby, the problem is that it's hard to guess the programmer's intention from the code, because the behavior is decided not by the regular expression, but by the target string.

```
def do_something(s)
  ...
  if /[[:punct:]]/ =~ s # should "<" match, or shouldn't?
    ...
  end
  ...
end
```

If you want to reject symbols, `/\p{P}/` can be used instead, and it's more readable.

----------------------------------------
Bug #10891: /[[:punct:]]/ POSIX group broken (with string literals?)
https://bugs.ruby-lang.org/issues/10891#change-55167

* Author: Tom Lord
* Status: Feedback
* Priority: Normal
* Assignee: Yui NARUSE
* ruby -v: ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-linux]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
The regular expression: `/[[:punct:]]/` should match the following characters:

    ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~

However, it only works for these characters:

    ! " # % & ' ( ) * , - . / : ; ? @ [ \ ] _ { }

And does not work for these characters:

    $ + < = > ^ ` | ~

However, this is where it gets really weird... Consider the following:

    60.chr == "<" # true
    60.chr =~ /[[:punct:]]/ # => 0
    "<" =~ /[[:punct:]]/ # => nil

So, it seems that the regular expression only fails for string literals!

-- 
https://bugs.ruby-lang.org/
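
As a rough illustration of the points above (not code from the thread): the two strings in the report compare equal but carry different encodings, since `60.chr` yields a US-ASCII string while a literal normally takes the source encoding. Checking the 32 ASCII characters that POSIX treats as punctuation against the Unicode categories shows that the nine "failing" characters are exactly those in category S (Symbol) rather than P (Punctuation). The constant `POSIX_PUNCT` and both helper methods are hypothetical names introduced only for this sketch. It deliberately avoids asserting what `/[[:punct:]]/` returns for a UTF-8 string, since that is the behavior under discussion and may differ between Ruby versions.

```ruby
# The 32 ASCII code points that POSIX classifies as punctuation,
# built as UTF-8 strings so the Unicode properties below apply.
# (POSIX_PUNCT is a name made up for this sketch.)
POSIX_PUNCT = [*33..47, *58..64, *91..96, *123..126].map { |c| c.chr(Encoding::UTF_8) }

# Hypothetical helper: true only for Unicode general category P,
# i.e. what /\p{P}/ accepts.
def unicode_punctuation?(s)
  !!(s =~ /\p{P}/)
end

# Hypothetical helper: accepts category P or category S,
# which together cover the whole POSIX ASCII set.
def punctuation_or_symbol?(s)
  !!(s =~ /[\p{P}\p{S}]/)
end

# Integer#chr without an argument returns a US-ASCII string for values below 128.
ascii_lt = 60.chr
puts ascii_lt.encoding        # US-ASCII
puts ascii_lt == "<"          # equal content, even if the encodings differ

# Characters in the POSIX set that Unicode does not call punctuation.
symbols_only = POSIX_PUNCT.reject { |ch| unicode_punctuation?(ch) }
puts symbols_only.join(" ")

# Every POSIX punctuation character is either P or S in Unicode.
puts POSIX_PUNCT.all? { |ch| punctuation_or_symbol?(ch) }
```

If the intent is "reject symbols", `/\p{P}/` states that directly; if the intent is "any ASCII punctuation in the POSIX sense", spelling out `/[\p{P}\p{S}]/` (or an explicit character list) keeps the result independent of the target string's encoding.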