ruby-dev (Japanese) list archive (unofficial mirror)
* [ruby-dev:47108] [ruby-trunk - Bug #7954][Open] "あ".byteslice(0,2).valid_encoding? should return false
@ 2013-02-25  9:03 Tietew (Toru Iwase)
  2013-02-25  9:48 ` [ruby-dev:47109] [ruby-trunk - Bug #7954] " duerst (Martin Dürst)
  2013-02-25 10:09 ` [ruby-dev:47110] [ruby-trunk - Bug #7954][Assigned] " naruse (Yui NARUSE)
  0 siblings, 2 replies; 3+ messages in thread
From: Tietew (Toru Iwase) @ 2013-02-25  9:03 UTC
  To: ruby developers list


Issue #7954 has been reported by Tietew (Toru Iwase).

----------------------------------------
Bug #7954: "あ".byteslice(0,2).valid_encoding? should return false
https://bugs.ruby-lang.org/issues/7954

Author: Tietew (Toru Iwase)
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 2.0.0
ruby -v: 2.0.0p0


=begin
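When String#byteslice is called on a string with a valid encoding and produces a byte sequence that is invalid in that encoding, valid_encoding? still returns true on the result.
I think it should return false.

Note that 1.9.3 behaves the same way.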

 irb(main):001:0> RUBY_DESCRIPTION
 => "ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]"
 irb(main):002:0> "あ".encoding
 => #<Encoding:UTF-8>
 irb(main):003:0> "あ".valid_encoding?
 => true
 irb(main):004:0> "あ".byteslice(0,2)
 => "\xE3\x81"
 irb(main):005:0> "あ".byteslice(0,2).valid_encoding?
 => true
 irb(main):006:0> "\xE3\x81".encoding
 => #<Encoding:UTF-8>
 irb(main):007:0> "\xE3\x81".valid_encoding?
 => false

Incidentally, when an invalid string is sliced with byteslice at a correct position, the result is correctly judged to be valid.

 irb(main):025:0> "あ\xE3".valid_encoding?
 => false
 irb(main):026:0> "あ\xE3".byteslice(0,3).valid_encoding?
 => true
=end
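
The report above can be cross-checked with a small sketch that rebuilds the slice from its raw bytes, so that the validity check cannot reuse anything carried over from the original string. This is only an illustration: `fresh_valid_byteslice?` is a hypothetical helper name, not a Ruby API, and the strings are built from byte values (E3 81 82 being the UTF-8 bytes of "あ") so that the script does not depend on the source encoding.

```ruby
# Hypothetical helper (not part of Ruby): takes a byte slice and re-checks it
# after rebuilding the string from its raw bytes, so the result reflects only
# the bytes themselves and the encoding of the original string.
def fresh_valid_byteslice?(str, start, length)
  slice = str.byteslice(start, length)
  return false if slice.nil? # out-of-range slice: treated as "not valid" here

  rebuilt = slice.bytes.pack("C*").force_encoding(str.encoding)
  rebuilt.valid_encoding?
end

# U+3042 (hiragana "a") as UTF-8: E3 81 82
HIRAGANA_A = [0xE3, 0x81, 0x82].pack("C*").force_encoding(Encoding::UTF_8)
# The same character followed by a stray lead byte (invalid as a whole)
BROKEN = [0xE3, 0x81, 0x82, 0xE3].pack("C*").force_encoding(Encoding::UTF_8)

puts fresh_valid_byteslice?(HIRAGANA_A, 0, 2) # prints false (cut mid-character)
puts fresh_valid_byteslice?(BROKEN, 0, 3)     # prints true  (cut at a boundary)
```

Comparing this helper's answer with `str.byteslice(start, length).valid_encoding?` on a given Ruby build is one way to see whether that build shows the behavior reported here.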


-- 
http://bugs.ruby-lang.org/


* [ruby-dev:47109] [ruby-trunk - Bug #7954] "あ".byteslice(0,2).valid_encoding? should return false
  2013-02-25  9:03 [ruby-dev:47108] [ruby-trunk - Bug #7954][Open] "あ".byteslice(0,2).valid_encoding? should return false Tietew (Toru Iwase)
@ 2013-02-25  9:48 ` duerst (Martin Dürst)
  2013-02-25 10:09 ` [ruby-dev:47110] [ruby-trunk - Bug #7954][Assigned] " naruse (Yui NARUSE)
  1 sibling, 0 replies; 3+ messages in thread
From: duerst (Martin Dürst) @ 2013-02-25  9:48 UTC
  To: ruby developers list


Issue #7954 has been updated by duerst (Martin Dürst).


I think the real problem in this case is, in the first place, that
> "あ".byteslice(0,2).encoding
=> #<Encoding:UTF-8>

I think the encoding of the value returned by byteslice should be BINARY.
----------------------------------------
Bug #7954: "あ".byteslice(0,2).valid_encoding? should return false
https://bugs.ruby-lang.org/issues/7954#change-36991

Author: Tietew (Toru Iwase)
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 2.0.0
ruby -v: 2.0.0p0




-- 
http://bugs.ruby-lang.org/


* [ruby-dev:47110] [ruby-trunk - Bug #7954][Assigned] "あ".byteslice(0,2).valid_encoding? should return false
  2013-02-25  9:03 [ruby-dev:47108] [ruby-trunk - Bug #7954][Open] "あ".byteslice(0,2).valid_encoding? should return false Tietew (Toru Iwase)
  2013-02-25  9:48 ` [ruby-dev:47109] [ruby-trunk - Bug #7954] " duerst (Martin Dürst)
@ 2013-02-25 10:09 ` naruse (Yui NARUSE)
  1 sibling, 0 replies; 3+ messages in thread
From: naruse (Yui NARUSE) @ 2013-02-25 10:09 UTC
  To: ruby developers list


Issue #7954 has been updated by naruse (Yui NARUSE).

Category set to M17N
Status changed from Open to Assigned
Assignee set to naruse (Yui NARUSE)

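duerst (Martin Dürst) wrote:
> I think the real problem in this case is, in the first place, that
> > "あ".byteslice(0,2).encoding
> => #<Encoding:UTF-8>
> 
> I think the encoding of the value returned by byteslice should be BINARY.

If you want to receive the result as BINARY, you can simply write "あ".b.slice(0,2); if byteslice did the same thing, there would be no point in it being a separate method.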
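
As a rough illustration of the distinction drawn above, the sketch below puts the two calls side by side: `String#b` followed by `slice` yields a BINARY (ASCII-8BIT) string, while `byteslice` keeps the encoding of the receiver. The variable names are made up for the example, and the string is built from byte values (E3 81 82, the UTF-8 bytes of "あ") so that it does not depend on the source encoding.

```ruby
# U+3042 (hiragana "a") as UTF-8: E3 81 82
a = [0xE3, 0x81, 0x82].pack("C*").force_encoding(Encoding::UTF_8)

# String#b returns a copy tagged as BINARY (ASCII-8BIT); slice then counts bytes.
binary_slice = a.b.slice(0, 2)
# byteslice also counts bytes, but the result keeps the receiver's encoding.
tagged_slice = a.byteslice(0, 2)

puts binary_slice.encoding == Encoding::BINARY # prints true
puts tagged_slice.encoding == Encoding::UTF_8  # prints true
puts binary_slice.bytes == tagged_slice.bytes  # prints true (both are E3 81)
puts binary_slice.valid_encoding?              # prints true (any bytes are valid BINARY)
```

The bytes are identical in both cases; only the encoding tag differs, which is why the question of what valid_encoding? should report arises only for the byteslice result.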
----------------------------------------
Bug #7954: "あ".byteslice(0,2).valid_encoding? should return false
https://bugs.ruby-lang.org/issues/7954#change-36992

Author: Tietew (Toru Iwase)
Status: Assigned
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: M17N
Target version: 2.0.0
ruby -v: 2.0.0p0




-- 
http://bugs.ruby-lang.org/

