ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:64198] [ruby-trunk - Bug #10111] [Open] gdbm truncated UTF-8 data problem
       [not found] <redmine.issue-10111.20140805120024@ruby-lang.org>
@ 2014-08-05 12:00 ` testors
  2014-08-06  2:28 ` [ruby-core:64215] [ruby-trunk - Bug #10111] " nobu
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: testors @ 2014-08-05 12:00 UTC
  To: ruby-core

Issue #10111 has been reported by KiHyun Kang.

----------------------------------------
Bug #10111: gdbm truncated UTF-8 data problem
https://bugs.ruby-lang.org/issues/10111

* Author: KiHyun Kang
* Status: Open
* Priority: Normal
* Assignee: Aaron Patterson
* Category: ext
* Target version: 
* ruby -v: ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
Here is a script that reproduces the problem.

~~~
# coding: utf-8
require 'gdbm'

data = "\xEA\xB0\x80ABCDEF"
db = GDBM.new( 'test.db', 0666 )
db['key'] = data

throw 'data truncated!!' if db['key'] != data
~~~



-- 
https://bugs.ruby-lang.org/


* [ruby-core:64215] [ruby-trunk - Bug #10111] gdbm truncated UTF-8 data problem
From: nobu @ 2014-08-06  2:28 UTC
  To: ruby-core

Issue #10111 has been updated by Nobuyoshi Nakada.


gdbm doesn't preserve encodings now.

----------------------------------------
Bug #10111: gdbm truncated UTF-8 data problem
https://bugs.ruby-lang.org/issues/10111#change-48209


* [ruby-core:64217] [ruby-trunk - Bug #10111] gdbm truncated UTF-8 data problem
From: testors @ 2014-08-06  2:52 UTC
  To: ruby-core

Issue #10111 has been updated by KiHyun Kang.


Nobuyoshi Nakada wrote:
> gdbm doesn't preserve encodings now.

gdbm doesn't have to preserve encodings.

ext/dbm works well but ext/gdbm does not, because ext/gdbm uses 'length' to get the size.

'length' is not suitable for determining the actual size.

Use 'bytesize' instead of 'length'.
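
To illustrate the distinction being drawn here, a minimal sketch in plain Ruby follows. It needs no database and does not confirm what ext/gdbm actually does internally; it only shows how the two methods differ on the string from the report.

```ruby
# coding: utf-8
# U+AC00 (three bytes in UTF-8) followed by six ASCII letters.
data = "\xEA\xB0\x80ABCDEF"

chars = data.length     # number of characters
bytes = data.bytesize   # number of bytes

puts "length=#{chars} bytesize=#{bytes}"
```

For a multibyte string the two counts differ, so code that sized a buffer by character count would store fewer bytes than the string contains.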

----------------------------------------
Bug #10111: gdbm truncated UTF-8 data problem
https://bugs.ruby-lang.org/issues/10111#change-48211


* [ruby-core:64218] [ruby-trunk - Bug #10111] gdbm truncated UTF-8 data problem
From: nobu @ 2014-08-06  3:55 UTC
  To: ruby-core

Issue #10111 has been updated by Nobuyoshi Nakada.


KiHyun Kang wrote:
> Nobuyoshi Nakada wrote:
> > gdbm doesn't preserve encodings now.
> 
> gdbm doesn't have to preserve encodings.

~~~
$ ./ruby -v -rgdbm -e 'data = "\xEA\xB0\x80ABCDEF"' -e 'db = GDBM.new("test.db", 0666)' -e 'db["key"] = data' -e 'p db["key"] == data.b'
ruby 2.1.2p195 (2014-08-04 revision 47056) [x86_64-darwin13.0]
true
~~~

> ext/dbm works well but ext/gdbm does not, because ext/gdbm uses 'length' to get the size.
> 
> 'length' is not suitable for determining the actual size.
> 
> Use 'bytesize' instead of 'length'.

I can't understand what you mean at all.

----------------------------------------
Bug #10111: gdbm truncated UTF-8 data problem
https://bugs.ruby-lang.org/issues/10111#change-48212


* [ruby-core:64408] [ruby-trunk - Bug #10111] gdbm truncated UTF-8 data problem
From: akr @ 2014-08-16  1:25 UTC
  To: ruby-core

Issue #10111 has been updated by Akira Tanaka.


The data is not truncated but has a different encoding (as nobu pointed out at first).

```
% cat t.gdbm.rb
# coding: utf-8
require 'gdbm'

data = "\xEA\xB0\x80ABCDEF"
db = GDBM.new( 'test.db', 0666 )
db['key'] = data

p [db['key'].b, db['key'].encoding]
p [data.b, data.encoding]
throw 'data truncated!!' if db['key'] != data
% ./ruby -v t.gdbm.rb
ruby 2.2.0dev (2014-08-15 trunk 47187) [x86_64-linux]
["\xEA\xB0\x80ABCDEF", #<Encoding:ASCII-8BIT>]
["\xEA\xB0\x80ABCDEF", #<Encoding:UTF-8>]
t.gdbm.rb:10:in `throw': uncaught throw "data truncated!!" (ArgumentError)
	from t.gdbm.rb:10:in `<main>'
```

dbm behaves the same as gdbm.

```
% cat t.dbm.rb 
# coding: utf-8
require 'dbm'

data = "\xEA\xB0\x80ABCDEF"
db = DBM.new( 'test.db', 0666 )
db['key'] = data

p [db['key'].b, db['key'].encoding]
p [data.b, data.encoding]
throw 'data truncated!!' if db['key'] != data
% ./ruby -v t.dbm.rb 
ruby 2.2.0dev (2014-08-15 trunk 47187) [x86_64-linux]
["\xEA\xB0\x80ABCDEF", #<Encoding:ASCII-8BIT>]
["\xEA\xB0\x80ABCDEF", #<Encoding:UTF-8>]
t.dbm.rb:10:in `throw': uncaught throw "data truncated!!" (ArgumentError)
	from t.dbm.rb:10:in `<main>'
```
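
The two transcripts above suggest that the comparison fails on the encoding tag rather than on the content. A minimal sketch without any database follows; using `String#b` as a stand-in for the binary-tagged value returned by `db['key']` is an assumption made for illustration.

```ruby
# coding: utf-8
data   = "\xEA\xB0\x80ABCDEF"  # tagged UTF-8 by the source encoding
stored = data.b                # same bytes, tagged ASCII-8BIT (stand-in for db['key'])

same_bytes  = stored.bytes == data.bytes                         # byte-for-byte comparison
equal_as_is = stored == data                                     # String#== also considers encoding
equal_retag = stored.dup.force_encoding(data.encoding) == data   # compare after re-tagging

p [same_bytes, equal_as_is, equal_retag]
```

The bytes match and the plain `==` does not, which is consistent with the value being intact but tagged differently.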


----------------------------------------
Bug #10111: gdbm truncated UTF-8 data problem
https://bugs.ruby-lang.org/issues/10111#change-48365


* [ruby-core:68840] [Ruby trunk - Bug #10111] [Rejected] gdbm truncated UTF-8 data problem
From: akr @ 2015-04-11  5:15 UTC
  To: ruby-core

Issue #10111 has been updated by Akira Tanaka.

Status changed from Open to Rejected

gdbm (and dbm) do not record the encoding.
So the current behavior is natural and not a bug, I think.
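
Given that, a caller who wants UTF-8 strings back would have to re-tag values after reading them. One possible way is sketched below under stated assumptions: `Utf8Store` is a hypothetical wrapper, not part of gdbm or dbm, and `BinaryStore` is a stand-in that mimics the binary-tagged reads shown earlier in the thread so the example runs without the extension.

```ruby
# coding: utf-8

# Stand-in for GDBM/DBM (assumption): keeps only bytes and
# returns strings tagged ASCII-8BIT.
class BinaryStore
  def initialize
    @h = {}
  end

  def []=(key, value)
    @h[key.b] = value.b
  end

  def [](key)
    v = @h[key.b]
    v && v.dup
  end
end

# Hypothetical wrapper: re-tags every value read from the store as UTF-8.
# It does not validate the bytes; it only changes the encoding tag.
class Utf8Store
  def initialize(db)
    @db = db
  end

  def []=(key, value)
    @db[key] = value
  end

  def [](key)
    value = @db[key]
    value && value.dup.force_encoding(Encoding::UTF_8)
  end
end

data = "\xEA\xB0\x80ABCDEF"
db = Utf8Store.new(BinaryStore.new)
db['key'] = data
p db['key'] == data
```

This only works when the caller knows every stored value is UTF-8; since the store keeps no encoding information, that knowledge has to come from the application.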

----------------------------------------
Bug #10111: gdbm truncated UTF-8 data problem
https://bugs.ruby-lang.org/issues/10111#change-52104

