ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:69599] [Ruby trunk - Bug #11264] [Open] Memory leak in JSON stdlib ext (JSON generation)
       [not found] <redmine.issue-11264.20150616025418@ruby-lang.org>
@ 2015-06-16  2:54 ` luke.gru
  2015-06-16  4:00 ` [ruby-core:69601] [Ruby trunk - Bug #11264] [Third Party's Issue] " nobu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 4+ messages in thread
From: luke.gru @ 2015-06-16  2:54 UTC (permalink / raw)
  To: ruby-core

Issue #11264 has been reported by Luke Gruber.

----------------------------------------
Bug #11264: Memory leak in JSON stdlib ext (JSON generation)
https://bugs.ruby-lang.org/issues/11264

* Author: Luke Gruber
* Status: Open
* Priority: Normal
* Assignee: 
* ruby -v: 2.2-head
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
Hi,

I'm not sure if this is a bug, or just undocumented behaviour, but here's a script to reproduce the memory leak:

------------------------------
require 'json'

class MyClass
  def to_json(*)
    "a" * 1048576 # 1 megabytes of chars
  end
end

class MyOther
  def to_json(*)
    raise "OMG"
  end
end

1000.times do |i| # will leak up to ~ 4 gigs
  puts i
  JSON.dump([MyClass.new, MyClass.new, MyClass.new, MyOther.new]) rescue nil
end
------------------------------

What's happening is that the C extension is iterating over the array to eventually dump it out to JSON. It's going through the array in order, appending to the fbuffer as needed. The problem is that that the API extension point of adding a to_json method to a class (or object), without wrapping the code in some sort of 'begin...rescue , free(buffer), re-raise' block results in the buffer never being freed. Normally this isn't too bad, except if a lot of data was appended to the buffer before the error got raised.

To test it against normal behaviour in the above script, take out the offending MyOther.new in the array. It should run much more smoothly this way :)

Note that since the fbuffers aren't GC marked (not that they should be), it isn't possible to trace this leak using GC.stat.

Once again, not sure if this is a bug or if we should never raise errors from custom to_json methods (ie: always wrap them in a begin... rescue block.

Thanks,

I also reported this to the JSON gem maintainer here: https://github.com/flori/json/issues/251



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [ruby-core:69601] [Ruby trunk - Bug #11264] [Third Party's Issue] Memory leak in JSON stdlib ext (JSON generation)
       [not found] <redmine.issue-11264.20150616025418@ruby-lang.org>
  2015-06-16  2:54 ` [ruby-core:69599] [Ruby trunk - Bug #11264] [Open] Memory leak in JSON stdlib ext (JSON generation) luke.gru
@ 2015-06-16  4:00 ` nobu
  2015-06-16 23:33 ` [ruby-core:69622] [Ruby trunk - Bug #11264] " luke.gru
  2015-06-17  0:53 ` [ruby-core:69623] " nobu
  3 siblings, 0 replies; 4+ messages in thread
From: nobu @ 2015-06-16  4:00 UTC (permalink / raw)
  To: ruby-core

Issue #11264 has been updated by Nobuyoshi Nakada.

Description updated
Status changed from Open to Third Party's Issue

Luke Gruber wrote:
> Once again, not sure if this is a bug or if we should never raise errors from custom to_json methods (ie: always wrap them in a begin... rescue block.

It doesn't work enough, as exceptions can raise even inside `cState_prepare_buffer()`.

----------------------------------------
Bug #11264: Memory leak in JSON stdlib ext (JSON generation)
https://bugs.ruby-lang.org/issues/11264#change-52942

* Author: Luke Gruber
* Status: Third Party's Issue
* Priority: Normal
* Assignee: 
* ruby -v: 2.2-head
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
Hi,

I'm not sure if this is a bug, or just undocumented behaviour, but here's a script to reproduce the memory leak:

~~~ruby
require 'json'

class MyClass
  def to_json(*)
    "a" * 1048576 # 1 megabytes of chars
  end
end

class MyOther
  def to_json(*)
    raise "OMG"
  end
end

1000.times do |i| # will leak up to ~ 4 gigs
  puts i
  JSON.dump([MyClass.new, MyClass.new, MyClass.new, MyOther.new]) rescue nil
end
~~~

What's happening is that the C extension is iterating over the array to eventually dump it out to JSON. It's going through the array in order, appending to the `fbuffer` as needed. The problem is that that the API extension point of adding a `to_json` method to a class (or object), without wrapping the code in some sort of 'begin...rescue , free(buffer), re-raise' block results in the buffer never being freed. Normally this isn't too bad, except if a lot of data was appended to the buffer before the error got raised.

To test it against normal behaviour in the above script, take out the offending `MyOther.new` in the array. It should run much more smoothly this way :)

Note that since the `fbuffer`s aren't GC marked (not that they should be), it isn't possible to trace this leak using `GC.stat`.

Once again, not sure if this is a bug or if we should never raise errors from custom `to_json` methods (ie: always wrap them in a begin... rescue block.

Thanks,

I also reported this to the JSON gem maintainer here: https://github.com/flori/json/issues/251



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [ruby-core:69622] [Ruby trunk - Bug #11264] Memory leak in JSON stdlib ext (JSON generation)
       [not found] <redmine.issue-11264.20150616025418@ruby-lang.org>
  2015-06-16  2:54 ` [ruby-core:69599] [Ruby trunk - Bug #11264] [Open] Memory leak in JSON stdlib ext (JSON generation) luke.gru
  2015-06-16  4:00 ` [ruby-core:69601] [Ruby trunk - Bug #11264] [Third Party's Issue] " nobu
@ 2015-06-16 23:33 ` luke.gru
  2015-06-17  0:53 ` [ruby-core:69623] " nobu
  3 siblings, 0 replies; 4+ messages in thread
From: luke.gru @ 2015-06-16 23:33 UTC (permalink / raw)
  To: ruby-core

Issue #11264 has been updated by Luke Gruber.


Thanks for getting back to me.

That's true, but my guess is the most common leak scenario would be when calling a method that's basically an extension point to the library itself (such as `to_json`. At least if you leak the internals of the buffer, it won't be too many bytes per leak :) Otherwise you would have to wrap the whole 'cState_partial_generate' method in a rescue block.

As the library stands now, lots of methods that raise will result in a leak, not just `to_json`:

Object#respond_to?(:to_json)
Object#to_s
Array#[]
Hash#keys
Hash#[]
String#encode

But the chances of those methods raising errors are much lower, I think.

Should we continue this here or on https://github.com/flori/json/issues/251 ? I don't know what the right approach is
for std libs that are maintained separately as gems.

----------------------------------------
Bug #11264: Memory leak in JSON stdlib ext (JSON generation)
https://bugs.ruby-lang.org/issues/11264#change-52965

* Author: Luke Gruber
* Status: Third Party's Issue
* Priority: Normal
* Assignee: 
* ruby -v: 2.2-head
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
Hi,

I'm not sure if this is a bug, or just undocumented behaviour, but here's a script to reproduce the memory leak:

~~~ruby
require 'json'

class MyClass
  def to_json(*)
    "a" * 1048576 # 1 megabytes of chars
  end
end

class MyOther
  def to_json(*)
    raise "OMG"
  end
end

1000.times do |i| # will leak up to ~ 4 gigs
  puts i
  JSON.dump([MyClass.new, MyClass.new, MyClass.new, MyOther.new]) rescue nil
end
~~~

What's happening is that the C extension is iterating over the array to eventually dump it out to JSON. It's going through the array in order, appending to the `fbuffer` as needed. The problem is that that the API extension point of adding a `to_json` method to a class (or object), without wrapping the code in some sort of 'begin...rescue , free(buffer), re-raise' block results in the buffer never being freed. Normally this isn't too bad, except if a lot of data was appended to the buffer before the error got raised.

To test it against normal behaviour in the above script, take out the offending `MyOther.new` in the array. It should run much more smoothly this way :)

Note that since the `fbuffer`s aren't GC marked (not that they should be), it isn't possible to trace this leak using `GC.stat`.

Once again, not sure if this is a bug or if we should never raise errors from custom `to_json` methods (ie: always wrap them in a begin... rescue block.

Thanks,

I also reported this to the JSON gem maintainer here: https://github.com/flori/json/issues/251



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [ruby-core:69623] [Ruby trunk - Bug #11264] Memory leak in JSON stdlib ext (JSON generation)
       [not found] <redmine.issue-11264.20150616025418@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2015-06-16 23:33 ` [ruby-core:69622] [Ruby trunk - Bug #11264] " luke.gru
@ 2015-06-17  0:53 ` nobu
  3 siblings, 0 replies; 4+ messages in thread
From: nobu @ 2015-06-17  0:53 UTC (permalink / raw)
  To: ruby-core

Issue #11264 has been updated by Nobuyoshi Nakada.


Luke Gruber wrote:
> Should we continue this here or on https://github.com/flori/json/issues/251 ? I don't know what the right approach is
> for std libs that are maintained separately as gems.

Yes, I've posted a patch there.

----------------------------------------
Bug #11264: Memory leak in JSON stdlib ext (JSON generation)
https://bugs.ruby-lang.org/issues/11264#change-52968

* Author: Luke Gruber
* Status: Third Party's Issue
* Priority: Normal
* Assignee: 
* ruby -v: 2.2-head
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
Hi,

I'm not sure if this is a bug, or just undocumented behaviour, but here's a script to reproduce the memory leak:

~~~ruby
require 'json'

class MyClass
  def to_json(*)
    "a" * 1048576 # 1 megabytes of chars
  end
end

class MyOther
  def to_json(*)
    raise "OMG"
  end
end

1000.times do |i| # will leak up to ~ 4 gigs
  puts i
  JSON.dump([MyClass.new, MyClass.new, MyClass.new, MyOther.new]) rescue nil
end
~~~

What's happening is that the C extension is iterating over the array to eventually dump it out to JSON. It's going through the array in order, appending to the `fbuffer` as needed. The problem is that that the API extension point of adding a `to_json` method to a class (or object), without wrapping the code in some sort of 'begin...rescue , free(buffer), re-raise' block results in the buffer never being freed. Normally this isn't too bad, except if a lot of data was appended to the buffer before the error got raised.

To test it against normal behaviour in the above script, take out the offending `MyOther.new` in the array. It should run much more smoothly this way :)

Note that since the `fbuffer`s aren't GC marked (not that they should be), it isn't possible to trace this leak using `GC.stat`.

Once again, not sure if this is a bug or if we should never raise errors from custom `to_json` methods (ie: always wrap them in a begin... rescue block.

Thanks,

I also reported this to the JSON gem maintainer here: https://github.com/flori/json/issues/251



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-06-17  0:31 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <redmine.issue-11264.20150616025418@ruby-lang.org>
2015-06-16  2:54 ` [ruby-core:69599] [Ruby trunk - Bug #11264] [Open] Memory leak in JSON stdlib ext (JSON generation) luke.gru
2015-06-16  4:00 ` [ruby-core:69601] [Ruby trunk - Bug #11264] [Third Party's Issue] " nobu
2015-06-16 23:33 ` [ruby-core:69622] [Ruby trunk - Bug #11264] " luke.gru
2015-06-17  0:53 ` [ruby-core:69623] " nobu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).