ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: luke.gru@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:69622] [Ruby trunk - Bug #11264] Memory leak in JSON stdlib ext (JSON generation)
Date: Tue, 16 Jun 2015 23:33:10 +0000	[thread overview]
Message-ID: <redmine.journal-52965.20150616233308.9835cb8b696559b3@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-11264.20150616025418@ruby-lang.org

Issue #11264 has been updated by Luke Gruber.


Thanks for getting back to me.

That's true, but my guess is the most common leak scenario would be when calling a method that's basically an extension point to the library itself (such as `to_json`. At least if you leak the internals of the buffer, it won't be too many bytes per leak :) Otherwise you would have to wrap the whole 'cState_partial_generate' method in a rescue block.

As the library stands now, lots of methods that raise will result in a leak, not just `to_json`:

Object#respond_to?(:to_json)
Object#to_s
Array#[]
Hash#keys
Hash#[]
String#encode

But the chances of those methods raising errors are much lower, I think.

Should we continue this here or on https://github.com/flori/json/issues/251 ? I don't know what the right approach is
for std libs that are maintained separately as gems.

----------------------------------------
Bug #11264: Memory leak in JSON stdlib ext (JSON generation)
https://bugs.ruby-lang.org/issues/11264#change-52965

* Author: Luke Gruber
* Status: Third Party's Issue
* Priority: Normal
* Assignee: 
* ruby -v: 2.2-head
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
Hi,

I'm not sure if this is a bug, or just undocumented behaviour, but here's a script to reproduce the memory leak:

~~~ruby
require 'json'

class MyClass
  def to_json(*)
    "a" * 1048576 # 1 megabytes of chars
  end
end

class MyOther
  def to_json(*)
    raise "OMG"
  end
end

1000.times do |i| # will leak up to ~ 4 gigs
  puts i
  JSON.dump([MyClass.new, MyClass.new, MyClass.new, MyOther.new]) rescue nil
end
~~~

What's happening is that the C extension is iterating over the array to eventually dump it out to JSON. It's going through the array in order, appending to the `fbuffer` as needed. The problem is that that the API extension point of adding a `to_json` method to a class (or object), without wrapping the code in some sort of 'begin...rescue , free(buffer), re-raise' block results in the buffer never being freed. Normally this isn't too bad, except if a lot of data was appended to the buffer before the error got raised.

To test it against normal behaviour in the above script, take out the offending `MyOther.new` in the array. It should run much more smoothly this way :)

Note that since the `fbuffer`s aren't GC marked (not that they should be), it isn't possible to trace this leak using `GC.stat`.

Once again, not sure if this is a bug or if we should never raise errors from custom `to_json` methods (ie: always wrap them in a begin... rescue block.

Thanks,

I also reported this to the JSON gem maintainer here: https://github.com/flori/json/issues/251



-- 
https://bugs.ruby-lang.org/

  parent reply	other threads:[~2015-06-16 23:10 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <redmine.issue-11264.20150616025418@ruby-lang.org>
2015-06-16  2:54 ` [ruby-core:69599] [Ruby trunk - Bug #11264] [Open] Memory leak in JSON stdlib ext (JSON generation) luke.gru
2015-06-16  4:00 ` [ruby-core:69601] [Ruby trunk - Bug #11264] [Third Party's Issue] " nobu
2015-06-16 23:33 ` luke.gru [this message]
2015-06-17  0:53 ` [ruby-core:69623] [Ruby trunk - Bug #11264] " nobu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-52965.20150616233308.9835cb8b696559b3@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).