ruby-core@ruby-lang.org archive (unofficial mirror)
From: Eric Wong <normalperson@yhbt.net>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:86983] Re: [Ruby trunk Bug#14745] High memory usage when using String#replace with IO.copy_stream
Date: Fri, 11 May 2018 03:36:41 +0000
Message-ID: <20180511033641.GA4459@dcvr>
In-Reply-To: <redmine.issue-14745.20180509125009.1d902e8a4fa3dc01@ruby-lang.org>

janko.marohnic@gmail.com wrote:
> the memory usage has now doubled to 100MB at the end of the
> program, indicating that some string bytes weren't
> successfully deallocated. So, it seems that String#replace has
> different behaviour compared to String#clear + String#<<.

Yes, this is an unfortunate side effect of the copy-on-write
semantics of String#replace.  If the arg (other_str) for
String#replace is non-frozen, a new frozen string is created
using the existing malloc-ed pointer.  Both the receiver string
and other_str point to that new, shared string.
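A rough way to poke at the sharing described above, as a sketch only:
the variable names and the 1 MiB size are made up for illustration,
and the numbers ObjectSpace.memsize_of reports depend on the Ruby
version and implementation, so no particular output is claimed.  What
is guaranteed is the copy-on-write contract: the two strings stay
logically independent.

```ruby
require "objspace"

# Hypothetical names and sizes, chosen only for illustration.
src = "x" * (1024 * 1024)   # non-frozen, heap-allocated 1 MiB string
buf = String.new

buf.replace(src)            # buf and src may now share one hidden frozen string

# Reported sizes are implementation- and version-specific.
puts "src: #{ObjectSpace.memsize_of(src)} bytes"
puts "buf: #{ObjectSpace.memsize_of(buf)} bytes"

# Copy-on-write: appending to src must not be visible through buf.
src << "y"
puts "buf.bytesize = #{buf.bytesize}"
puts "src.bytesize = #{src.bytesize}"
```

The memory is only released once every string pointing at the shared
buffer has let go of it, which is why a long-lived receiver can keep a
large buffer alive.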

> I was *only* able to reproduce this with `IO.copy_stream`

I suspect part of this is because outbuf becomes a long-lived object
with IO.copy_stream (not sure), and the hidden frozen string
becomes long-lived, as well.

So yeah; well-intentioned optimizations end up hurting
when combined.

The other part could be that anything using IO#write could create
massive amounts of garbage before 2.5:
https://bugs.ruby-lang.org/issues/13085  And your copy_stream use
hits the IO#write case.  Unfortunately, the "fix" for [Bug
#13085] won't work effectively with shared strings, because we
can't free+recycle a string which wasn't created internally
by the VM.
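For reference, a minimal sketch of the String#clear + String#<<
variant mentioned in the report, used from a non-IO source passed to
IO.copy_stream.  ChunkSource and its chunk list are hypothetical and
not taken from the original report; the point is only that the bytes
are copied into the outbuf that copy_stream supplies, so no shared
string is created between the chunk and outbuf.

```ruby
require "stringio"

# Hypothetical source object; IO.copy_stream falls back to #read
# when the source does not respond to #readpartial.
class ChunkSource
  def initialize(chunks)
    @chunks = chunks.dup
  end

  def read(length = nil, outbuf = nil)
    chunk = @chunks.shift
    return nil if chunk.nil?   # nil signals EOF to copy_stream

    if outbuf
      outbuf.clear             # drop the previous contents
      outbuf << chunk          # copy bytes in, instead of outbuf.replace(chunk)
      outbuf
    else
      chunk
    end
  end
end

dst = StringIO.new
copied = IO.copy_stream(ChunkSource.new(["ab", "cd"]), dst)
puts "copied #{copied} bytes"
```

This trades an extra memcpy per chunk for not tying the lifetime of
each chunk's buffer to outbuf.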


Thread overview: 8+ messages
     [not found] <redmine.issue-14745.20180509125009@ruby-lang.org>
2018-05-09 12:50 ` [ruby-core:86954] [Ruby trunk Bug#14745] High memory usage when using String#replace with IO.copy_stream janko.marohnic
2018-05-11  3:36   ` Eric Wong [this message]
2018-05-11  4:26   ` [ruby-core:86987] " Eric Wong
2018-05-11  5:43   ` [ruby-core:86988] " Eric Wong
2018-05-16  9:33 ` [ruby-core:87075] " janko.marohnic
2018-05-16 10:07   ` [ruby-core:87077] " Eric Wong
2018-05-17 12:28 ` [ruby-core:87136] " janko.marohnic
2018-05-18 18:52   ` [ruby-core:87176] " Eric Wong
