From: Eric Wong
To: ruby-core@ruby-lang.org
Date: Wed, 16 May 2018 10:07:07 +0000
Message-ID: <20180516100707.GA16631@dcvr>
Subject: [ruby-core:87077] Re: [Ruby trunk Bug#14745] High memory usage when using String#replace with IO.copy_stream

janko.marohnic@gmail.com wrote:
> I wrote:
> > Finally, I always assumed your example is a contrived case and
> > you're dealing with an interface somewhere (not StringIO) which
> > doesn't accept a destination buffer for .read.
> The example was simplified for reproducing purposes.
> The place where I discovered this was in
> https://github.com/rocketjob/symmetric-encryption/pull/98 (I
> eventually managed to figure out `String#replace` was causing
> the high memory usage, so I switched to `String#clear`).
>
> In short, the `SymmetricEncryption::Reader` object wraps an IO
> object with encrypted content, and when calling `#read` it
> reads data from the underlying IO object, decrypts it and
> returns the decrypted data. So, it's not patching the lack of
> outbuf argument (because the underlying IO object *should*
> accept the outbuf argument), rather it provides an `IO#read`
> interface over incrementally decrypting IO object content.

I may be misreading(*), but it looks like @stream_cipher.update
can take a second destination arg (like IO#read and friends) and
maybe that helps... (that appears to be OpenSSL::Cipher#update)

If that's not usable somehow, I've sometimes wondered if adding
a String#exchange! method might help:

    foo = "hello............................"
    bar = "world............................"
    foo.exchange!(bar)
    p foo # => "world............................"
    p bar # => "hello............................"

In your case, you could still use String#clear with this:

    dst.clear
    dst.exchange!(tmp)
    # tmp is now '' because dst was cleared before

For non-embed strings, it would swap the pointers and be
zero-copy, but won't suffer the new-frozen-string-behind-your-back
problem of String#replace.

(*) I'm looking at your commit 8d41efacff26f3357016d6b611bee174802fba66
    after git clone-ing (since I don't do JS-infested web UIs)
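
A minimal sketch of the destination-buffer idea, assuming the cipher
really is an OpenSSL::Cipher (whose #update accepts an optional
buffer as its second argument). The helper name decrypt_stream, the
chunk size, and the AES-256-CBC demo setup are hypothetical and are
not taken from symmetric-encryption; the point is only that both
IO#read and Cipher#update can write into caller-supplied strings, so
String#replace is never needed:

```ruby
require "openssl"
require "stringio"

# Hypothetical helper (not part of symmetric-encryption): decrypt `src`
# into `dst` chunk by chunk, reusing the same two String buffers.
def decrypt_stream(src, dst, cipher, chunk_size: 16 * 1024)
  inbuf  = String.new(capacity: chunk_size) # refilled by IO#read(len, outbuf)
  outbuf = String.new                       # refilled by Cipher#update(data, buffer)
  # With a length given, #read returns nil at EOF, which ends the loop.
  while src.read(chunk_size, inbuf)
    cipher.update(inbuf, outbuf) # decrypted bytes land in outbuf
    dst.write(outbuf)
  end
  dst.write(cipher.final)        # flush the last (padded) block
  dst
end

# Demo setup (assumed values): encrypt some data, then stream-decrypt it.
key   = OpenSSL::Random.random_bytes(32)
iv    = OpenSSL::Random.random_bytes(16)
plain = "hello world " * 10_000

enc = OpenSSL::Cipher.new("aes-256-cbc")
enc.encrypt
enc.key = key
enc.iv  = iv
ciphertext = enc.update(plain) + enc.final

dec = OpenSSL::Cipher.new("aes-256-cbc")
dec.decrypt
dec.key = key
dec.iv  = iv

out = decrypt_stream(StringIO.new(ciphertext), StringIO.new(String.new), dec)
puts out.string.b == plain.b
```

Whether this fits SymmetricEncryption::Reader depends on how its
#read hands data back to the caller, so treat it as an illustration
of the two buffer-accepting calls rather than a drop-in patch.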