From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-Status: No, score=-3.0 required=3.0 tests=AWL,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from neon.ruby-lang.org (neon.ruby-lang.org [221.186.184.75]) by dcvr.yhbt.net (Postfix) with ESMTP id 09DDD1F5AE for ; Sun, 2 Aug 2020 20:34:59 +0000 (UTC) Received: from neon.ruby-lang.org (localhost [IPv6:::1]) by neon.ruby-lang.org (Postfix) with ESMTP id A14FB120929; Mon, 3 Aug 2020 05:34:26 +0900 (JST) Received: from xtrwkhkc.outbound-mail.sendgrid.net (xtrwkhkc.outbound-mail.sendgrid.net [167.89.16.28]) by neon.ruby-lang.org (Postfix) with ESMTPS id B391212091F for ; Mon, 3 Aug 2020 05:34:24 +0900 (JST) Received: by filterdrecv-p3iad2-d8cc6d457-jqj7p with SMTP id filterdrecv-p3iad2-d8cc6d457-jqj7p-18-5F272368-B 2020-08-02 20:34:48.203368507 +0000 UTC m=+354390.034504093 Received: from herokuapp.com (unknown) by ismtpd0045p1mdw1.sendgrid.net (SG) with ESMTP id t5TzqqEDQjiZyfw16uFKUA for ; Sun, 02 Aug 2020 20:34:48.050 +0000 (UTC) Date: Sun, 02 Aug 2020 20:34:48 +0000 (UTC) From: larskanis@gmail.com Message-ID: References: Mime-Version: 1.0 X-Redmine-MailingListIntegration-Message-Ids: 75276 X-Redmine-Project: ruby-master X-Redmine-Issue-Tracker: Bug X-Redmine-Issue-Id: 17023 X-Redmine-Issue-Author: larskanis X-Redmine-Issue-Assignee: tenderlovemaking X-Redmine-Sender: larskanis X-Mailer: Redmine X-Redmine-Host: bugs.ruby-lang.org X-Redmine-Site: Ruby Issue Tracking System X-Auto-Response-Suppress: All Auto-Submitted: auto-generated X-SG-EID: =?us-ascii?Q?+E+TGAYxgAqNvT2Hc3jgthCyK1YJRpYNPBZWWqZMzpWrYE0rhsvYHve1GExD0i?= =?us-ascii?Q?Vnx0ktAI1SppWk7mZBdGVbl89OOpi+FjrE=2F6Nni?= =?us-ascii?Q?onEMiicB7X5ICxKlLnV1yILT5YSmTtWjJRq2PrT?= =?us-ascii?Q?GaeZr0bFlVf=2FboHL0SH9JciLVAb48mAKLHRaB0T?= =?us-ascii?Q?7NZMWDLWCqze4xuSS+2I1qh5U4tLtNKB1qYo4UR?= =?us-ascii?Q?StJiURnp6PNnxFoT0=3D?= To: ruby-core@ruby-lang.org X-ML-Name: ruby-core X-Mail-Count: 99446 Subject: [ruby-core:99446] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi X-BeenThere: ruby-core@ruby-lang.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Ruby developers List-Id: Ruby developers List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: ruby-core-bounces@ruby-lang.org Sender: "ruby-core" Issue #17023 has been updated by larskanis (Lars Kanis). I think there are several options to solve this issue: 1. Change `RSTRING_PTR()` to pin the string to a fixed memory location. 2. Add a new function kind of `rb_obj_mark_unmovable()` to ruby's C-API that pins string memory without keeping the string alive. 3. Expose `rb_gc_register_address()` and `rb_gc_unregister_address()` to ruby and require users to explicit mark objects to be in use. 4. Change the FFI contract, in such a way, that a parameter passed as `String` might not be accessed after the C call returns. As an alternative, `FFI::MemoryPointer` can be used, which works already without any modifications. As mentioned above the issue is a very rare situation. An invalid access only happens in case of a parameter passed as String has been stored by a C function, `GC.compact` is then called somewhere in the application, the String in question is actually moved and is subsequently accessed on C level. In fact it is that rare, that we didn't receive a single bug report for this situation, but the example above is just constructed. Nevertheless I would like to fix it in a proactive manner, instead of keeping it as an undocumented Heisenbug. Given the rare probability this issue comes up, I don't think it's acceptable to get a measurable slowdown by the solution, however. To 1: I don't think it's necessary to change `RSTRING_PTR()`. It is a function used by almost every C extension, but this particular issue is very specific to FFI. Only FFI depends on indirect marking of objects as a core feature. Even Fiddle is much less affected, since it requires the user to explicit allocate and deallocate memory by default and doesn't allow passing ruby strings as parameters. So we don't need a transparent solution, but only something that can be build into FFI and that can be used explicit where necessary. Also if this change slows down `RSTRING_PTR()` it would slow down all C-extensions. To 2: This is my favorite: I would like to call a Ruby-API function from FFI's C-extension that marks the passed String object as unmovable without keeping it alive. So the pinning should be for the rest of the lifetime of the String object. Since this function must be called for every String passed to a FFI function, it must be fast. So copying the string to non-embedded memory is no good option, since https://github.com/ffi/ffi/pull/811 showed, that it results in a 70% slowdown for each and every string passed to C - regardless how it will be used afterwards. To 3: As mentioned earlier I don't like this solution. It doesn't fit to the FFI API - it is ugly. The object must be explicit unregistered in contrast to most other FFI objects. Since the object must be stored in a Ruby variable for calling `rb_gc_unregister_address()`, the object is always marked twice - once by the the mark list and once by the ruby variable. So both references have to be released to get the object GC'ed. It's hard to describe when and how this function has to be called. To 4: If 2. isn't possible to implement, I would prefer this one. Since values on the stack are not moved, it should be safe to use String objects by the C function, even if the function releases the GVL and `GC.compact` is called in the meanwhile. And when the function returns, any access to the pointer is prohibited. @tenderlovemaking Is it possible to implement something like 2? ---------------------------------------- Bug #17023: How to prevent String memory to be relocated in ruby-ffi https://bugs.ruby-lang.org/issues/17023#change-86902 * Author: larskanis (Lars Kanis) * Status: Closed * Priority: Normal * Assignee: tenderlovemaking (Aaron Patterson) * ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux] * Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE ---------------------------------------- [ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location. This example shows the situation and that the string is relocated although it is still referenced in ruby code: ```ruby File.write "string-relocate.c", <<-EOC static char *g_str; void set(char* str) { g_str = str; } char* get() { return g_str; } EOC system "gcc -shared -fPIC string-relocate.c -o string-relocate.so" require 'ffi' class Foo extend FFI::Library ffi_lib File.expand_path('string-relocate.so') attach_function :set, [:string], :void attach_function :get, [], :string def initialize(count) proc {} # necessary to trigger relocation a = "a" * count set(a) GC.verify_compaction_references(toward: :empty, double_heap: true) puts "get(#{count}): #{get} (should be: #{a})" end end Foo.new(23) Foo.new(24) ``` The output looks like so on ruby-2.7.1: ``` get(23): (should be: aaaaaaaaaaaaaaaaaaaaaaa) get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa) ``` So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented? A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it. ---Files-------------------------------- string-relocate.rb (653 Bytes) 0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB) -- https://bugs.ruby-lang.org/