ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: tenderlove@ruby-lang.org
To: ruby-core@ruby-lang.org
Subject: [ruby-core:99184] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
Date: Wed, 15 Jul 2020 20:00:39 +0000 (UTC)	[thread overview]
Message-ID: <redmine.journal-86563.20200715200039.5550@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-17023.20200710175402.5550@ruby-lang.org

Issue #17023 has been updated by tenderlovemaking (Aaron Patterson).


larskanis (Lars Kanis) wrote in #note-7:
> So actually I was looking for something like `rb_obj_mark_unmovable()` to pin the string. FFI could call this function on all string pointers passed to C. In FFI we don't know how long the string pointer is in use in the C library, so that marking all argument strings as unmovable is over-pinning as well. But for sure this would pin way less strings than `RSTRING_PTR()`.

Right, that makes sense.  I really need to document this (and I apologize for not doing so already), but `rb_gc_register_address` will pin your objects.  When you know you're done with the reference, you can release it with `rb_gc_unregister_address`.  Of course if you don't call the unregister function, the reference will stay alive forever.

Currently the GC can't know for how long you want the reference to remain unmovable, so de-registering it is required.  In the code example you've presented, if the constant were to be removed or redefined, the string buffer reference would go bad without compaction involved, so I'm not sure it's "safe" to do either.

One thought I have is that FFI could ensure T_STRING objects are not embedded.  The underlying char buffer won't move, but the object still could.

Another thought: could FFI expose `rb_gc_register_address`?  The code could become something like:

``` ruby
class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  ffi_register_global(A = "")

  def initialize(count)
    A.replace "a" * count

    set(A)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{A})"
  end
end
```

I know this would require people to change their code, but then we would be better aware of the lifetime of the object.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86563

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

  parent reply	other threads:[~2020-07-15 20:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
2020-07-10 19:51 ` [ruby-core:99116] " chris
2020-07-10 21:11 ` [ruby-core:99118] " larskanis
2020-07-11  0:24 ` [ruby-core:99120] " duerst
2020-07-11 10:11 ` [ruby-core:99126] " nobu
2020-07-12 16:51 ` [ruby-core:99140] " tenderlove
2020-07-13 17:55 ` [ruby-core:99155] " tenderlove
2020-07-15 19:35 ` [ruby-core:99183] " larskanis
2020-07-15 20:00 ` tenderlove [this message]
2020-07-15 23:35   ` [ruby-core:99185] " Eric Wong
     [not found]     ` <647A6735-04C8-4410-A71A-307A2390CFBD@gmail.com>
2020-07-16  0:04       ` [ruby-core:99186] " Eric Wong
2020-07-19  2:52 ` [ruby-core:99218] " nagachika00
2020-07-21 20:18 ` [ruby-core:99254] " larskanis
2020-07-22  7:41 ` [ruby-core:99260] " hanmac
2020-07-22  9:46 ` [ruby-core:99262] " larskanis
2020-07-22 17:57 ` [ruby-core:99273] " tenderlove
2020-07-22 18:19 ` [ruby-core:99275] " eregontp
2020-07-22 18:23 ` [ruby-core:99276] " tenderlove
2020-07-22 19:47 ` [ruby-core:99280] " eregontp
2020-08-02 20:34 ` [ruby-core:99446] " larskanis
2020-08-18 18:14 ` [ruby-core:99630] " larskanis
2020-08-18 18:57 ` [ruby-core:99631] " tenderlove

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-86563.20200715200039.5550@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).