ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
@ 2020-07-10 17:54 larskanis
  2020-07-10 19:51 ` [ruby-core:99116] " chris
                   ` (18 more replies)
  0 siblings, 19 replies; 22+ messages in thread
From: larskanis @ 2020-07-10 17:54 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been reported by larskanis (Lars Kanis).

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023

* Author: larskanis (Lars Kanis)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99116] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
@ 2020-07-10 19:51 ` chris
  2020-07-10 21:11 ` [ruby-core:99118] " larskanis
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: chris @ 2020-07-10 19:51 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by chrisseaton (Chris Seaton).


> The user has to ensure on Ruby level that the string isn't GC'ed

The contract has always been more than that - the user must also not modify the string, as this may cause the character pointer to be reallocated.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86491

* Author: larskanis (Lars Kanis)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99118] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
  2020-07-10 19:51 ` [ruby-core:99116] " chris
@ 2020-07-10 21:11 ` larskanis
  2020-07-11  0:24 ` [ruby-core:99120] " duerst
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: larskanis @ 2020-07-10 21:11 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by larskanis (Lars Kanis).


@chrisseaton Exactly, the string is considered `const char *` as mentioned here for the `:string` type: https://github.com/ffi/ffi/wiki/Types#user-content-types

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86493

* Author: larskanis (Lars Kanis)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99120] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
  2020-07-10 19:51 ` [ruby-core:99116] " chris
  2020-07-10 21:11 ` [ruby-core:99118] " larskanis
@ 2020-07-11  0:24 ` duerst
  2020-07-11 10:11 ` [ruby-core:99126] " nobu
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: duerst @ 2020-07-11  0:24 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by duerst (Martin Dürst).

Assignee set to tenderlovemaking (Aaron Patterson)
Status changed from Open to Assigned

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86496

* Author: larskanis (Lars Kanis)
* Status: Assigned
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99126] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (2 preceding siblings ...)
  2020-07-11  0:24 ` [ruby-core:99120] " duerst
@ 2020-07-11 10:11 ` nobu
  2020-07-12 16:51 ` [ruby-core:99140] " tenderlove
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: nobu @ 2020-07-11 10:11 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by nobu (Nobuyoshi Nakada).

Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
Status changed from Assigned to Closed

As it doesn't seem to occur in the master but only in 2.7, I'll close this and set "Backport to 2.7" required, although I'm not sure if it is a bug.
Aaron, could you tell which commit fixed this?

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86503

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99140] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (3 preceding siblings ...)
  2020-07-11 10:11 ` [ruby-core:99126] " nobu
@ 2020-07-12 16:51 ` tenderlove
  2020-07-13 17:55 ` [ruby-core:99155] " tenderlove
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: tenderlove @ 2020-07-12 16:51 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by tenderlovemaking (Aaron Patterson).


Stack scanning should prevent `a` from moving, so you shouldn’t need to do anything to pin this.  I will also try this against the master branch, and if it’s fixed there I’ll find the commit that fixed it.  Regardless objects that are on the stack should be pinned, so I’m not sure off the top of my head why this object would move.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86518

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99155] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (4 preceding siblings ...)
  2020-07-12 16:51 ` [ruby-core:99140] " tenderlove
@ 2020-07-13 17:55 ` tenderlove
  2020-07-15 19:35 ` [ruby-core:99183] " larskanis
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: tenderlove @ 2020-07-13 17:55 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by tenderlovemaking (Aaron Patterson).

File 0001-Only-marked-objects-should-be-considered-movable.patch added

This is fixed in 6e7e7c1e577d6c2276e9a8cc85c28c55c46c2618.  I tried cherry picking the commit to ruby_2_7, but got a conflict.  I've attached the patch with the conflict resolved.

Thanks for reporting this!

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86534

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99183] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (5 preceding siblings ...)
  2020-07-13 17:55 ` [ruby-core:99155] " tenderlove
@ 2020-07-15 19:35 ` larskanis
  2020-07-15 20:00 ` [ruby-core:99184] " tenderlove
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: larskanis @ 2020-07-15 19:35 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by larskanis (Lars Kanis).


Thank you Aaron for looking into this! The patch avoids that the particular string is relocated, but unfortunately the patch is not sufficient to avoid relocation of any strings given to FFI. A slightly modified version of the script fails on ruby-2.7 and on master:

```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  A = ""

  def initialize(count)
    A.replace "a" * count

    set(A)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{A})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output is something like:
```
get(23): h8�U (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

In fact we recommend using a constant for this kind of usage since years: https://github.com/ffi/ffi/wiki/Core-Concepts#string-memory-allocation

I don't think there is a generic way to fix this issue other than pinning all strings on `RSTRING_PTR()` usage. However `RSTRING_PTR()` would be massive over-pinning, since the pointer is most often used only for a very short time frame and the string is movable again afterwards.

So actually I was looking for something like `rb_obj_mark_unmovable()` to pin the string. FFI could call this function on all string pointers passed to C. In FFI we don't know how long the string pointer is in use in the C library, so that marking all argument strings as unmovable is over-pinning as well. But for sure this would pin way less strings than `RSTRING_PTR()`.


----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86562

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99184] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (6 preceding siblings ...)
  2020-07-15 19:35 ` [ruby-core:99183] " larskanis
@ 2020-07-15 20:00 ` tenderlove
  2020-07-15 23:35   ` [ruby-core:99185] " Eric Wong
  2020-07-19  2:52 ` [ruby-core:99218] " nagachika00
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 22+ messages in thread
From: tenderlove @ 2020-07-15 20:00 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by tenderlovemaking (Aaron Patterson).


larskanis (Lars Kanis) wrote in #note-7:
> So actually I was looking for something like `rb_obj_mark_unmovable()` to pin the string. FFI could call this function on all string pointers passed to C. In FFI we don't know how long the string pointer is in use in the C library, so that marking all argument strings as unmovable is over-pinning as well. But for sure this would pin way less strings than `RSTRING_PTR()`.

Right, that makes sense.  I really need to document this (and I apologize for not doing so already), but `rb_gc_register_address` will pin your objects.  When you know you're done with the reference, you can release it with `rb_gc_unregister_address`.  Of course if you don't call the unregister function, the reference will stay alive forever.

Currently the GC can't know for how long you want the reference to remain unmovable, so de-registering it is required.  In the code example you've presented, if the constant were to be removed or redefined, the string buffer reference would go bad without compaction involved, so I'm not sure it's "safe" to do either.

One thought I have is that FFI could ensure T_STRING objects are not embedded.  The underlying char buffer won't move, but the object still could.

Another thought: could FFI expose `rb_gc_register_address`?  The code could become something like:

``` ruby
class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  ffi_register_global(A = "")

  def initialize(count)
    A.replace "a" * count

    set(A)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{A})"
  end
end
```

I know this would require people to change their code, but then we would be better aware of the lifetime of the object.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86563

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99185] Re: [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-15 20:00 ` [ruby-core:99184] " tenderlove
@ 2020-07-15 23:35   ` Eric Wong
       [not found]     ` <647A6735-04C8-4410-A71A-307A2390CFBD@gmail.com>
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Wong @ 2020-07-15 23:35 UTC (permalink / raw)
  To: ruby-core; +Cc: unicorn-public

tenderlove@ruby-lang.org wrote:
> Right, that makes sense.  I really need to document this (and
> I apologize for not doing so already), but
> `rb_gc_register_address` will pin your objects.  When you know
> you're done with the reference, you can release it with
> `rb_gc_unregister_address`.  Of course if you don't call the
> unregister function, the reference will stay alive forever.

Btw, does rb_gc_register_mark_object pin?  A quick glance at
gc.c tells me it doesn't, and I'll need to revert commit
2a6cb76d5010cb763ef5a2c305728465d15eb7c9 in unicorn:
https://yhbt.net/unicorn-public/20181226050857.6413-1-e@80x24.org/

Anyways, it takes me too long to compile Ruby so I'm back to
running whatever my distro ships.  I haven't been able to
test GC.compact at all.

> https://bugs.ruby-lang.org/issues/17023#change-86563

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99186] Re: [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
       [not found]     ` <647A6735-04C8-4410-A71A-307A2390CFBD@gmail.com>
@ 2020-07-16  0:04       ` Eric Wong
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Wong @ 2020-07-16  0:04 UTC (permalink / raw)
  To: Aaron Patterson; +Cc: ruby-core, unicorn-public

Aaron Patterson <aaron.patterson@gmail.com> wrote:
> > On Jul 15, 2020, at 4:35 PM, Eric Wong <normalperson@yhbt.net> wrote:
> > Btw, does rb_gc_register_mark_object pin?  A quick glance at
> > gc.c tells me it doesn't, and I'll need to revert commit
> > 2a6cb76d5010cb763ef5a2c305728465d15eb7c9 in unicorn:
> > https://yhbt.net/unicorn-public/20181226050857.6413-1-e@80x24.org/
> 
> Yes, it does pin.  I’m not super proud of this code, but here
> is where objects passed to rb_gc_register_mark_object get
> pinned:
> 
>   https://github.com/ruby/ruby/blob/c2a6295ec04a191c689d22254ac1ad5d665e27ad/vm.c#L2307-L2320

Ah, missed that, thanks.  So all is well in unicorn (for now).

> I don’t know why the mark object array is an array of arrays
> (I assume so as not to waste space in the array buffer?).
> Maybe this could be a more friendly data structure.

It seems so, since rb_ary_push can double the internal buffer.
Anyways, ko1 made commit
a617afc643e17cea5c1960a60adcd5ffb25fdfe2 r46714
Maybe it can be flattened nowadays and rely on GC.compact
to reclaim unused space.

/me goes back to hacking in Perl 5...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99218] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (7 preceding siblings ...)
  2020-07-15 20:00 ` [ruby-core:99184] " tenderlove
@ 2020-07-19  2:52 ` nagachika00
  2020-07-21 20:18 ` [ruby-core:99254] " larskanis
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: nagachika00 @ 2020-07-19  2:52 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE

ruby_2_7 e619178e52250ceda3a0fe32ff5addb16617b58c merged revision(s) 6e7e7c1e577d6c2276e9a8cc85c28c55c46c2618.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86599

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99254] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (8 preceding siblings ...)
  2020-07-19  2:52 ` [ruby-core:99218] " nagachika00
@ 2020-07-21 20:18 ` larskanis
  2020-07-22  7:41 ` [ruby-core:99260] " hanmac
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: larskanis @ 2020-07-21 20:18 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by larskanis (Lars Kanis).


Sorry for not responding earlier! Unfortunately this issue still remains. So it is not closed and the backported commit doesn't fix it in any way.

I tried Aarons hint to change all passed strings to non-embedded content. The result is here: https://github.com/ffi/ffi/pull/811

Unfortunately this adds a significant slowdown (up to 70%) and is not suitable for frozen strings. But frozen strings are equally relocated. These both issues makes the PR a no-go.

Regarding `rb_gc_register_address()`: Enqueuing every string value passed to C would keep them alive forever, but we don't know how long the string is in use by the C library. We'd need some kind of finalizer hook to unregister them. But the finalizer is not called until `rb_gc_unregister_address()` was called.

Regarding `ffi_register_global`: This could solve the issue, but establishes a new API which is mostly superfluous and breaks the current contract. Both causes (`GC.compact` and long string usage in C) are rarely used in combination. But only if they're used together, this new call would be needed. This is quite hard to explain to people.

We would need a function that pins the VALUE for it's further lifetime, but without keeping it alive.


----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86638

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99260] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (9 preceding siblings ...)
  2020-07-21 20:18 ` [ruby-core:99254] " larskanis
@ 2020-07-22  7:41 ` hanmac
  2020-07-22  9:46 ` [ruby-core:99262] " larskanis
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: hanmac @ 2020-07-22  7:41 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by Hanmac (Hans Mackowiak).


the problem there is that ruby can't know how long the C lib is going to use the String. Depending on the C Function, might even take "ownership" of the string and does free it itself, so when Ruby would free the string too, it causes a Double Free crash.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86644

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99262] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (10 preceding siblings ...)
  2020-07-22  7:41 ` [ruby-core:99260] " hanmac
@ 2020-07-22  9:46 ` larskanis
  2020-07-22 17:57 ` [ruby-core:99273] " tenderlove
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: larskanis @ 2020-07-22  9:46 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by larskanis (Lars Kanis).


@Hanmac The string must be kept as a ruby object in an instance variable or constant, etc. and must not be modified as long as it's in use in the C library. Similar on the C side the string must not be modified and freeing it is not allowed. The details are the same since 10 years and are described here: https://github.com/ffi/ffi/wiki/Core-Concepts#memory-management Now ruby-2.7+ breaks this contract in some cases by moving the string value around.

Raw String usage for `:pointer` or `:string` arguments is the fastest way to transfer data buffers from ruby to C. `FFI::MemoryPointer` is a more flexible alternative, but is way slower.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86648

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99273] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (11 preceding siblings ...)
  2020-07-22  9:46 ` [ruby-core:99262] " larskanis
@ 2020-07-22 17:57 ` tenderlove
  2020-07-22 18:19 ` [ruby-core:99275] " eregontp
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: tenderlove @ 2020-07-22 17:57 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by tenderlovemaking (Aaron Patterson).


One idea I had is that we could make `RSTRING_PTR`, `StringValueCStr`, `StringValuePtr`, etc ensure that the character buffer it returns is not embedded in the object.  So, if it's an embedded string, move it to `malloc`.  We could introduce "unsafe" functions that don't do that, then change MRI internals to use the unsafe versions.  This seems like a big change, but I'm struggling to think of something smaller.

Another thought is rather than exposing `rb_gc_register_address`, just implement an object that has one reference and calls `rb_gc_mark` on the reference.  Again, this would have to be a change for FFI's API, but I think it's worthwhile since anyone could just `const_set(:A, nil)` and the string would get collected and the saved char* pointer would go bad. (Obviously people probably aren't going to do that `const_set` thing, but relying on an indirect reference to keep your object alive is asking for trouble)

I'm willing to do the work to introduce "unsafe" versions of the string buffer APIs if we think it would be worthwhile and if we can't find a smaller solution.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86661

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99275] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (12 preceding siblings ...)
  2020-07-22 17:57 ` [ruby-core:99273] " tenderlove
@ 2020-07-22 18:19 ` eregontp
  2020-07-22 18:23 ` [ruby-core:99276] " tenderlove
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: eregontp @ 2020-07-22 18:19 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by Eregon (Benoit Daloze).


Maybe FFI should discourage cases where the native function saves the RSTRING_PTR() and still use it after the native call returned?
That seems quite dangerous if e.g. the Ruby String is mutable.
Is it commonly used, do we have a real example?

Maybe we need a pin function that doesn't mind if the String is frozen?
After all pinning doesn't change the String contents.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86663

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99276] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (13 preceding siblings ...)
  2020-07-22 18:19 ` [ruby-core:99275] " eregontp
@ 2020-07-22 18:23 ` tenderlove
  2020-07-22 19:47 ` [ruby-core:99280] " eregontp
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: tenderlove @ 2020-07-22 18:23 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by tenderlovemaking (Aaron Patterson).


Eregon (Benoit Daloze) wrote in #note-15:
> Maybe we need a pin function that doesn't mind if the String is frozen?
> After all pinning doesn't change the String contents.

I think Lars was having trouble expanding strings that are frozen, not pinning them.  I suggested that strings could be expanded such that they cannot be embedded in the Ruby object, and this technique doesn't work on frozen strings.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86664

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99280] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (14 preceding siblings ...)
  2020-07-22 18:23 ` [ruby-core:99276] " tenderlove
@ 2020-07-22 19:47 ` eregontp
  2020-08-02 20:34 ` [ruby-core:99446] " larskanis
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 22+ messages in thread
From: eregontp @ 2020-07-22 19:47 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by Eregon (Benoit Daloze).


tenderlovemaking (Aaron Patterson) wrote in #note-16:
> I think Lars was having trouble expanding strings that are frozen, not pinning them.  I suggested that strings could be expanded such that they cannot be embedded in the Ruby object, and this technique doesn't work on frozen strings.

Right, what I wanted to mean is "pinning the RSTRING_PTR", so a new function to expand the char* so it's not embedded, and that would also work for frozen Strings.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86668

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99446] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (15 preceding siblings ...)
  2020-07-22 19:47 ` [ruby-core:99280] " eregontp
@ 2020-08-02 20:34 ` larskanis
  2020-08-18 18:14 ` [ruby-core:99630] " larskanis
  2020-08-18 18:57 ` [ruby-core:99631] " tenderlove
  18 siblings, 0 replies; 22+ messages in thread
From: larskanis @ 2020-08-02 20:34 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by larskanis (Lars Kanis).


I think there are several options to solve this issue:

1. Change `RSTRING_PTR()` to pin the string to a fixed memory location.
2. Add a new function kind of `rb_obj_mark_unmovable()` to ruby's C-API that pins string memory without keeping the string alive.
3. Expose `rb_gc_register_address()` and `rb_gc_unregister_address()` to ruby and require users to explicit mark objects to be in use.
4. Change the FFI contract, in such a way, that a parameter passed as `String` might not be accessed after the C call returns. As an alternative, `FFI::MemoryPointer` can be used, which works already without any modifications.

As mentioned above the issue is a very rare situation. An invalid access only happens in case of a parameter passed as String has been stored by a C function, `GC.compact` is then called somewhere in the application, the String in question is actually moved and is subsequently accessed on C level. In fact it is that rare, that we didn't receive a single bug report for this situation, but the example above is just constructed. Nevertheless I would like to fix it in a proactive manner, instead of keeping it as an undocumented Heisenbug. Given the rare probability this issue comes up, I don't think it's acceptable to get a measurable slowdown by the solution, however.

To 1: I don't think it's necessary to change `RSTRING_PTR()`. It is a function used by almost every C extension, but this particular issue is very specific to FFI. Only FFI depends on indirect marking of objects as a core feature. Even Fiddle is much less affected, since it requires the user to explicit allocate and deallocate memory by default and doesn't allow passing ruby strings as parameters. So we don't need a transparent solution, but only something that can be build into FFI and that can be used explicit where necessary. Also if this change slows down `RSTRING_PTR()` it would slow down all C-extensions.

To 2: This is my favorite: I would like to call a Ruby-API function from FFI's C-extension that marks the passed String object as unmovable without keeping it alive. So the pinning should be for the rest of the lifetime of the String object. Since this function must be called for every String passed to a FFI function, it must be fast. So copying the string to non-embedded memory is no good option, since https://github.com/ffi/ffi/pull/811 showed, that it results in a 70% slowdown for each and every string passed to C - regardless how it will be used afterwards.

To 3: As mentioned earlier I don't like this solution. It doesn't fit to the FFI API - it is ugly. The object must be explicit unregistered in contrast to most other FFI objects. Since the object must be stored in a Ruby variable for calling `rb_gc_unregister_address()`, the object is always marked twice - once by the the mark list and once by the ruby variable. So both references have to be released to get the object GC'ed. It's hard to describe when and how this function has to be called.

To 4: If 2. isn't possible to implement, I would prefer this one. Since values on the stack are not moved, it should be safe to use String objects by the C function, even if the function releases the GVL and `GC.compact` is called in the meanwhile. And when the function returns, any access to the pointer is prohibited.

@tenderlovemaking Is it possible to implement something like 2?


----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-86902

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99630] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (16 preceding siblings ...)
  2020-08-02 20:34 ` [ruby-core:99446] " larskanis
@ 2020-08-18 18:14 ` larskanis
  2020-08-18 18:57 ` [ruby-core:99631] " tenderlove
  18 siblings, 0 replies; 22+ messages in thread
From: larskanis @ 2020-08-18 18:14 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by larskanis (Lars Kanis).


@tenderlovemaking The issue in FFI is still unsolved. Could you please have a look at the questions above? Or may I open a new issue for discussion?

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-87111

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:99631] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi
  2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
                   ` (17 preceding siblings ...)
  2020-08-18 18:14 ` [ruby-core:99630] " larskanis
@ 2020-08-18 18:57 ` tenderlove
  18 siblings, 0 replies; 22+ messages in thread
From: tenderlove @ 2020-08-18 18:57 UTC (permalink / raw)
  To: ruby-core

Issue #17023 has been updated by tenderlovemaking (Aaron Patterson).


larskanis (Lars Kanis) wrote in #note-19:
> @tenderlovemaking The issue in FFI is still unsolved. Could you please have a look at the questions above? Or may I open a new issue for discussion?

Hi Lars,

Sorry I totally missed your last email.  The main issue with solution 2 is that there is no place to store that information.  The GC doesn't keep any persistent data about objects between (major) GC cycles.  To support 2 we would have to store the data in the string itself, and I'm not sure if we have header bits available to do that.

The second problem with solution 2 is that this is not a problem unique just to strings.  If an FFI extension holds a reference to any VALUE (maybe by stuffing it in a void* or something) and doesn't rb_gc_mark the value, the GC will assume it is safe to move that object.  Fiddle has the same problem, so I introduced a "pinning reference" here:

  https://github.com/ruby/fiddle/pull/44

All it does is hold a reference to the object and call `rb_gc_mark` on the reference at mark time.

I think adding something like ^^^ to the FFI gem and asking users to wrap objects they know will hold references is the safest solution.  As you mention, this issue only impacts objects that are stored off the stack, so it should be pretty rare for anyone to need this.  Another idea I had was to figure out some way to have FFI automatically wrap constants with the above proxy object, but I didn't get that far.

----------------------------------------
Bug #17023: How to prevent String memory to be relocated in ruby-ffi
https://bugs.ruby-lang.org/issues/17023#change-87112

* Author: larskanis (Lars Kanis)
* Status: Closed
* Priority: Normal
* Assignee: tenderlovemaking (Aaron Patterson)
* ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
----------------------------------------
[ruby-ffi](https://github.com/ffi/ffi) allows to pass String objects to C by using the `:string` argument type. This way the string memory returned by `RSTRING_PTR` is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced `GC.compact`, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:
```ruby
File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)
```

The output looks like so on ruby-2.7.1:
```
get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)
```

So using `GC.compact` while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use `rb_gc_mark()` in order to pin the VALUE to a memory location. But I couldn't find a way to pin a `VALUE` at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


---Files--------------------------------
string-relocate.rb (653 Bytes)
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-08-18 18:57 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-10 17:54 [ruby-core:99115] [Ruby master Bug#17023] How to prevent String memory to be relocated in ruby-ffi larskanis
2020-07-10 19:51 ` [ruby-core:99116] " chris
2020-07-10 21:11 ` [ruby-core:99118] " larskanis
2020-07-11  0:24 ` [ruby-core:99120] " duerst
2020-07-11 10:11 ` [ruby-core:99126] " nobu
2020-07-12 16:51 ` [ruby-core:99140] " tenderlove
2020-07-13 17:55 ` [ruby-core:99155] " tenderlove
2020-07-15 19:35 ` [ruby-core:99183] " larskanis
2020-07-15 20:00 ` [ruby-core:99184] " tenderlove
2020-07-15 23:35   ` [ruby-core:99185] " Eric Wong
     [not found]     ` <647A6735-04C8-4410-A71A-307A2390CFBD@gmail.com>
2020-07-16  0:04       ` [ruby-core:99186] " Eric Wong
2020-07-19  2:52 ` [ruby-core:99218] " nagachika00
2020-07-21 20:18 ` [ruby-core:99254] " larskanis
2020-07-22  7:41 ` [ruby-core:99260] " hanmac
2020-07-22  9:46 ` [ruby-core:99262] " larskanis
2020-07-22 17:57 ` [ruby-core:99273] " tenderlove
2020-07-22 18:19 ` [ruby-core:99275] " eregontp
2020-07-22 18:23 ` [ruby-core:99276] " tenderlove
2020-07-22 19:47 ` [ruby-core:99280] " eregontp
2020-08-02 20:34 ` [ruby-core:99446] " larskanis
2020-08-18 18:14 ` [ruby-core:99630] " larskanis
2020-08-18 18:57 ` [ruby-core:99631] " tenderlove

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).