ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" <ruby-core@ml.ruby-lang.org>
To: ruby-core@ml.ruby-lang.org
Cc: "kjtsanaktsidis (KJ Tsanaktsidis)" <noreply@ruby-lang.org>
Subject: [ruby-core:116562] [Ruby master Bug#20169] `GC.compact` can raises `EFAULT` on IO
Date: Sat, 03 Feb 2024 01:37:09 +0000 (UTC)	[thread overview]
Message-ID: <redmine.journal-106579.20240203013709.17@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-20169.20240109103749.17@ruby-lang.org

Issue #20169 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).


So I think we can go with that fix, but I want to try and spell out what it means for "the rules" for extensions if we accept it.

AFAICT, by merging https://github.com/ruby/ruby/pull/9817, we are saying the following:

1. It is illegal to dereference or store into any pointer to the Ruby heap in a no-GVL context. This implies:
   a) It is illegal to directly access any field in a `struct RBasic`, `struct RString`, etc,
   b) Thus it implies that things like `RSTRING_PTR(str)` are 100% illegal in a no-GVL context (this shouldn't be controversial)
   c) This rule _does_ allow you to manipulate a C structure/array/etc you received a pointer to from a `struct RThing` (an "external pointer"), provided
       i) you _know_ that this structure is stored in the C heap and not the Ruby heap,
      ii) you obtained this pointer out of the `struct RThing` whilst holding the GVL.
     iii) and you obey the restrictions spelled out in Rule 2 below.
2. You can dereference and read from an "external pointer" in a no-GVL context, provided that
  a) the VALUE from which you obtained the pointer is guaranteed to be GC live AND GC pinned from the beginning of your no-GVL context through to the last access of the pointer.
      i) In practice, normally extension developers will adhere to this rule by inserting an `RB_GC_GUARD` macro after a call to `rb_thread_call_without_gvl` to instruct the compiler to keep the VALUE live on the thread/fiber's C stack and thus visible to Ruby's GC.
     ii) However, there are other ways to meet this requirement too (e.g. adding the object directly as a GC root with `rb_global_variable` or such).
    iii) The reason why it's important for the object to be GC pinned is that if the object is moved, the object's `dcompact` function is called, and this function is allowed to mutate the "external pointer".
  b) No Ruby thread can be using this object, either in C or in Ruby. If an object's "external pointer"s are being used in a no-GVL context, then the object as a whole cannot be used in a GVL context.
  c) the VALUE has not at any point in its lifetime been made visible to Ruby code, IF the value is not frozen.
      i) This obviously means it e.g. cannot have been yielded back to Ruby with `rb_yield`, or previously returned, etc.
     ii) It _also_ means that the object must have been hidden with a call to `rb_gc_hide` _immediately after_ being allocated by the GC. There cannot be any call to Ruby, nor call to the GC, in between where the program allocates the VALUE and hides it. In practice, APIs like the `rb_str_tmp_*` family will return pre-hidden objects which meet these requirements. This ensures that a mutable value being accessed from C was hidden from calls like `ObjectSpace.each_object`.
    iii) However, if an object is frozen, then this requirement does not apply - it _is_ legal to read an "external pointer" in a no-GVL context if the object it was obtained from is frozen, even if that object is visible to Ruby.
     iv) A corollary of iii is that an extension which exposes "external pointers" to C-heap allocated data owned by its T_DATA object cannot mutate this data if the underlying `T_DATA` is frozen. It's fairly uncommon for extensions to actually expose C-level APIs like this, but it is possible by e.g. exporting symbols with `__attribute__((visibility(default)))`.
3. You can write to an "external pointer" in a no-GVL context, provided that you adhere to rule 2, AND that the object is not frozen.
4. When reading from and writing to "external pointers", extension developers are responsible for appropriately synchronizing their reads and writes to this pointer (in practice, most usage of such "external pointers" will be restricted to a single thread, however).


These rules are pretty complicated, but I think this is what's implied by the patch. In particular, i think 2.c.iii is implied by

```
if (OBJ_FROZEN_RAW(orig) && !STR_EMBED_P(orig) && !rb_str_reembeddable_p(orig)) return orig;
```

but was kind of surprising to me. Possibly, although these are the rules that seem to apply _within_ cruby, we want to document a more restrictive set of rules for extension developers which are easier to reason about (perhaps even as far as "don't store results of no-GVL routines into Ruby strings at all in your extension, you must bounce everything through native memory and copy it back into Ruby with the GVL").  

Whatever these rules are, I think we need a clearer description of them in extension.rdoc. If they're different from the internal rules, let's document the "real" messy internal rules in the source tree as well. I'm happy to open a PR to do this once we get some agreement on the broad outlines.

As an aside, I'm researching whether there might be ways to automatically enforce some of this in CI (both for CRuby and for extension developers) by leveraging LLVM's sanitizer infrastructure somehow - I think it would be really helpful to have an executable spec of what the rules are like this!
 

----------------------------------------
Bug #20169: `GC.compact` can raises `EFAULT` on IO
https://bugs.ruby-lang.org/issues/20169#change-106579

* Author: ko1 (Koichi Sasada)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
1. `GC.compact` introduces read barriers to detect read accesses to the pages.
2. I/O operations release GVL to pass the control while their execution, and another thread can call `GC.compact` (or auto compact feature I guess, but not checked yet).
3. Call `write(ptr)` can return `EFAULT` when `GC.compact` is running because `ptr` can point read-barrier protected pages (embed strings).

Reproducible steps:


Apply the following patch to increase possibility:

```patch
diff --git a/io.c b/io.c
index f6cd2c1a56..83d67ba2dc 100644
--- a/io.c
+++ b/io.c
@@ -1212,8 +1212,12 @@ internal_write_func(void *ptr)
         }
     }

+    int cnt = 0;
   retry:
-    do_write_retry(write(iis->fd, iis->buf, iis->capa));
+    for (; cnt < 1000; cnt++) {
+        do_write_retry(write(iis->fd, iis->buf, iis->capa));
+        if (result <= 0) break;
+    }

     if (result < 0 && !iis->nonblock) {
         int e = errno;
```

Run the following code:

```ruby
t1 = Thread.new{ 10_000.times.map{"#{_1}"}; GC.compact while true }
t2 = Thread.new{
  i=0
  $stdout.write "<#{i+=1}>" while true
}
t2.join
```

and 

```
$ make run
(snip)
4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4>#<Thread:0x00007fa61b4dd758 ../../src/trunk/test.rb:3 run> terminated with exception (report_on_exception is true):
../../src/trunk/test.rb:5:in `write': Bad address @ io_write - <STDOUT> (Errno::EFAULT)
        from ../../src/trunk/test.rb:5:in `block in <main>'
../../src/trunk/test.rb:5:in `write': Bad address @ io_write - <STDOUT> (Errno::EFAULT)
        from ../../src/trunk/test.rb:5:in `block in <main>'
make: *** [uncommon.mk:1383: run] Error 1
```

I think this is why we get `EFAULT` on CI. To increase possibilities running many busy processes (`ruby -e 'loop{}'` for example) will help (and on CI environment there are such busy processes accidentally).



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

  parent reply	other threads:[~2024-02-03  1:37 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-09 10:37 [ruby-core:116114] [Ruby master Bug#20169] `GC.compact` can raises `EFAULT` on IO ko1 (Koichi Sasada) via ruby-core
2024-01-09 11:05 ` [ruby-core:116115] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-01-10 23:39 ` [ruby-core:116164] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-01-11  3:28 ` [ruby-core:116167] " luke-gru (Luke Gruber) via ruby-core
2024-01-13  2:58 ` [ruby-core:116188] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-01-15  5:25 ` [ruby-core:116212] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-01-16  9:34 ` [ruby-core:116224] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-02-02 17:03 ` [ruby-core:116560] " peterzhu2118 (Peter Zhu) via ruby-core
2024-02-03  1:37 ` kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core [this message]
2024-02-04 22:14 ` [ruby-core:116577] " peterzhu2118 (Peter Zhu) via ruby-core
2024-02-04 23:41 ` [ruby-core:116578] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-05-28 21:24 ` [ruby-core:118049] " k0kubun (Takashi Kokubun) via ruby-core

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-106579.20240203013709.17@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    --cc=noreply@ruby-lang.org \
    --cc=ruby-core@ml.ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).