From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on starla X-Spam-Level: X-Spam-Status: No, score=-1.1 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,SPF_HELO_PASS,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 Received: from nue.mailmanlists.eu (nue.mailmanlists.eu [IPv6:2a01:4f8:1c0c:6b10::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id 02FA31F451 for ; Mon, 15 Jan 2024 05:25:41 +0000 (UTC) Authentication-Results: dcvr.yhbt.net; dkim=pass (1024-bit key; secure) header.d=ml.ruby-lang.org header.i=@ml.ruby-lang.org header.a=rsa-sha256 header.s=mail header.b=gXhIQ5e5; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=ruby-lang.org header.i=@ruby-lang.org header.a=rsa-sha256 header.s=s1 header.b=Yqrs+vUj; dkim-atps=neutral Received: from nue.mailmanlists.eu (localhost [127.0.0.1]) by nue.mailmanlists.eu (Postfix) with ESMTP id 08C3281D38; Mon, 15 Jan 2024 05:25:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ml.ruby-lang.org; s=mail; t=1705296333; bh=pW/jVwzBdm+J0BkYkh9JjZ8JZw4GrA1VOrQQTEFOLV8=; h=Date:References:To:Reply-To:Subject:List-Id:List-Archive: List-Help:List-Owner:List-Post:List-Subscribe:List-Unsubscribe: From:Cc:From; b=gXhIQ5e51D5uUJUr4uBOXHPvtwTPRsUA11+mxo0JmW23RKJViLHDbd5y6+DfofzaL iBCogOKAWVzDinLWeD3MjZp2eeOZreFssFtXUEfTBNNMagjetZE6NMC7Gp45e2lXz4 CBkOxYr5MqD8P29RyPKFvmwd+BrRK8/Q84gKg/TU= Received: from wrqvtvtt.outbound-mail.sendgrid.net (wrqvtvtt.outbound-mail.sendgrid.net [149.72.120.119]) by nue.mailmanlists.eu (Postfix) with ESMTPS id 928EB81C76 for ; Mon, 15 Jan 2024 05:25:29 +0000 (UTC) Authentication-Results: nue.mailmanlists.eu; dkim=pass (2048-bit key; unprotected) header.d=ruby-lang.org header.i=@ruby-lang.org header.a=rsa-sha256 header.s=s1 header.b=Yqrs+vUj; dkim-atps=neutral DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ruby-lang.org; h=from:references:subject:mime-version:content-type: content-transfer-encoding:list-id:to:cc:content-type:from:subject:to; s=s1; bh=etnIIKMcDQbsDSis0PPwRcbcNBYlVxkgwqqdvaIL1GQ=; b=Yqrs+vUj1D/hWheJcuOVIgbtuyCRGzT+tYCojM3AXN1XakmSiWIKcYcTWzXjd/deQ4OO zKNIYnlxv0DCHWHN3XQDheqXgs9n8AfqovH5j8xx81tUOWquEJY2IYYeK17Akir6t37y7R CvzV95rdZcZ/AmASLwfJ5oZETsHu/zg5alZovJr+g4Ulh+x+mzkbrmmZvwxPKHTZOHA6Ur vPgp8cAR3PtXY6CF5VTG8DP0Q6vmEhUHFRMi/oiWQLvdHLXqrF20Fjw0YJ7v1RPHEzNmWf RAoBugW3TQPwMz0GVWj3mPGfotN8nM8dFp2eiw+cM6x885TTlnABkcCEzmMdGpCQ== Received: by filterdrecv-d585b8d85-tktxc with SMTP id filterdrecv-d585b8d85-tktxc-1-65A4C1C7-C 2024-01-15 05:25:27.587315066 +0000 UTC m=+317281.815385632 Received: from herokuapp.com (unknown) by geopod-ismtpd-13 (SG) with ESMTP id B5zqyc4JQReAE352qGrGiQ for ; Mon, 15 Jan 2024 05:25:27.446 +0000 (UTC) Date: Mon, 15 Jan 2024 05:25:27 +0000 (UTC) Message-ID: References: Mime-Version: 1.0 X-Redmine-Project: ruby-master X-Redmine-Issue-Tracker: Bug X-Redmine-Issue-Id: 20169 X-Redmine-Issue-Author: ko1 X-Redmine-Sender: kjtsanaktsidis X-Mailer: Redmine X-Redmine-Host: bugs.ruby-lang.org X-Redmine-Site: Ruby Issue Tracking System X-Auto-Response-Suppress: All Auto-Submitted: auto-generated X-Redmine-MailingListIntegration-Message-Ids: 92774 X-SG-EID: =?us-ascii?Q?bpuAnwiv5wR3AMKfH76hNK7PI5khDDZ1kji5+D8yFrgaUvTZzAkIlQXnVriLYY?= =?us-ascii?Q?wZntoXP0DUH6R9LNgqzTU1esCZ4KY8M9zX8WDhO?= =?us-ascii?Q?VCp0C7ccEQtKaDyh4qpr3TqBeKFEC+=2F5W34sqeN?= =?us-ascii?Q?3lPBJG59u3XiN7mRqXQDVigh2P8oZskKNfPg+ma?= =?us-ascii?Q?UEETGqzplinj8pN9pyY5VW3Xx6hfSg3gMDVNbrE?= =?us-ascii?Q?1hJPSEw1mtJiUPOtbz4qVh40zsz9DLiNQt4YWXN?= =?us-ascii?Q?nDo+2Zj5TN5ngfHRyR=2Fu9Ml6Gn9cf8YKLdY4ReB?= =?us-ascii?Q?E9Y=3D?= To: ruby-core@ml.ruby-lang.org X-Entity-ID: b/2+PoftWZ6GuOu3b0IycA== Message-ID-Hash: PG76KT5KG4Z6ZMXXBEXQHYZYJBJHDCGF X-Message-ID-Hash: PG76KT5KG4Z6ZMXXBEXQHYZYJBJHDCGF X-MailFrom: bounces+313651-b711-ruby-core=ml.ruby-lang.org@em5188.ruby-lang.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.3 Precedence: list Reply-To: Ruby developers Subject: [ruby-core:116212] [Ruby master Bug#20169] `GC.compact` can raises `EFAULT` on IO List-Id: Ruby developers Archived-At: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: From: "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" Cc: "kjtsanaktsidis (KJ Tsanaktsidis)" Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Issue #20169 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). I spent some more time today studying this problem. This is what I've learned so far. I'm definitely far from an expert in the mechanics of how GC's work, so please jump in and correct me if you are! 1. Obviously, as @luke-gru suggested, we can fix this particular manifestation of the problem by passing a stack buffer to the system calls in `io.c` and copying that to/from the Ruby heap in userspace. However, the really concerning part of this bug is what it means for the C extension API. 2. From my research, it seems that read barriers (either `SIGSEGV`/`userfaultfd(2)` based barriers like Ruby's, or explicit barriers around every pointer read emitted by a JIT compiler) are mostly used to support _incremental_ compaction. The relationship between read barriers and incremental compaction is somewhat analogous to the relationship between write barriers (like Ruby's `RB_OBJ_WRITE` macro) and incremental marking. Ruby does not actually do incremental compaction at the moment (although presumably we would like to add that someday). 3. The reason _why_ we need read barriers for incrmental compaction is so that user code can see a consistent view of the world even if the heap is half-compacted. I think there are two specific things read barriers need to do in incremental compaction. Imagine a read barrier macro as `VALUE RB_OBJ_READ(VALUE owning_object, VALUE *object_to_read)`. It needs to, firstly, detect if `object_to_read` is `T_MOVED`, and update its reference in `owning_object` if so, to point to its new location. But, secondly, it also needs to set the pin bit on `*object_to_read`, so that it can't be compacted elsewhere in a subsequent step of GC compaction - once `RB_OBJ_READ` has returned, the value of `object_to_read` is on the machine stack, so it needs to be pinned just as if it was live on the machine stack during the marking phase. 4. Although Ruby doesn't have incremental GC compaction (yet), it _does_ have a mechanism for user code to run in between GC compaction steps. If a page is swept during compaction because of a call to `gc_sweep_page` [here](https://github.com/ruby/ruby/blob/9c3299896e7ea0aa1d9cc0d4786d7be2908ca9be/gc.c#L8644), it might free some objects on that page. If a `T_DATA` is freed, and it has the `RUBY_TYPED_FREE_IMMEDIATELY` flag, then we will call its free function right there. At this point, we have not yet called the `dcompact` function on the object to fix up its references to other objects; so, it might try and read VALUEs that are now `T_MOVED`. We need read barriers in place to detect this, just as much as if we had incremental marking. 5. Rather than explicit read barriers based on `RB_OBJ_READ`, which react to already-happened moves and prevent future moves, Ruby is using `mprotect(2)` wizadry to trap access to pages with `T_MOVED` objects and invalidate moves after-the-fact. I actually can't find a discussion comparing the two anywhere, but presumably the main reason for wanting to do this automagically with page protection is that it doesn't require any changes to how C extensions work. 6. The specific problem in this issue in `io.c` involves access to a pointer to the Ruby heap from inside a thread which doesn't hold the GVL. However, I believe it would be entirely possible to construct a similar problem without any threading at all, because of point 4. Imagine you had a `T_DATA`, which held a reference to an integer file descriptor, a string VALUE, and had the `RUBY_TYPED_FREE_IMMEDIATELY` flag. In its `dfree` function, imagine it called `write(self->fd, RSTRING_PTR(self->str), RSTRING_LEN(self->str))`. As far as I can tell, this is perfectly legal according to our C API rules today; the thread holds the GVL, so use of the `RSTRING_` macros is OK, and it does not allocate anything, which makes it a legal `dfree` function. However, if `str` had been moved, in a prior compaction step, then the page containing `self->str` will be locked, and so the `write` call will return `EFAULT` instead of triggering the read barrier. 7. I think it's currently illegal to access pointers to objects embedded in the Ruby heap without the GVL, and the accesses in `io.c` need to be changed to either un-embed the string or copy it through the C stack. There is no possible way we can make this safe in the presence of compaction without fundamentally rearchitecting the GC to be concurrent. However, we still need to do something about the perfectly legal non-concurrent code in point 6. 8. So I think we also need to do one or more of the following: a) use an API like userfaultfd, when it's available, which allows kernel-mode accesses to be trapped just like user-mode ones. However, this is obviously only viable for Linux (and not even all the time then - the feature must be enabled in `/proc/sys/vm/unprivileged_userfaultfd`). b) overwrite all of the common libc symbols used to perform syscalls with userspace buffers, and provide wrapped versions that detect EFAULT return values, invoke the page-moved logic, and retry. c) declare that it is illegal to access Ruby heap memory in kernel mode. This is kind of impossible to detect, however, without also doing something like b), so it'll just lead to incredibly rare failures like the one which spawned this issue throughout the ecosystem. d) Implement explicit read barriers with a `RB_OBJ_READ` macro and a `RB_TYPED_RB_PROT` flag analogous to how we implement write barriers, and delete our SIGSEGV based automatic read barriers. This would be an incredibly disruptive ecosystem change (I think we'd probably need to disable compaction for any object marked by an object that didn't use read barriers). But on the plus side, it makes explicit what the requirements are for C extensions, allows the barriers to work at VALUE granularity rather than page granularity, and perhaps brings us closer to incremental compaction. ---------------------------------------- Bug #20169: `GC.compact` can raises `EFAULT` on IO https://bugs.ruby-lang.org/issues/20169#change-106232 * Author: ko1 (Koichi Sasada) * Status: Open * Priority: Normal * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN ---------------------------------------- 1. `GC.compact` introduces read barriers to detect read accesses to the pages. 2. I/O operations release GVL to pass the control while their execution, and another thread can call `GC.compact` (or auto compact feature I guess, but not checked yet). 3. Call `write(ptr)` can return `EFAULT` when `GC.compact` is running because `ptr` can point read-barrier protected pages (embed strings). Reproducible steps: Apply the following patch to increase possibility: ```patch diff --git a/io.c b/io.c index f6cd2c1a56..83d67ba2dc 100644 --- a/io.c +++ b/io.c @@ -1212,8 +1212,12 @@ internal_write_func(void *ptr) } } + int cnt = 0; retry: - do_write_retry(write(iis->fd, iis->buf, iis->capa)); + for (; cnt < 1000; cnt++) { + do_write_retry(write(iis->fd, iis->buf, iis->capa)); + if (result <= 0) break; + } if (result < 0 && !iis->nonblock) { int e = errno; ``` Run the following code: ```ruby t1 = Thread.new{ 10_000.times.map{"#{_1}"}; GC.compact while true } t2 = Thread.new{ i=0 $stdout.write "<#{i+=1}>" while true } t2.join ``` and ``` $ make run (snip) 4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4># terminated with exception (report_on_exception is true): ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in
' ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in
' make: *** [uncommon.mk:1383: run] Error 1 ``` I think this is why we get `EFAULT` on CI. To increase possibilities running many busy processes (`ruby -e 'loop{}'` for example) will help (and on CI environment there are such busy processes accidentally). -- https://bugs.ruby-lang.org/ ______________________________________________ ruby-core mailing list -- ruby-core@ml.ruby-lang.org To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/