[ruby-core:110253] [Ruby master Feature#19024] Proposal: Import Modules

ruby-core@ruby-lang.org archive (unofficial mirror)

From: "shioyama (Chris Salzberg)" <noreply@ruby-lang.org>
To: ruby-core@neon.ruby-lang.org
Subject: [ruby-core:110253] [Ruby master Feature#19024] Proposal: Import Modules
Date: Tue, 11 Oct 2022 02:51:26 +0000 (UTC)
Message-ID: <redmine.journal-99542.20221011025126.13031@ruby-lang.org>
In-Reply-To: redmine.issue-19024.20220927005127.13031@ruby-lang.org

Issue #19024 has been updated by shioyama (Chris Salzberg).

@jeremyevans0

Thanks for your thoughtful response!

> For similar reasons, making require implicitly support the currently wrapping module would break idempotency and therefore I do not think it should be considered.

I agree, and from the beginning I have not advocated allowing passing extra parameters to `require`. It seems that everyone here agrees that changing `require` in almost any way that alters its basic premises is fundamentally a no-go.

Given that, wouldn't it make sense to close [#10320](https://bugs.ruby-lang.org/issues/10320), ideally with a note explaining why the proposal there is not feasible? Although similar in spirit to this issue, that one entirely centers on changing `require` in such a way that, as I read it, it is no longer exclusively idempotent.

I ask because leaving that issue open invites the interpretation (perhaps mistaken) that the proposal there is feasible given the right implementation, whereas as I see it from discussions here it seems almost entirely _infeasible_ under any circumstances.

> While I understand the goal of reducing namespace "boilerplate", I think it is important to understand that removing explicit namespaces is a tradeoff. If you do not leave the namespaces in the file, but instead let them be implicit, the code likely becomes more difficult to understand. You state that programmers would naturally prefer implicit namespaces over explicit namespaces, but I'm not sure that is true. Implicit code is not necessarily better than explicit code. What you consider "irrelevant" may be very relevant to someone who isn't familiar with the code and all of the implicit namespaces being dealt with.

I agree that there is a tradeoff, as @fxn earlier commented when he wrote that the idea breaks the fundamental assumption that "by looking at your source code, you know the nesting." 

But this is about much more than reducing boilerplate. It is about a fundamental shift in perspective, from one where everything is visible everywhere, to one where the "perspective" is itself something that can be created, nested and isolated.

You write "implicit code is not necessarily better than explicit code". I agree. Autoloading, for example, makes a similar tradeoff of implicit over explicit, and that tradeoff likewise has non-trivial downsides. Autoloading can also be seen as reducing boilerplate (all those `require`s we no longer need), but clearly it is about more than that.

Ruby has many sharp knives like this, and the way we handle those knives is by creating conventions around their usage. Much the way Zeitwerk (and Rails) provided file organization conventions around autoloading, any mechanism in Ruby that would allow code to be imported in the way I'm describing would also invite some kind of conventions around its usage to make it more useful.

I admit that this is very hand-wavy, and I need to provide a clearer demonstration of what those conventions might look like. This is something that is lacking from this proposal, and something I am thinking a lot about. I will come back to this point.

> You describe the current state of affairs as a "terrible tradeoff", but that seems hyperbolic to me. At most, having to use explicit namespaces should be mildly annoying, even if you have full understanding of the code and can deal with implicit namespaces.

I can see how you see that statement as hyperbolic, but I don't see it that way. Maybe I should have qualified the "terrible"-ness of the tradeoff as being relative to the size of the code space involved; in a codebase of a dozen files this is not terrible.

OTOH, "solving a mildly annoying problem" is not, to me, an appropriate characterization of what I am describing. Maybe that's because each time I present only one aspect of what I see as a bigger-picture change.

Imports as a concept tackles several hard problems at once, including:

- literal namespaces and code grouping, and the misalignment of incentives involved (as described)
- encapsulation/isolation (i.e. Packwerk, packages etc.)
- namespace collisions/conflicts (I want a semantically-meaningful `Platform` in my application but it conflicts with the `platform` gem)

The last one here, which I have barely touched on, is a problem we just live with as Rubyists, and to some degree I think simply internalize as "the way things work". But this is a real problem that deserves a proper solution.

> Note that you can currently support multiple namespaces using `eval`. 

It's interesting you brought up this example, because I have considered implementations for `import` using `eval` at least as a proof-of-concept, but it doesn't work for the very important case where I want to evaluate under an anonymous module namespace; in your example, you need to supply a dedicated named context (`Payments::Nested`) to load the code into. This may seem like a minor point but I don't believe it is.

Only with an anonymous-rooted namespace can we avoid polluting the parent load context's namespace, and avoid potential conflicts with other loaded constants in that same namespace.

i.e. I want this:

```ruby
module Payments
  foo_client = Module.new do
    foo_path = File.expand_path("api_clients/foo_client.rb", __dir__)
    eval File.read(foo_path), binding, foo_path
  end

  # do something with foo_client::FooClient
end
```

but this actually translates to this:

```ruby
module Payments
  foo_client = Module.new do
    class FooClient < MyClientGem::ApiClient
      # ...
    end
  end
end
```

This does not define `foo_client::FooClient`, but rather `::Payments::FooClient`, because any call to `module` or `class` in the evaluated file will resolve back to the closest _named_ context, in this case `Payments`.
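A minimal sketch to check this behavior, assuming an inline string in place of `File.read(foo_path)` and a hypothetical `Payments.foo_client` accessor added only so the anonymous module can be inspected afterwards:

```ruby
module Payments
  @foo_client = Module.new do
    # The string stands in for the contents of api_clients/foo_client.rb.
    eval "class FooClient; end", binding
  end

  # Hypothetical accessor, only here so the result can be inspected.
  def self.foo_client
    @foo_client
  end
end

# The constant is not defined on the anonymous module...
puts Payments.foo_client.const_defined?(:FooClient, false) # prints false
# ...it is defined on the enclosing named module instead.
puts Payments.const_defined?(:FooClient, false)            # prints true
```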

`load` with the wrap module is different because it resolves `module` and `class` definitions to the wrap module, _regardless of whether that module is anonymous_. As far as I can tell there is no other way (including "ugly" hacks like `eval` on file contents) in Ruby to do this. It is this (unintended?) "feature" that I am attempting to leverage here, to make it more powerful.
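For comparison, here is a rough sketch of the `load` behavior, assuming Ruby 3.1 or later (where the second argument to `load` may be a module). The helper name `load_into_anonymous_module` and the temporary file are illustrative only:

```ruby
require "tempfile"

# Hypothetical helper: writes +source+ to a temporary file, loads it with a
# fresh anonymous module as the wrapper, and returns that module.
def load_into_anonymous_module(source)
  wrapper = Module.new
  Tempfile.create(["foo_client", ".rb"]) do |file|
    file.write(source)
    file.flush
    load file.path, wrapper
  end
  wrapper
end

foo_client = load_into_anonymous_module("class FooClient; end\n")

# The class lands on the anonymous wrapper, not on the toplevel.
puts foo_client.const_defined?(:FooClient, false) # prints true on Ruby 3.1+
puts Object.const_defined?(:FooClient)            # prints false
```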

I love that this concept of an "unrooted" nested namespace (what I am loosely referring to as an "import" here) is actually something that already exists in Ruby. It does not need to be added, it simply needs to be tweaked to make it isolated from its parent.

> I think it would be helpful if, for each of the patches you are proposing, you include tests to make it easier to see what each patch allows and how the behavior changes. 

Thanks, I will do this. There are not many of these; there may in fact only be one or two.

Just to clarify though: I had originally intended to actually do just this. But after discussion at the Developers Meeting it was recommended that I lay out the bigger picture in a new issue separate from [#10320](https://bugs.ruby-lang.org/issues/10320), to motivate individual changes, so that is what I have done. And indeed, I now think that having the big picture is very important in understanding the individual changes, so I will link back here to contextualize and motivate each of them.

----------------------------------------
Feature #19024: Proposal: Import Modules
https://bugs.ruby-lang.org/issues/19024#change-99542

* Author: shioyama (Chris Salzberg)
* Status: Open
* Priority: Normal
----------------------------------------
There is no general way in Ruby to load code outside of the globally-shared namespace. This makes it hard to isolate components of an application from each other and from the application itself, leading to complicated relationships that can become intractable as applications grow in size.

The growing popularity of a gem like [Packwerk](https://github.com/shopify/packwerk), which provides a new concept of "package" to enforce boundaries statically in CI, is evidence that this is a real problem. But introducing a new packaging concept and CI step is at best only a partial solution, with downsides: it adds complexity and cognitive overhead that wouldn't be necessary if Ruby provided better packaging itself (as Matz has suggested [it should](https://youtu.be/Dp12a3KGNFw?t=2956)).

There is _one_ limited way in Ruby currently to load code without polluting the global namespace: `load` with the `wrap` parameter, which as of https://bugs.ruby-lang.org/issues/6210 can now be a module. However, this option does not apply transitively to `require` calls within the loaded file, so its usefulness is limited.
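The limitation can be illustrated with a small sketch, again assuming Ruby 3.1 or later. The file names, the constants `WrapDemoMain` and `WrapDemoDep`, and the helper `load_main_with_wrapper` are all made up for the example:

```ruby
require "tmpdir"

# Hypothetical helper: creates two files, where the main file requires a
# dependency, then loads the main file under an anonymous wrapper module.
def load_main_with_wrapper
  wrapper = Module.new
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "wrap_demo_dep.rb"), "module WrapDemoDep; end\n")
    File.write(File.join(dir, "wrap_demo_main.rb"), <<~RUBY)
      require_relative "wrap_demo_dep"
      module WrapDemoMain; end
    RUBY
    load File.join(dir, "wrap_demo_main.rb"), wrapper
  end
  wrapper
end

wrapper = load_main_with_wrapper

# The directly loaded file is wrapped...
puts wrapper.const_defined?(:WrapDemoMain, false) # prints true on Ruby 3.1+
# ...but the file it required is not: its constant ends up at the toplevel.
puts wrapper.const_defined?(:WrapDemoDep, false)  # prints false
puts Object.const_defined?(:WrapDemoDep)          # prints true
```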

My proposal here is to enable module imports by doing the following:

1. apply the `wrap` module namespace transitively to `require`s inside the loaded code, including native extensions (or provide a new flag or method that would do this),
2. make the `wrap` module the toplevel context for code loaded under it, so `::Foo` resolves to `<top_wrapper>::Foo` in loaded code (or, again, provide a new flag or method that would do this). _Also make this apply when code under the wrapper module is called outside of the load process (when `top_wrapper` is no longer set); this may be quite hard to do_.
3. resolve `name` on anonymous modules under the wrapped module to their names without the top wrapper module, so `<top_wrapper>::Foo.name` evaluates to `"Foo"`. There may be other ways to handle this problem, but a gem like Rails uses `name` to resolve filenames and fails when anonymous modules return something like `#<Module: ...>::ActiveRecord` instead of just `ActiveRecord`.
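The naming issue in point 3 can be seen with a short sketch, assuming Ruby 3.1 or later. The helper `name_under_anonymous_wrapper` and the constant `ActiveThing` are invented for illustration:

```ruby
require "tempfile"

# Hypothetical helper: defines a module named +const_name+ in a file loaded
# under an anonymous wrapper, and returns what Module#name reports for it.
def name_under_anonymous_wrapper(const_name)
  wrapper = Module.new
  Tempfile.create(["named", ".rb"]) do |file|
    file.write("module #{const_name}; end\n")
    file.flush
    load file.path, wrapper
  end
  wrapper.const_get(const_name, false).name
end

# The reported name carries the anonymous wrapper as a prefix, so it looks
# like "#<Module:0x...>::ActiveThing" rather than "ActiveThing".
puts name_under_anonymous_wrapper(:ActiveThing)
```

Code that derives file names or lookups from `name` would see that prefixed string, which is the failure mode described for Rails above.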

I have roughly implemented these three things in [this patch](https://github.com/ruby/ruby/compare/master...shioyama:ruby:import_modules). This implementation is incomplete (it does not cover the last highlighted part of 2) but provides enough of a basis to implement an `import` method, which I have done in a gem called [Im](https://github.com/shioyama/im).

Im provides an `import` method which can be used to import gem code under a namespace:

```ruby
require "im"
extend Im

active_model = import "active_model"
#=> <#Im::Import root: active_model>

ActiveModel
#=> NameError

active_model::ActiveModel
#=> ActiveModel

active_record = import "active_record"
#=> <#Im::Import root: active_record>

# Constants defined in the same file under different imports point to the same objects
active_record::ActiveModel == active_model::ActiveModel
#=> true
```

With the constants all loaded under an anonymous namespace, any code importing the gem can name constants however it likes:

```ruby
class Post < active_record::ActiveRecord::Base
end

AR = active_record::ActiveRecord

Post.superclass
#=> AR::Base
```

Note that this enables the importer to completely determine the naming for every constant it imports. So gems can opt to hide their dependencies by "anchoring" them inside their own namespace, like this:

```ruby
# in lib/my_gem.rb
module MyGem
  dep = import "my_gem_dependency"

  # my_gem_dependency is "anchored" under the MyGem namespace, so not exposed to users
  # of the gem unless they also require it.
  MyGemDependency = dep

  #...
end
```

There are a few important implementation decisions in the gem:

1. _Only load code once._ When the same file is imported again (either directly or transitively), "copy" constants from the previously imported namespace to the new namespace, using a registry which records which namespace (import) was used to load which file (as shown above with activerecord/activemodel). This is necessary to ensure that different imports can "see" shared files. A similar registry is used to track autoloads so that they work correctly when used from imported code.
2. Toplevel core types (`NilClass`, `TrueClass`, `FalseClass`, `String`, etc) are "aliased" to constants under each import module to make them available. Thus there can be side-effects of importing code, but this allows a gem like Rails to monkeypatch core classes, which it needs to do in order to work.
3. `Object.const_missing` is patched to check the caller location and resolve to the constant defined under an import, if there is an import defined for that file.

To be clear: **I think 1) should be implemented in Ruby, but not 2) and 3).** The last one (`Object.const_missing`) is a hack to support the case where a toplevel constant is referenced from a method called in imported code (at which point the `top_wrapper` is not active.)

I know this is a big proposal, and there are strong opinions held. I would really appreciate constructive feedback on this general idea.

See also similar discussion in: https://bugs.ruby-lang.org/issues/10320

-- 
https://bugs.ruby-lang.org/
