ruby-core@ruby-lang.org archive (unofficial mirror)
From: eregontp@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:92696] [Ruby trunk Feature#14844] Future of RubyVM::AST?
Date: Fri, 17 May 2019 12:56:40 +0000 (UTC)
Message-ID: <redmine.journal-78054.20190517125640.f32af9fffb82bd14@ruby-lang.org>
In-Reply-To: redmine.issue-14844.20180612141613@ruby-lang.org

Issue #14844 has been updated by Eregon (Benoit Daloze).


@mame Thank you for the reply.

Could you or @yui-knk propose a description to include in the documentation, summarizing what was said?

Could you also give your opinion on accessing Node members by name (https://bugs.ruby-lang.org/issues/14844#note-13)?

> Ripper does not reproduce the details including parser-level optimization.

What kind of details? Could you give an example?
Things like OPCALL instead of CALL? Is that useful for any tool?

I tried a simple expression to compare Ripper and RubyVM::AbstractSyntaxTree:

```ruby
pry(main)> Ripper.sexp("def m(a) a * 2 end")
=> [:program,
 [[:def,
   [:@ident, "m", [1, 4]],
   [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil, nil, nil, nil]],
   [:bodystmt, [[:binary, [:var_ref, [:@ident, "a", [1, 9]]], :*, [:@int, "2", [1, 13]]]], nil, nil, nil]]]]

pry(main)> RubyVM::AbstractSyntaxTree.parse("def m(a) a * 2 end")
=> (SCOPE@1:0-1:18
 tbl: []
 args: nil
 body:
   (DEFN@1:0-1:18
    mid: :m
    body:
      (SCOPE@1:0-1:18
       tbl: [:a]
       args:
         (ARGS@1:6-1:7
          pre_num: 1
          pre_init: nil
          opt: nil
          first_post: nil
          post_num: 0
          post_init: nil
          rest: nil
          kw: nil
          kwrest: nil
          block: nil)
       body: (OPCALL@1:9-1:14 (LVAR@1:9-1:10 :a) :* (ARRAY@1:13-1:14 (LIT@1:13-1:14 2) nil)))))
```

Indeed, the RubyVM::AbstractSyntaxTree version seems easier to read (and access once we have `RubyVM::AST::Node#[:field_name]`).
I think one of the main gains is that node fields are named, whereas in `Ripper.sexp` they are just positions in a flat Array.
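
To illustrate the difference, here is a minimal sketch using only Ripper from the standard library. The index paths are read off the `Ripper.sexp` output above; the `fields` table and the `field` lambda are hypothetical names made up for this illustration, not an existing or proposed API.

```ruby
require 'ripper'

sexp = Ripper.sexp("def m(a) a * 2 end")

# Positional access: the reader has to know which index holds what.
defn   = sexp[1][0]      # [:def, name, params, bodystmt]
name   = defn[1][1]      # the method name as a String
binary = defn[3][1][0]   # [:binary, left, operator, right]

# Hypothetical table giving a name to each child position of a node type.
fields = {
  def:    [:name, :params, :body],
  binary: [:left, :operator, :right],
}

# Look up a child by field name instead of by index.
field = lambda do |node, field_name|
  index = fields.fetch(node[0]).index(field_name)
  raise KeyError, "unknown field #{field_name}" if index.nil?
  node[index + 1] # +1 skips the node type stored at index 0
end

field.call(defn, :name)[1]      # same value as defn[1][1]
field.call(binary, :operator)   # same value as binary[2]
```

With positional access every consumer has to carry the layout of each node type in their head; a name table like the one above is roughly what named fields would give for free.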

OTOH, things are far from perfectly clear (so I think "experimental/not for serious use" seems appropriate currently).
For instance, one has to manually associate arguments, which are given only as a count such as `pre_num`, with their names in `tbl`.
Optional arguments seem exposed more clearly, by having a node under `ARGSnode[:opt]`; however, the OPT_ARG nodes are nested like a cons-list instead of being an Array, which would be more intuitive.
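
As a rough sketch of the bookkeeping this requires, the snippet below models the relevant parts of an ARGS node with plain Structs rather than calling RubyVM::AbstractSyntaxTree, since that API is experimental and its node layout may differ between versions. The `opt_arg` and `lasgn` structs are hypothetical stand-ins, and the values are copied from the `def m(b,a,c=3,d=4)` output shown further down.

```ruby
# Hypothetical stand-ins: each OPT_ARG holds one assignment plus a link to
# the next OPT_ARG (or nil), mirroring the nesting in the printed tree.
opt_arg = Struct.new(:body, :next_arg)
lasgn   = Struct.new(:name, :value)

# Values copied from the ARGS node of "def m(b,a,c=3,d=4) a * 2 end".
tbl     = [:b, :a, :c, :d]
pre_num = 2
opt     = opt_arg.new(lasgn.new(:c, 3), opt_arg.new(lasgn.new(:d, 4), nil))

# Required parameter names are the first pre_num entries of tbl.
required = tbl.first(pre_num)

# Walk the cons-list and collect it into an Array of [name, default] pairs.
optional = []
node = opt
while node
  optional << [node.body.name, node.body.value]
  node = node.next_arg
end

required  # [:b, :a]
optional  # [[:c, 3], [:d, 4]]
```

Both steps are small, but every tool consuming the tree would have to repeat them, which is the kind of friction an Array of argument nodes would avoid.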

So if we compare a slightly more complex example with the `parser` gem, we see there are lots of opportunities to make RubyVM::AST easier to access/process/read/understand:

```ruby
pry(main)> require 'parser/current'

pry(main)> RubyVM::AbstractSyntaxTree.parse("def m(b,a,c=3,d=4) a * 2 end")
=> (SCOPE@1:0-1:28
 tbl: []
 args: nil
 body:
   (DEFN@1:0-1:28
    mid: :m
    body:
      (SCOPE@1:0-1:28
       tbl: [:b, :a, :c, :d]
       args:
         (ARGS@1:6-1:17
          pre_num: 2
          pre_init: nil
          opt: (OPT_ARG@1:10-1:17 (LASGN@1:10-1:13 :c (LIT@1:12-1:13 3)) (OPT_ARG@1:14-1:17 (LASGN@1:14-1:17 :d (LIT@1:16-1:17 4)) nil))
          first_post: nil
          post_num: 0
          post_init: nil
          rest: nil
          kw: nil
          kwrest: nil
          block: nil)
       body: (OPCALL@1:19-1:24 (LVAR@1:19-1:20 :a) :* (ARRAY@1:23-1:24 (LIT@1:23-1:24 2) nil)))))

pry(main)> Parser::CurrentRuby.parse("def m(b,a,c=3,d=4) a * 2 end")
=> s(:def, :m,
  s(:args,
    s(:arg, :b),
    s(:arg, :a),
    s(:optarg, :c,
      s(:int, 3)),
    s(:optarg, :d,
      s(:int, 4))),
  s(:send,
    s(:lvar, :a), :*,
    s(:int, 2)))
```

I think it would be good to take inspiration from `parser` here, which makes it really convenient to access the AST and still does not seem to lose any important information.

In fact, in what cases would the additional things in RubyVM::AST, such as the SCOPE nodes, be useful beyond debugging the MRI parser?
Would any tool be able to do anything with those that it could not without?

I understand that exposing the internal AST directly is the simplest implementation-wise.
But I think it's quite sub-optimal to access, process and understand.
Would it be better to expose an AST more similar to, or even exactly the same as, that of the `parser` gem?

----------------------------------------
Feature #14844: Future of RubyVM::AST? 
https://bugs.ruby-lang.org/issues/14844#change-78054

* Author: rmosolgo (Robert Mosolgo)
* Status: Open
* Priority: Normal
* Assignee: yui-knk (Kaneko Yuichiro)
* Target version: 
----------------------------------------
Hi! Thanks for all your great work on the Ruby language. 

I saw the new RubyVM::AST module in 2.6.0-preview2 and I quickly went to try it out. 

I'd love to have a well-documented, user-friendly way to parse and manipulate Ruby code using the Ruby standard library, so I'm pretty excited to try it out. (I've been trying to learn Ripper recently, too: https://ripper-preview.herokuapp.com/, https://rmosolgo.github.io/ripper_events/ .)

Based on my exploration, I opened a small PR on GitHub with some documentation: https://github.com/ruby/ruby/pull/1888

I'm curious though, are there future plans for this module? For example, we might: 

- Add more details about each node (for example, we could expose the names of identifiers and operators through the node classes)
- Document each node type 

I see there is a lot more information in the C structures that we could expose, and I'm interested to help out if it's valuable. What do you think? 



-- 
https://bugs.ruby-lang.org/
