ruby-core@ruby-lang.org archive (unofficial mirror)
From: "yui-knk (Kaneko Yuichiro)" <noreply@ruby-lang.org>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:109977] [Ruby master Feature#19013] Error Tolerant Parser
Date: Wed, 21 Sep 2022 12:28:28 +0000 (UTC)
Message-ID: <redmine.issue-19013.20220921122828.7872@ruby-lang.org>
In-Reply-To: redmine.issue-19013.20220921122828.7872@ruby-lang.org

Issue #19013 has been reported by yui-knk (Kaneko Yuichiro).

----------------------------------------
Feature #19013: Error Tolerant Parser
https://bugs.ruby-lang.org/issues/19013

* Author: yui-knk (Kaneko Yuichiro)
* Status: Open
* Priority: Normal
----------------------------------------
# Background

Implementations of the Language Server Protocol (LSP) sometimes need to parse an incomplete Ruby script. For example, a user may want to run completion in the middle of a statement, as below:

```ruby
class A
  def m
    a = 10
    if # here users want to run completion
  end
end
```

In such a case, an LSP implementation wants to get a partial AST instead of a syntax error.
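
For reference, the current behavior can be reproduced with a small sketch like the following. It assumes an MRI build that provides `RubyVM::AbstractSyntaxTree`; the variable names are only for illustration.

```ruby
# The incomplete script from the example above.
source = <<~RUBY
  class A
    def m
      a = 10
      if # here users want to run completion
    end
  end
RUBY

# Without any tolerance, parsing does not return a partial AST;
# it raises SyntaxError, which we capture here to inspect it.
result =
  begin
    RubyVM::AbstractSyntaxTree.parse(source)
  rescue SyntaxError => e
    e
  end

puts result.class
```

An LSP implementation gets no tree at all from this call, which is the gap the proposal is meant to address.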

# Proposal

At the moment I want to propose three types of tolerance.

## 1. Supply the missing `end` when the lexer hits end-of-input but there are not enough `end`s

Here is an example. The lexer will generate one `end` before it generates end-of-input.

```ruby
describe "1" do
  describe "2" do
    describe "3" do
      it "here" do
    end
  end
end
```
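
As a rough illustration from outside the parser (this is not the lexer-level mechanism proposed here), the effect can be approximated by appending `end` until the source parses. `complement_ends` and its `max` parameter are hypothetical names introduced only for this sketch; `Ripper.sexp` returns `nil` when the source has a syntax error.

```ruby
require "ripper"

# Hypothetical helper: append "end" lines until the source parses,
# giving up after `max` additions. Returns [fixed_source, ends_added],
# or nil if the source still does not parse.
def complement_ends(source, max: 10)
  candidate = source.dup
  (0..max).each do |added|
    return [candidate, added] if Ripper.sexp(candidate)
    candidate << "\nend"
  end
  nil
end

source = <<~RUBY
  describe "1" do
    describe "2" do
      describe "3" do
        it "here" do
      end
    end
  end
RUBY

fixed, added = complement_ends(source)
puts "appended #{added} end(s)"
```

Retrying the whole parse is much cruder than generating the token inside the lexer, but it shows the intended result: the four `do` blocks are balanced by a single extra `end`.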

## 2. Treat "end" as a keyword, not an identifier, based on its indentation

Here is an example. The normal parser recognizes the "end" on line 4 as a "local variable or method", because it follows `foo.`.
This causes not only a syntax error but also the `bar` method definition to be treated as `Z::Foo#bar`.
Another approach is to suppress the `!IS_lex_state(EXPR_DOT)` check for "end".

```ruby
module Z
  class Foo
    foo.
  end

  def bar
  end
end
```
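
The lexer behavior behind this can be observed with `Ripper.lex`, and the indentation idea can be sketched as below. `keyword_by_indent?` and the list of open-block columns are hypothetical and only illustrate the idea; they are not taken from the implementation.

```ruby
require "ripper"

# Lex lines 3-4 of the example on their own. "foo.\n  end" is valid Ruby:
# it is read as the method call `foo.end`.
tokens = Ripper.lex("foo.\n  end\n")
end_token = tokens.find { |tok| tok[2] == "end" }
end_type = end_token[1]       # token type reported by Ripper
end_column = end_token[0][1]  # column where "end" starts

# Hypothetical heuristic: treat "end" as a keyword when its column matches
# the column of a currently open block (`module Z` at 0, `class Foo` at 2).
def keyword_by_indent?(open_block_columns, column)
  open_block_columns.include?(column)
end

puts "lexed as #{end_type}, column #{end_column}"
puts keyword_by_indent?([0, 2], end_column)
```

In the example, "end" sits at the same column as `class Foo`, so a check of this kind would read it as the keyword that closes the class.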

## 3. Change locations of `error`

Currently the `error` token appears in the `top_stmts` and `stmts` rules, as `top_stmts: error top_stmt` and `stmts: error stmt`.
However, these rules are too strict to catch syntax errors, so I want to move `error` to `stmt: error` and `expr_value: error`.

# Interface

* Add an `error_tolerant` option to `RubyVM::AbstractSyntaxTree.parse`
* Add an `--error-tolerant-parser` option to the ruby command for debugging
  * This option is valid only when `--dump=yydebug`, `--dump=parsetree` or `--dump=parsetree_with_comment` is passed
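
A call using the proposed option might look like the sketch below. It assumes a Ruby build in which `RubyVM::AbstractSyntaxTree.parse` accepts the `error_tolerant` keyword; on builds without it the call raises, and the sketch falls back to `nil`.

```ruby
# Incomplete script: the `if` has no condition and one `end` is missing.
source = "class A\n  def m\n    a = 10\n    if\n  end\nend\n"

ast =
  begin
    # `error_tolerant: true` is the option proposed in this issue.
    RubyVM::AbstractSyntaxTree.parse(source, error_tolerant: true)
  rescue ArgumentError, SyntaxError, NameError
    # The option (or RubyVM itself) is unavailable, or parsing still failed.
    nil
  end

puts ast ? ast.type : "no AST available"
```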

# Compatibility

Changing the location of `error` can lead to incompatibility. I observed that at least two test cases in ruby/ruby are broken by this change.
I think both of them depend on how ripper behaves after it raises a syntax error.

* RDoc: https://github.com/yui-knk/ruby/commit/1dabbe508f0cc3dd4f83aa72502bbf347029dd8c
  * However, the Ruby script in the heredoc is invalid...
* irb: https://github.com/yui-knk/ruby/commit/e18be19ecd044eb26a56f6f9ba4f19d40c01a9c7
  * The range of error coloring is changed

All other changes affect the lexer rather than the parser, and they are controlled by the `error_tolerant` option. Therefore no behavior change is expected for the Ruby parser and ripper.

# Implementation

https://github.com/yui-knk/ruby/tree/error_recovery_indent_aware




-- 
https://bugs.ruby-lang.org/
