ruby-core@ruby-lang.org archive (unofficial mirror)
From: akr@fsij.org
To: ruby-core@ruby-lang.org
Subject: [ruby-core:92456] [Ruby trunk Feature#15797] Use realpath(3) instead of custom realpath implementation if available
Date: Sun, 28 Apr 2019 03:37:29 +0000 (UTC)
Message-ID: <redmine.journal-77806.20190428033728.f49dfc98ba84a737@ruby-lang.org>
In-Reply-To: redmine.issue-15797.20190426210040@ruby-lang.org

Issue #15797 has been updated by akr (Akira Tanaka).


PATH_MAX is dangerous.

Quote from http://man7.org/linux/man-pages/man3/realpath.3.html:

```
BUGS
       The POSIX.1-2001 standard version of this function is broken by
       design, since it is impossible to determine a suitable size for the
       output buffer, resolved_path.  According to POSIX.1-2001 a buffer of
       size PATH_MAX suffices, but PATH_MAX need not be a defined constant,
       and may have to be obtained using pathconf(3).  And asking
       pathconf(3) does not really help, since, on the one hand POSIX warns
       that the result of pathconf(3) may be huge and unsuitable for
       mallocing memory, and on the other hand pathconf(3) may return -1 to
       signify that PATH_MAX is not bounded.  The resolved_path == NULL
       feature, not standardized in POSIX.1-2001, but standardized in
       POSIX.1-2008, allows this design problem to be avoided.
```
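
The pathconf(3) point in the quote can be observed from Ruby as a rough sketch. It assumes the `etc` extension is available and uses `IO#pathconf`, which wraps fpathconf(3) and returns `nil` when the limit is indefinite. The choice of the current directory is arbitrary and only for illustration.

```ruby
require 'etc'

# Ask the OS for the PATH_MAX limit that applies to the current directory.
# IO#pathconf returns nil when the system reports no fixed limit.
limit = File.open('.') { |dir| dir.pathconf(Etc::PC_PATH_MAX) }

if limit.nil?
  puts 'PATH_MAX is unbounded here; no fixed-size buffer is guaranteed to fit'
else
  puts "PATH_MAX for this directory: #{limit} bytes"
end
```

Either outcome supports the quoted concern: the value is only known at run time, and it may not exist at all.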

----------------------------------------
Feature #15797: Use realpath(3) instead of custom realpath implementation if available
https://bugs.ruby-lang.org/issues/15797#change-77806

* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
One reason to do this is simplicity, as this approach is ~30 lines of
code instead of ~200.

Performance-wise, this performs 25%-115% better on the following
benchmark, run on OpenBSD 6.5:

```ruby
require 'benchmark'

f = File
pwd = Dir.pwd
Dir.mkdir('b') unless f.directory?('b')
f.write('b/a', '') unless f.file?('b/a')

args = [
  ["b/a", nil],
  ["#{pwd}/b/a", nil],
  ['a', 'b'],
  ["#{pwd}/b/a", 'b'],
  ["b/a", pwd]
]

args.each do |path, base|
  print "File.realpath(#{path.inspect}, #{base.inspect}): ".ljust(50)
  puts Benchmark.measure{100000.times{f.realpath(path, base)}}
end
```

Before:

```
File.realpath("b/a", nil):                          4.330000   2.990000   7.320000 (  7.316244)
File.realpath("/home/testr/ruby/b/a", nil):         3.560000   2.680000   6.240000 (  6.240951)
File.realpath("a", "b"):                            4.370000   3.080000   7.450000 (  7.452511)
File.realpath("/home/testr/ruby/b/a", "b"):         3.730000   2.640000   6.370000 (  6.371979)
File.realpath("b/a", "/home/testr/ruby"):           3.590000   2.630000   6.220000 (  6.226824)
```

After:

```
File.realpath("b/a", nil):                          1.370000   2.030000   3.400000 (  3.400775)
File.realpath("/home/testr/ruby/b/a", nil):         1.260000   2.770000   4.030000 (  4.024957)
File.realpath("a", "b"):                            2.090000   1.990000   4.080000 (  4.080284)
File.realpath("/home/testr/ruby/b/a", "b"):         1.400000   2.620000   4.020000 (  4.015505)
File.realpath("b/a", "/home/testr/ruby"):           2.150000   2.760000   4.910000 (  4.910634)
```
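
As a quick check of the quoted range, the improvement can be recomputed from the wall-clock column (the value in parentheses) of the two tables above. This is only a sketch: the formula `(before / after - 1) * 100` is an assumption about how "better" was measured, and the variable names are illustrative.

```ruby
# Wall-clock seconds copied from the Before/After tables above,
# in the same order as the benchmark cases.
before = [7.316244, 6.240951, 7.452511, 6.371979, 6.226824]
after  = [3.400775, 4.024957, 4.080284, 4.015505, 4.910634]

# Percentage improvement per case, rounded to the nearest integer.
gains = before.zip(after).map { |b, a| ((b / a - 1) * 100).round }

puts "improvement per case: #{gains.join('%, ')}%"
puts "range: #{gains.min}%-#{gains.max}%"
```

Under that assumption the result is roughly 27% to 115%, in line with the range stated above.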

If someone could benchmark before/after with this patch on Linux and/or macOS,
and post the results here, I would appreciate it.

My personal reason for wanting this is that the custom realpath
implementation does not work with OpenBSD's unveil(2) system call,
which limits access to the file system, allowing for security
similar to chroot(2), without most of the downsides.

This change passes all tests except for one assertion related to
taintedness.  Previously, if either argument to `File.realpath` was an
absolute path, the returned value was considered not tainted.
However, I believe that behavior to be incorrect: if there is a
symlink anywhere in the path, the returned value can contain a
section that was taken from the file system (an unreliable source)
and was never marked as untainted. Example:

```ruby
Dir.mkdir('b') unless File.directory?('b')
File.write('b/a', '') unless File.file?('b/a')
File.symlink('b', 'c') unless File.symlink?('c')
path = File.realpath('c/a'.untaint, Dir.pwd.untaint)
path # "/home/testr/ruby/b/a"
path.tainted? # should be true, as 'b' comes from file system
```

I believe it is safer to always mark the output of realpath as tainted
to prevent this issue, which is what this commit does.
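
The underlying point, that the result can contain a component supplied by the file system rather than by either argument, can be shown without the taint API (which was later removed from Ruby). The following is a minimal sketch, assuming a platform that supports symlinks; it works in a temporary directory and the name `component` is illustrative.

```ruby
require 'tmpdir'

component = Dir.mktmpdir do |dir|
  Dir.mkdir(File.join(dir, 'b'))
  File.write(File.join(dir, 'b', 'a'), '')
  File.symlink('b', File.join(dir, 'c'))   # c -> b (relative target)

  # Neither argument mentions 'b'; it is read from the symlink on disk.
  resolved = File.realpath('c/a', dir)
  File.basename(File.dirname(resolved))
end

puts component
```

The printed directory name comes from the symlink target, which is why the output cannot be trusted more than the file system itself.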

---Files--------------------------------
use-native-realpath.patch (6.31 KB)
use-native-realpath-v2.patch (4.64 KB)


-- 
https://bugs.ruby-lang.org/
