ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:83772] [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
       [not found] <redmine.issue-14108.20171115032856@ruby-lang.org>
@ 2017-11-15  3:28 ` Greg.mpls
  2017-11-15  3:47 ` [ruby-core:83773] " usa
  1 sibling, 0 replies; 10+ messages in thread
From: Greg.mpls @ 2017-11-15  3:28 UTC (permalink / raw
  To: ruby-core

Issue #14108 has been reported by MSP-Greg (Greg L).

----------------------------------------
Bug #14108: Seg Fault with MinGW on svn 60769
https://bugs.ruby-lang.org/issues/14108

* Author: MSP-Greg (Greg L)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0dev (2017-11-15 trunk 60767) [x64-mingw32]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
The Noon JST ruby-loco MinGW build just failed.  Last build on svn 60767 was successful.

Error shown in log:

```
linking miniruby.exe
generating encdb.h
make: *** [uncommon.mk:704: .rbconfig.time] Segmentation fault
make: *** Waiting for unfinished jobs....
encdb.h updated
==> ERROR: A failure occurred in build().
    Aborting...
Command exited with code 1
```

From a local build with no make `-j` parameter:

```
linking miniruby.exe
generating encdb.h
make: *** [uncommon.mk:704: .rbconfig.time] Segmentation fault
make: *** Waiting for unfinished jobs....
encdb.h updated
^[[1m^[[31m==> ERROR:^[[0;10m^[[1m A failure occurred in build().^[[0;10m
^[[1m    Aborting...^[[0;10m
```

Thanks, Greg



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83773] [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
       [not found] <redmine.issue-14108.20171115032856@ruby-lang.org>
  2017-11-15  3:28 ` [ruby-core:83772] [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769 Greg.mpls
@ 2017-11-15  3:47 ` usa
  2017-11-15  4:14   ` [ruby-core:83774] " Eric Wong
  1 sibling, 1 reply; 10+ messages in thread
From: usa @ 2017-11-15  3:47 UTC (permalink / raw
  To: ruby-core

Issue #14108 has been updated by usa (Usaku NAKAMURA).


I've just committed a workaround for now, because this hides other failures and errors on CI.

----------------------------------------
Bug #14108: Seg Fault with MinGW on svn 60769
https://bugs.ruby-lang.org/issues/14108#change-67810

* Author: MSP-Greg (Greg L)
* Status: Closed
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0dev (2017-11-15 trunk 60767) [x64-mingw32]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
The Noon JST ruby-loco MinGW build just failed.  Last build on svn 60767 was successful.

Error shown in log:

```
linking miniruby.exe
generating encdb.h
make: *** [uncommon.mk:704: .rbconfig.time] Segmentation fault
make: *** Waiting for unfinished jobs....
encdb.h updated
==> ERROR: A failure occurred in build().
    Aborting...
Command exited with code 1
```

From a local build with no make `-j` parameter:

```
linking miniruby.exe
generating encdb.h
make: *** [uncommon.mk:704: .rbconfig.time] Segmentation fault
make: *** Waiting for unfinished jobs....
encdb.h updated
^[[1m^[[31m==> ERROR:^[[0;10m^[[1m A failure occurred in build().^[[0;10m
^[[1m    Aborting...^[[0;10m
```

Thanks, Greg



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83774] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  3:47 ` [ruby-core:83773] " usa
@ 2017-11-15  4:14   ` Eric Wong
  2017-11-15  4:33     ` [ruby-core:83775] " U.NAKAMURA
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Wong @ 2017-11-15  4:14 UTC (permalink / raw
  To: ruby-core

usa@garbagecollect.jp wrote:
> I've just committed a workaround for now, because this hides other failures and errors on CI.

Oops, I will need to modify (or revert) r60769 anyways.
It may be possible for concurrent threads to share Dir objects
and we were relying on GVL for protect pure-Ruby code from
stepping over each other.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83775] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  4:14   ` [ruby-core:83774] " Eric Wong
@ 2017-11-15  4:33     ` U.NAKAMURA
  2017-11-15  4:41       ` [ruby-core:83776] " Nobuyoshi Nakada
  2017-11-15  7:12       ` [ruby-core:83779] " Eric Wong
  0 siblings, 2 replies; 10+ messages in thread
From: U.NAKAMURA @ 2017-11-15  4:33 UTC (permalink / raw
  To: Ruby developers

Hi, Eric

In message "[ruby-core:83774] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769"
  on Wed, 15 Nov 2017 04:14:37 +0000, normalperson@yhbt.net wrote:
> Oops, I will need to modify (or revert) r60769 anyways.
> It may be possible for concurrent threads to share Dir objects
> and we were relying on GVL for protect pure-Ruby code from
> stepping over each other.

I've not checked this problem deeply yet.

You can see our implementation of readdir on Windows in
win32/win32.c (rb_w32_readdir).
I doubt the second argument (rb_encoding *enc) is the cause.


Regards,
-- 
U.Nakamaura <usa@garbagecollect.jp>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83776] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  4:33     ` [ruby-core:83775] " U.NAKAMURA
@ 2017-11-15  4:41       ` Nobuyoshi Nakada
  2017-11-15  7:12       ` [ruby-core:83779] " Eric Wong
  1 sibling, 0 replies; 10+ messages in thread
From: Nobuyoshi Nakada @ 2017-11-15  4:41 UTC (permalink / raw
  To: Ruby developers, U.NAKAMURA

On 2017/11/15 13:33, U.NAKAMURA wrote:
> In message "[ruby-core:83774] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769"
>   on Wed, 15 Nov 2017 04:14:37 +0000, normalperson@yhbt.net wrote:
>> Oops, I will need to modify (or revert) r60769 anyways.
>> It may be possible for concurrent threads to share Dir objects
>> and we were relying on GVL for protect pure-Ruby code from
>> stepping over each other.
>
> I've not checked this problem deeply yet.
In mswin/mingw version, command line globbing runs before VM initialization.
`GET_EC()` returns NULL and `GET_THREAD()` segfaults in `call_without_gvl()`.

-- 
Nobu Nakada

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83779] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  4:33     ` [ruby-core:83775] " U.NAKAMURA
  2017-11-15  4:41       ` [ruby-core:83776] " Nobuyoshi Nakada
@ 2017-11-15  7:12       ` Eric Wong
  2017-11-15  7:41         ` [ruby-core:83781] " U.NAKAMURA
  1 sibling, 1 reply; 10+ messages in thread
From: Eric Wong @ 2017-11-15  7:12 UTC (permalink / raw
  To: ruby-core

"U.NAKAMURA" <usa@garbagecollect.jp> wrote:
> Hi, Eric
> 
> In message "[ruby-core:83774] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769"
>   on Wed, 15 Nov 2017 04:14:37 +0000, normalperson@yhbt.net wrote:
> > Oops, I will need to modify (or revert) r60769 anyways.
> > It may be possible for concurrent threads to share Dir objects
> > and we were relying on GVL for protect pure-Ruby code from
> > stepping over each other.
> 
> I've not checked this problem deeply yet.
> 
> You can see our implementation of readdir on Windows in
> win32/win32.c (rb_w32_readdir).
> I doubt the second argument (rb_encoding *enc) is the cause.

Right, enc is no problem.  The problem is using DIR * from stdio
is not reentrant or thread-safe.  readdir_r is deprecated, even,
and so it is up to the application to provide thread-safety when
reading directories.

Usually on modern platforms (but not guaranteed), readdir on
different DIR * pointers is safe from multiple threads, so maybe
we can add a mutex to "struct dir_data".

I will revert r60772, r60770, and r60769 for now...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83781] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  7:12       ` [ruby-core:83779] " Eric Wong
@ 2017-11-15  7:41         ` U.NAKAMURA
  2017-11-15  9:20           ` [ruby-core:83782] " Eric Wong
  0 siblings, 1 reply; 10+ messages in thread
From: U.NAKAMURA @ 2017-11-15  7:41 UTC (permalink / raw
  To: Ruby developers

Hi, Eric,

In message "[ruby-core:83779] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769"
  on Wed, 15 Nov 2017 07:12:42 +0000, normalperson@yhbt.net wrote:
> > You can see our implementation of readdir on Windows in
> > win32/win32.c (rb_w32_readdir).
> > I doubt the second argument (rb_encoding *enc) is the cause.

As Nobu said at [ruby-core:83776] (thank you Nobu!), my guess was wrong
and the cause was touching GVL before running VM.
Therefore r60770 is the right way to resolve the problem.


> Right, enc is no problem.  The problem is using DIR * from stdio
> is not reentrant or thread-safe.  readdir_r is deprecated, even,
> and so it is up to the application to provide thread-safety when
> reading directories.

Our readdir implementation for Win32 is expected to be thread-safe.
And, it seems that there is no platform which uses non thread-safe
readdir.


> Usually on modern platforms (but not guaranteed), readdir on
> different DIR * pointers is safe from multiple threads, so maybe
> we can add a mutex to "struct dir_data".
> 
> I will revert r60772, r60770, and r60769 for now...

So, IMO you don't need to revert them.
(but this mail is too late...)


Regards,
-- 
U.Nakamaura <usa@garbagecollect.jp>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83782] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  7:41         ` [ruby-core:83781] " U.NAKAMURA
@ 2017-11-15  9:20           ` Eric Wong
  2017-11-15  9:55             ` [ruby-core:83783] " Eric Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Wong @ 2017-11-15  9:20 UTC (permalink / raw
  To: ruby-core

"U.NAKAMURA" <usa@garbagecollect.jp> wrote:
> Hi, Eric,
> 
> In message "[ruby-core:83779] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769"
>   on Wed, 15 Nov 2017 07:12:42 +0000, normalperson@yhbt.net wrote:
> > > You can see our implementation of readdir on Windows in
> > > win32/win32.c (rb_w32_readdir).
> > > I doubt the second argument (rb_encoding *enc) is the cause.
> 
> As Nobu said at [ruby-core:83776] (thank you Nobu!), my guess was wrong
> and the cause was touching GVL before running VM.
> Therefore r60770 is the right way to resolve the problem.

Right, that was a real bug, too.

> > Right, enc is no problem.  The problem is using DIR * from stdio
> > is not reentrant or thread-safe.  readdir_r is deprecated, even,
> > and so it is up to the application to provide thread-safety when
> > reading directories.
> 
> Our readdir implementation for Win32 is expected to be thread-safe.
> And, it seems that there is no platform which uses non thread-safe
> readdir.

OK, but do we know all platforms?  Reading glibc, it seems
readdir(3) is thread-safe (but not reentrant) at dirstream level
(see below)

> > Usually on modern platforms (but not guaranteed), readdir on
> > different DIR * pointers is safe from multiple threads, so maybe
> > we can add a mutex to "struct dir_data".
> > 
> > I will revert r60772, r60770, and r60769 for now...
> 
> So, IMO you don't need to revert them.
> (but this mail is too late...)

Linux readdir(3) man-page says:

  In  the  current POSIX.1 specification (POSIX.1-2008), readdir() is not
  required  to  be  thread-safe.   However,  in  modern   implementations
  (including  the  glibc  implementation),  concurrent calls to readdir()
  that specify different directory streams  are  thread-safe.   In  cases
  where  multiple threads must read from the same directory stream, using
  readdir() with external synchronization is still preferable to the  use
  of  the deprecated readdir_r(3) function.  It is expected that a future
  version of POSIX.1 will require that readdir() be thread-safe when con‐
  currently employed on different directory streams.

So based on that paragraph alone, my revert seems correct.
Ruby Dir#read method can be called in multiple threads on the
same Dir object.  That is dangerous.

However; when reading glibc source[1], I noticed every DIR stream has
a lock which guards readdir calls since 1996[2].

So maybe whitelist glibc (and maybe win32) implementations
for GVL release on readdir?


[1] git://sourceware.org/git/glibc.git - sysdeps/posix/readdir.c
[2] commit c150923988933b5db75a974d4cc08cd7f7aaf3dc in glibc[1]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83783] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  9:20           ` [ruby-core:83782] " Eric Wong
@ 2017-11-15  9:55             ` Eric Wong
  2017-11-17 21:48               ` [ruby-core:83808] " Eric Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Wong @ 2017-11-15  9:55 UTC (permalink / raw
  To: ruby-core

Eric Wong <normalperson@yhbt.net> wrote:
> However; when reading glibc source[1], I noticed every DIR stream has
> a lock which guards readdir calls since 1996[2].
> 
> So maybe whitelist glibc (and maybe win32) implementations
> for GVL release on readdir?

No on whitelist.   Actual dirent returned by readdir can still
be clobbered despite that lock.  That glibc lock is only to protect
internal glibc structures, not dirent data the user sees.
GVL to protect readdir is probably still the best way going
forward.  We may introduce new lock(s) (while releasing GVL),
but it's too complex and late for 2.5.

> [1] git://sourceware.org/git/glibc.git - sysdeps/posix/readdir.c
> [2] commit c150923988933b5db75a974d4cc08cd7f7aaf3dc in glibc[1]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ruby-core:83808] Re: [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769
  2017-11-15  9:55             ` [ruby-core:83783] " Eric Wong
@ 2017-11-17 21:48               ` Eric Wong
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2017-11-17 21:48 UTC (permalink / raw
  To: ruby-core

Btw, reiterating what I wrote privately, the correct way
to do this would be:

> Dir#read:
>	release GVL
>	acquire dir_data.lock
>	readdir() call /* dirent pointer returned */
>	copy dirent result to tmp
>	release dir_data.lock /* done using dirent */
>	acquire GVL
>	make String from tmp

Too much engineering effort and the performance hit would suck
(even more than it does now).

We may consider using non-portable getdents/getdents64 syscalls on
some platforms (Linux, FreeBSD) directly for Ruby 2.6/3.x.  That
would allow us to avoid introducing dir_data.lock and extra copy,
but it's too late for 2.5.

readdir(3) is high-level stdio crap; we've already purged most
of the stdio overhead in io.c for 1.9+; so maybe we can continue
that trend in dir.c.  Of course, read(2)/write(2) are portable
while getdents(2) isn't :<

However, opendir can safely release GVL we should be able to
get that into 2.5...

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-11-17 21:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <redmine.issue-14108.20171115032856@ruby-lang.org>
2017-11-15  3:28 ` [ruby-core:83772] [Ruby trunk Bug#14108] Seg Fault with MinGW on svn 60769 Greg.mpls
2017-11-15  3:47 ` [ruby-core:83773] " usa
2017-11-15  4:14   ` [ruby-core:83774] " Eric Wong
2017-11-15  4:33     ` [ruby-core:83775] " U.NAKAMURA
2017-11-15  4:41       ` [ruby-core:83776] " Nobuyoshi Nakada
2017-11-15  7:12       ` [ruby-core:83779] " Eric Wong
2017-11-15  7:41         ` [ruby-core:83781] " U.NAKAMURA
2017-11-15  9:20           ` [ruby-core:83782] " Eric Wong
2017-11-15  9:55             ` [ruby-core:83783] " Eric Wong
2017-11-17 21:48               ` [ruby-core:83808] " Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).