unofficial mirror of libc-alpha@sourceware.org
* sparc vs sparc64: O_NDELAY and O_NONBLOCK mismatch in kernel and in glibc
@ 2020-05-29  9:40 Sergei Trofimovich via Libc-alpha
  2020-05-29 10:48 ` John Paul Adrian Glaubitz
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Sergei Trofimovich via Libc-alpha @ 2020-05-29  9:40 UTC
  To: sparclinux, libc-alpha
  Cc: sparc, David S. Miller, Michał Górny, toolchain

On most targets glibc defines O_NDELAY as O_NONBLOCK.

glibc's manual/llio.texi says they are supposed to be equal:

"""
@deftypevr Macro int O_NDELAY
@standards{BSD, fcntl.h}
This is an obsolete name for @code{O_NONBLOCK}, provided for
compatibility with BSD.  It is not defined by the POSIX.1 standard.
@end deftypevr
"""

A number of packages rely on this equality and find that the assumption
breaks on sparc in unusual ways. Recently it popped up as:
    https://github.com/eventlet/eventlet/pull/615
Older workarounds:
    https://github.com/libuv/libuv/issues/1830

What is more confusing for me:

The Linux kernel's uapi definition of O_NDELAY is ABI-dependent:
  arch/sparc/include/uapi/asm/fcntl.h
"""
#if defined(__sparc__) && defined(__arch64__)
#define O_NDELAY        0x0004
#else
#define O_NDELAY        (0x0004 | O_NONBLOCK)
#endif
"""

while glibc's is not:
  sysdeps/unix/sysv/linux/sparc/bits/fcntl.h
"""
#define O_NONBLOCK      0x4000
#define O_NDELAY        (0x0004 | O_NONBLOCK)
"""

Spot-checking the preprocessor's output seems to corroborate this:

"""
$ printf "#include <sys/fcntl.h>\n int o_ndelay = O_NDELAY; int o_nonblock = O_NONBLOCK;" | sparc-unknown-linux-gnu-gcc -E -x c - | fgrep -A3 o_
int o_ndelay =
               (0x0004 | 0x4000)
                       ; int o_nonblock =
                                          0x4000

$ printf "#include <sys/fcntl.h>\n int o_ndelay = O_NDELAY; int o_nonblock = O_NONBLOCK;" | sparc64-unknown-linux-gnu-gcc -E -x c - | fgrep -A3 o_

int o_ndelay =
               (0x0004 | 0x4000)
                       ; int o_nonblock =
                                          0x4000
"""

I think this skew causes strange effects when you run a sparc32
binary on a sparc64 kernel (compared to a sparc32 binary on a sparc32
kernel), as the kernel disagrees with userspace on the definition of
O_NDELAY.

https://github.com/libuv/libuv/issues/1830 has more details.

I tried to trace the O_NDELAY definition and stopped at linux-2.1.29:
  https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/diff/include/asm-sparc/fcntl.h?id=b7b4d2d2c1809575374269e14d86ee1953bd168c
which brought O_NDELAY closer to O_NONBLOCK but did not make them
match exactly.

Question time:

1. Why is sparc32 special? Does it have something to do with
   compatibility to other OSes of that time? (Solaris? BSD?)

   fs/fcntl.c has kernel handling:
        /* required for strict SunOS emulation */
        if (O_NONBLOCK != O_NDELAY)
               if (arg & O_NDELAY)
                   arg |= O_NONBLOCK;
   but why does it leak to the userspace header definition?

   I think it should not.

2. Should sparc64-glibc change its definition? Say, from
    #define O_NDELAY        (0x0004 | O_NONBLOCK)
   to
    #define O_NDELAY        O_NONBLOCK

    I think it should.

3. Should sparc32-linux (and glibc) change its definition? Say, from
   #if defined(__sparc__) && defined(__arch64__)
   #define O_NDELAY        0x0004
   #else
   #define O_NDELAY        (0x0004 | O_NONBLOCK)
   #endif
  to
   #define O_NDELAY        (0x0004 | O_NONBLOCK)
  or even to 
  #define O_NDELAY        O_NONBLOCK
  and make sure kernel maps old O_NDELAY to O_NONBLOCK?

  I think '#define O_NDELAY O_NONBLOCK' would be most
  consistent.

What do you think?

Thanks!

-- 

  Sergei


* Re: sparc vs sparc64: O_NDELAY and O_NONBLOCK mismatch in kernel and in glibc
  2020-05-29  9:40 sparc vs sparc64: O_NDELAY and O_NONBLOCK mismatch in kernel and in glibc Sergei Trofimovich via Libc-alpha
@ 2020-05-29 10:48 ` John Paul Adrian Glaubitz
  2020-06-22 19:08 ` Adhemerval Zanella via Libc-alpha
  2020-08-11 23:42 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: John Paul Adrian Glaubitz @ 2020-05-29 10:48 UTC
  To: Sergei Trofimovich, sparclinux, libc-alpha
  Cc: sparc, David S. Miller, Michał Górny, toolchain

On 5/29/20 11:40 AM, Sergei Trofimovich wrote:
> On most targets glibc defines O_NDELAY as O_NONBLOCK.
> (...)
>   I think '#define O_NDELAY O_NONBLOCK' would be most
>   consistent.
> 
> What do you think?

Would this, by any chance, also fix some of the glibc testsuite failures
we are seeing on SPARC? [1]

Adrian

> [1] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=sparc64&ver=2.30-8&stamp=1589400224&raw=0

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


* Re: sparc vs sparc64: O_NDELAY and O_NONBLOCK mismatch in kernel and in glibc
  2020-05-29  9:40 sparc vs sparc64: O_NDELAY and O_NONBLOCK mismatch in kernel and in glibc Sergei Trofimovich via Libc-alpha
  2020-05-29 10:48 ` John Paul Adrian Glaubitz
@ 2020-06-22 19:08 ` Adhemerval Zanella via Libc-alpha
  2020-08-11 23:42 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2020-06-22 19:08 UTC
  To: Sergei Trofimovich, sparclinux, libc-alpha
  Cc: sparc, toolchain, David S. Miller, Michał Górny



On 29/05/2020 06:40, Sergei Trofimovich via Libc-alpha wrote:
> On most targets glibc defines O_NDELAY as O_NONBLOCK.
> 
> glibc's manual/llio.texi says they are supposed to be equal:
> 
> """
> @deftypevr Macro int O_NDELAY
> @standards{BSD, fcntl.h}
> This is an obsolete name for @code{O_NONBLOCK}, provided for
> compatibility with BSD.  It is not defined by the POSIX.1 standard.
> @end deftypevr
> """
> 
> A bunch of packages rely on it and find out that this assumption
> breaks on sparc in unusual ways. Recently it popped up as:
>     https://github.com/eventlet/eventlet/pull/615
> Older workarounds:
>     https://github.com/libuv/libuv/issues/1830
> 
> What is more confusing for me:
> 
> linux kernel's uapi definition of O_NDELAY is ABI-dependent:
>   arch/sparc/include/uapi/asm/fcntl.h
> """
> #if defined(__sparc__) && defined(__arch64__)
> #define O_NDELAY        0x0004
> #else
> #define O_NDELAY        (0x0004 | O_NONBLOCK)
> #endif
> """
> 
> while glibc's is not:
>   sysdeps/unix/sysv/linux/sparc/bits/fcntl.h
> """
> #define O_NONBLOCK      0x4000
> #define O_NDELAY        (0x0004 | O_NONBLOCK)
> """

Doing some archaeology, it seems that sparc32 originally defined
O_NDELAY as 0x0004, but changed it to (0x0004 | O_NONBLOCK)
in 2.1.29.

> 
> Spot-checking preprocessor's output that seems to corroborate:
> 
> """
> $ printf "#include <sys/fcntl.h>\n int o_ndelay = O_NDELAY; int o_nonblock = O_NONBLOCK;" | sparc-unknown-linux-gnu-gcc -E -x c - | fgrep -A3 o_
> int o_ndelay =
>                (0x0004 | 0x4000)
>                        ; int o_nonblock =
>                                           0x4000
> 
> $ printf "#include <sys/fcntl.h>\n int o_ndelay = O_NDELAY; int o_nonblock = O_NONBLOCK;" | sparc64-unknown-linux-gnu-gcc -E -x c - | fgrep -A3 o_
> 
> int o_ndelay =
>                (0x0004 | 0x4000)
>                        ; int o_nonblock =
>                                           0x4000
> """
> 
> I think this skew causes strange effects when you run sparc32
> binary on sparc64 kernel (compared to sparc32 binary on sparc32
> kernel) as kernel disagrees with userspace on O_NDELAY definition.
> 
> https://github.com/libuv/libuv/issues/1830 has more details.
> 
> I tried to trace the O_NDELAY definition and stopped at linux-2.1.29:
>   https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/diff/include/asm-sparc/fcntl.h?id=b7b4d2d2c1809575374269e14d86ee1953bd168c
> which brought O_NDELAY to O_NONBLOCK but did not make them
> match exactly.
> 
> Question time:
> 
> 1. Why is sparc32 special? Does it have something to do with
>    compatibility to other OSes of that time? (Solaris? BSD?)
> 
>    fs/fcntl.c has kernel handling:
>         /* required for strict SunOS emulation */
>         if (O_NONBLOCK != O_NDELAY)
>                if (arg & O_NDELAY)
>                    arg |= O_NONBLOCK;
>    but why does it leak to the userspace header definition?
> 
>    I think it should not.

It seems to provide some compatibility with SunOS, since on Solaris 11
O_NDELAY is 0x4 for both 32-bit and 64-bit.

> 
> 2. Should sparc64-glibc change its definition? Say, from
>     #define O_NDELAY        (0x0004 | O_NONBLOCK)
>    to
>     #define O_NDELAY        O_NONBLOCK
> 
>     I think it should.

With that change, the following sequence:

  fcntl(fd, F_SETFL, flags | O_NONBLOCK);
  flags = fcntl(fd, F_GETFL);
  fcntl(fd, F_SETFL, flags & ~O_NDELAY);

would not clear the flag.

> 
> 3. Should sparc32-linux (and glibc) change its definition? Say, from
>    #if defined(__sparc__) && defined(__arch64__)
>    #define O_NDELAY        0x0004
>    #else
>    #define O_NDELAY        (0x0004 | O_NONBLOCK)
>    #endif
>   to
>    #define O_NDELAY        (0x0004 | O_NONBLOCK)
>   or even to 
>   #define O_NDELAY        O_NONBLOCK
>   and make sure kernel maps old O_NDELAY to O_NONBLOCK?
> 
>   I think '#define O_NDELAY O_NONBLOCK' would be most
>   consistent.
> 
> What do you think?

I think the main issue here is in fact the historical inconsistency of
FIONBIO across different systems, which Linux originally tried to
accommodate:

fs/ioctl.c:

static int ioctl_fionbio(struct file *filp, int __user *argp)
{
        unsigned int flag;
        int on, error;

        error = get_user(on, argp);
        if (error)
                return error;
        flag = O_NONBLOCK;
#ifdef __sparc__
        /* SunOS compatibility item. */
        if (O_NONBLOCK != O_NDELAY)
                flag |= O_NDELAY;
#endif
        spin_lock(&filp->f_lock);
        if (on)
                filp->f_flags |= flag;
        else
                filp->f_flags &= ~flag;
        spin_unlock(&filp->f_lock);
        return error;
}

The issue on sparc is that FIONBIO will always try to set/reset *both*
flags at the same time. I think it would be better either to define
O_NDELAY and O_NONBLOCK to be the same value of 0x4004 on both sparc32
and sparc64 (since the kernel does treat them as semantically the same),
or to avoid using FIONBIO to set the socket non-blocking, in favor of
fcntl(fd, F_SETFL, ... O_NONBLOCK).


* Re: sparc vs sparc64: O_NDELAY and O_NONBLOCK mismatch in kernel and in glibc
  2020-05-29  9:40 sparc vs sparc64: O_NDELAY and O_NONBLOCK mismatch in kernel and in glibc Sergei Trofimovich via Libc-alpha
  2020-05-29 10:48 ` John Paul Adrian Glaubitz
  2020-06-22 19:08 ` Adhemerval Zanella via Libc-alpha
@ 2020-08-11 23:42 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2020-08-11 23:42 UTC
  To: slyfox; +Cc: sparclinux, sparc, libc-alpha, mgorny, toolchain

From: Sergei Trofimovich <slyfox@gentoo.org>
Date: Fri, 29 May 2020 10:40:19 +0100

> Question time:
> 
> 1. Why is sparc32 special? Does it have something to do with
>    compatibility to other OSes of that time? (Solaris? BSD?)
> 
>    fs/fcntl.c has kernel handling:
>         /* required for strict SunOS emulation */
>         if (O_NONBLOCK != O_NDELAY)
>                if (arg & O_NDELAY)
>                    arg |= O_NONBLOCK;
>    but why does it leak to the userspace header definition?
> 
>    I think it should not.

The original sparc value was meant to match the SunOS value
exactly in order to make the SunOS emulation support easier.

The current situation is a mess and I don't doubt that it accounts
for various kinds of weird behavior we've seen over the years :-/

