unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* bind(2): Missing [[gnu::nonnull]]
@ 2022-12-03 15:33 Alejandro Colomar via Libc-alpha
  2022-12-03 15:55 ` Xi Ruoyao via Libc-alpha
  0 siblings, 1 reply; 9+ messages in thread
From: Alejandro Colomar via Libc-alpha @ 2022-12-03 15:33 UTC (permalink / raw)
  To: GNU C Library


[-- Attachment #1.1: Type: text/plain, Size: 309 bytes --]

Hi!

I'm documenting NULLness of parameters in the Linux man-pages.  While doing 
that, I noticed bind(2) is not prototyped with nonnull, but I don't think it 
makes sense to accept NULL.  Is it a mistake?  Should I send a patch for adding it?

Cheers,
Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-03 15:33 bind(2): Missing [[gnu::nonnull]] Alejandro Colomar via Libc-alpha
@ 2022-12-03 15:55 ` Xi Ruoyao via Libc-alpha
  2022-12-03 16:41   ` Alejandro Colomar via Libc-alpha
  2022-12-03 19:05   ` Andreas Schwab
  0 siblings, 2 replies; 9+ messages in thread
From: Xi Ruoyao via Libc-alpha @ 2022-12-03 15:55 UTC (permalink / raw)
  To: Alejandro Colomar, GNU C Library

On Sat, 2022-12-03 at 16:33 +0100, Alejandro Colomar via Libc-alpha
wrote:
> Hi!
> 
> I'm documenting NULLness of parameters in the Linux man-pages.  While doing 
> that, I noticed bind(2) is not prototyped with nonnull, but I don't think it 
> makes sense to accept NULL.  Is it a mistake?  Should I send a patch for adding it?

Hi Alejandro,

Currently the man page says:

EFAULT: addr points outside the user's accessible address space.

And bind(2) indeed sets errno to EFAULT and return -1 when NULL is
passed as addr.

gnu::nonnull is not only a diagnostic attribute: it also allows the
compiler to assume addr is never NULL.  i. e. if addr was gnu::nonnull
and bind(2) is called with addr == NULL, the behavior would be
undefined.

So this will be an API change.  Yes I agree calling bind with NULL does
not make any sense, but I guess we still need to keep API
"compatibility" with those nonsense code...

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-03 15:55 ` Xi Ruoyao via Libc-alpha
@ 2022-12-03 16:41   ` Alejandro Colomar via Libc-alpha
  2022-12-03 19:05   ` Andreas Schwab
  1 sibling, 0 replies; 9+ messages in thread
From: Alejandro Colomar via Libc-alpha @ 2022-12-03 16:41 UTC (permalink / raw)
  To: Xi Ruoyao, GNU C Library


[-- Attachment #1.1: Type: text/plain, Size: 1600 bytes --]

Hi Xi,

On 12/3/22 16:55, Xi Ruoyao wrote:
> On Sat, 2022-12-03 at 16:33 +0100, Alejandro Colomar via Libc-alpha
> wrote:
>> Hi!
>>
>> I'm documenting NULLness of parameters in the Linux man-pages.  While doing
>> that, I noticed bind(2) is not prototyped with nonnull, but I don't think it
>> makes sense to accept NULL.  Is it a mistake?  Should I send a patch for adding it?
> 
> Hi Alejandro,
> 
> Currently the man page says:
> 
> EFAULT: addr points outside the user's accessible address space.
> 
> And bind(2) indeed sets errno to EFAULT and return -1 when NULL is
> passed as addr.
> 
> gnu::nonnull is not only a diagnostic attribute: it also allows the
> compiler to assume addr is never NULL.  i. e. if addr was gnu::nonnull
> and bind(2) is called with addr == NULL, the behavior would be
> undefined.
> 
> So this will be an API change.  Yes I agree calling bind with NULL does
> not make any sense, but I guess we still need to keep API
> "compatibility" with those nonsense code...
> 

Hmmm, makes sense.  What I'll do (unless someone opposes and suggests that I do 
otherwise), though, is ignore that problem in the man-pages.  I'm documenting 
_Nullable instead of _Nonnull, since there are much less places where it needs 
to be added, so what I'll do is be silent in the case of bind(2).  I'm not 
specifying _Nonnull, so it's not a lie, but I'm not suggesting programmers that 
they should be careless about NULL, which I wouldn't want to do.

Thanks for the prompt answer!

Cheers,

Alex


-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-03 15:55 ` Xi Ruoyao via Libc-alpha
  2022-12-03 16:41   ` Alejandro Colomar via Libc-alpha
@ 2022-12-03 19:05   ` Andreas Schwab
  2022-12-03 19:12     ` Alejandro Colomar via Libc-alpha
  2022-12-04  5:59     ` Xi Ruoyao via Libc-alpha
  1 sibling, 2 replies; 9+ messages in thread
From: Andreas Schwab @ 2022-12-03 19:05 UTC (permalink / raw)
  To: Xi Ruoyao via Libc-alpha; +Cc: Alejandro Colomar, Xi Ruoyao

On Dez 03 2022, Xi Ruoyao via Libc-alpha wrote:

> Currently the man page says:
>
> EFAULT: addr points outside the user's accessible address space.
>
> And bind(2) indeed sets errno to EFAULT and return -1 when NULL is
> passed as addr.

You can never depend on EFAULT for invalid addresses.

> gnu::nonnull is not only a diagnostic attribute: it also allows the
> compiler to assume addr is never NULL.  i. e. if addr was gnu::nonnull
> and bind(2) is called with addr == NULL, the behavior would be
> undefined.

It is already undefined now, so this would be a valid change.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-03 19:05   ` Andreas Schwab
@ 2022-12-03 19:12     ` Alejandro Colomar via Libc-alpha
  2022-12-04  5:59     ` Xi Ruoyao via Libc-alpha
  1 sibling, 0 replies; 9+ messages in thread
From: Alejandro Colomar via Libc-alpha @ 2022-12-03 19:12 UTC (permalink / raw)
  To: Andreas Schwab, Xi Ruoyao via Libc-alpha; +Cc: Xi Ruoyao


[-- Attachment #1.1: Type: text/plain, Size: 1726 bytes --]

Hi Andreas,

On 12/3/22 20:05, Andreas Schwab wrote:
> On Dez 03 2022, Xi Ruoyao via Libc-alpha wrote:
> 
>> Currently the man page says:
>>
>> EFAULT: addr points outside the user's accessible address space.
>>
>> And bind(2) indeed sets errno to EFAULT and return -1 when NULL is
>> passed as addr.
> 
> You can never depend on EFAULT for invalid addresses.
> 
>> gnu::nonnull is not only a diagnostic attribute: it also allows the
>> compiler to assume addr is never NULL.  i. e. if addr was gnu::nonnull
>> and bind(2) is called with addr == NULL, the behavior would be
>> undefined.
> 
> It is already undefined now, so this would be a valid change.

Hmm, if so, please CC me on any such changes.  I'm interested in them.  So far 
I'm being very careful about it, with the following approach:

I'm using _Nullable (Clang syntax), which is less invasive (there are very few 
calls that would need it, compared to either _Nonnull or [[gnu::nonnull]]). 
Also, in cases like this one (bind(2)), I can leave the prototype untouched, so 
I'm not really saying it's nonnull (but I'm implying it very much).  But 
certainly, I won't be adding _Nullable to functions like bind(2).

However, since there are a lot of libc syscall wrappers (and maybe functions, 
but I didn't yet arrive to that, so don't know) that don't specify __nonnull 
when they should, my work is very manual, and might make some mistakes.  I can 
send the patches to anyone that want to have a look at them before I push them 
(please tell me if so), but there will be many changes, and I'd prefer if I 
could just follow the glibc qualifiers.

Thank you!


Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-03 19:05   ` Andreas Schwab
  2022-12-03 19:12     ` Alejandro Colomar via Libc-alpha
@ 2022-12-04  5:59     ` Xi Ruoyao via Libc-alpha
  2022-12-04 11:14       ` Alejandro Colomar via Libc-alpha
  1 sibling, 1 reply; 9+ messages in thread
From: Xi Ruoyao via Libc-alpha @ 2022-12-04  5:59 UTC (permalink / raw)
  To: Andreas Schwab, Xi Ruoyao via Libc-alpha; +Cc: Alejandro Colomar

On Sat, 2022-12-03 at 20:05 +0100, Andreas Schwab wrote:
> > Currently the man page says:
> > 
> > EFAULT: addr points outside the user's accessible address space.
> > 
> > And bind(2) indeed sets errno to EFAULT and return -1 when NULL is
> > passed as addr.
> 
> You can never depend on EFAULT for invalid addresses.

Hmm, is this documented somewhere?

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-04  5:59     ` Xi Ruoyao via Libc-alpha
@ 2022-12-04 11:14       ` Alejandro Colomar via Libc-alpha
  2022-12-04 18:46         ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 9+ messages in thread
From: Alejandro Colomar via Libc-alpha @ 2022-12-04 11:14 UTC (permalink / raw)
  To: Xi Ruoyao, Andreas Schwab, Xi Ruoyao via Libc-alpha


[-- Attachment #1.1: Type: text/plain, Size: 1684 bytes --]

Hi Xi,

On 12/4/22 06:59, Xi Ruoyao wrote:
> On Sat, 2022-12-03 at 20:05 +0100, Andreas Schwab wrote:
>>> Currently the man page says:
>>>
>>> EFAULT: addr points outside the user's accessible address space.
>>>
>>> And bind(2) indeed sets errno to EFAULT and return -1 when NULL is
>>> passed as addr.
>>
>> You can never depend on EFAULT for invalid addresses.
> 
> Hmm, is this documented somewhere?

I don't know, but let me have an educated guess:

Holding a pointer to invalid memory is Undefined Behavior by the standard, 
except if that pointer is NULL, or is still indeterminate because the pointer 
has not yet been initialized with a valid address.  Using an uninitialized 
pointer is UB as using any uninitialized variable.  Using a NULL pointer is only 
okay for comparisons, or as a sentinel value, but never for accessing memory. 
So chances are high that the program will already have invoked UB at the time 
bind(2) is called with an invalid address.

I wonder what's the rationale for the kernel reporting EFAULT; I don't seem to 
make any sense of it.  If a program tries to access memory with an invalid 
pointer, the kernel will crash it with SEGV, but if the same program tries that 
the kernel accesses the same memory with the same invalid pointer, it will 
receive an error code and continue running fine; that's not coherent or 
consistent.  If I were the kernel I'd just do in bind(2) (and in many other 
syscalls that are similar):

if (invalid_pointer(addr))
     crash_program();

That would probably help find many hidden cases of UB around the world.

Cheers,

Alex


-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-04 11:14       ` Alejandro Colomar via Libc-alpha
@ 2022-12-04 18:46         ` Florian Weimer via Libc-alpha
  2022-12-05 18:53           ` Zack Weinberg via Libc-alpha
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Weimer via Libc-alpha @ 2022-12-04 18:46 UTC (permalink / raw)
  To: Alejandro Colomar via Libc-alpha
  Cc: Xi Ruoyao, Andreas Schwab, Alejandro Colomar

* Alejandro Colomar via Libc-alpha:

> Hi Xi,
>
> On 12/4/22 06:59, Xi Ruoyao wrote:
>> On Sat, 2022-12-03 at 20:05 +0100, Andreas Schwab wrote:
>>>> Currently the man page says:
>>>>
>>>> EFAULT: addr points outside the user's accessible address space.
>>>>
>>>> And bind(2) indeed sets errno to EFAULT and return -1 when NULL is
>>>> passed as addr.
>>>
>>> You can never depend on EFAULT for invalid addresses.
>> Hmm, is this documented somewhere?
>
> I don't know, but let me have an educated guess:
>
> Holding a pointer to invalid memory is Undefined Behavior by the
> standard, except if that pointer is NULL, or is still indeterminate
> because the pointer has not yet been initialized with a valid address.
> Using an uninitialized pointer is UB as using any uninitialized
> variable.  Using a NULL pointer is only okay for comparisons, or as a
> sentinel value, but never for accessing memory. So chances are high
> that the program will already have invoked UB at the time bind(2) is
> called with an invalid address.

Currently, Linux does not report for vDSO-accelerated system calls, but
generates SIGSEGV.  We received bug reports when we added vDSO support
for time/gettimeofday/clock_gettime because some tests were relying on
the EFAULT behavior.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: bind(2): Missing [[gnu::nonnull]]
  2022-12-04 18:46         ` Florian Weimer via Libc-alpha
@ 2022-12-05 18:53           ` Zack Weinberg via Libc-alpha
  0 siblings, 0 replies; 9+ messages in thread
From: Zack Weinberg via Libc-alpha @ 2022-12-05 18:53 UTC (permalink / raw)
  To: libc-alpha

On 2022-12-04 1:46 PM, Florian Weimer via Libc-alpha wrote:
> * Alejandro Colomar via Libc-alpha:
>> On 12/4/22 06:59, Xi Ruoyao wrote:
>>> On Sat, 2022-12-03 at 20:05 +0100, Andreas Schwab wrote:
>>>>> Currently the man page says:
>>>>
>>>> You can never depend on EFAULT for invalid addresses.
>>> Hmm, is this documented somewhere?
>>
>> I don't know, but let me have an educated guess:
>>
>> Holding a pointer to invalid memory is Undefined Behavior by the
>> standard, except if that pointer is NULL, or is still indeterminate
>> because the pointer has not yet been initialized with a valid address.
>> Using an uninitialized pointer is UB as using any uninitialized
>> variable.  Using a NULL pointer is only okay for comparisons, or as a
>> sentinel value, but never for accessing memory. So chances are high
>> that the program will already have invoked UB at the time bind(2) is
>> called with an invalid address.
> 
> Currently, Linux does not report for vDSO-accelerated system calls, but
> generates SIGSEGV.  We received bug reports when we added vDSO support
> for time/gettimeofday/clock_gettime because some tests were relying on
> the EFAULT behavior.

I don't have time to dig through POSIX and find it now, but I'm like 95% 
sure there's explicit wording to the effect that the OS is allowed to 
send SIGSEGV under any circumstance that's documented to result in an 
EFAULT error, and vice versa.  That should be good enough for the 
_manpages_, but I think the _headers_ probably need to be more cautious 
about adding annotations that can cause compilers to draw new inferences 
about which control flow paths are unreachable.

I imagine this would be a hard sell with the "never break userspace" 
crowd but I would _like_ Linux to at least have a mode in which it would 
always send SIGSEGV rather than producing EFAULT.  It seems like it 
would be helpful for debugging.

zw

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-12-05 18:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-03 15:33 bind(2): Missing [[gnu::nonnull]] Alejandro Colomar via Libc-alpha
2022-12-03 15:55 ` Xi Ruoyao via Libc-alpha
2022-12-03 16:41   ` Alejandro Colomar via Libc-alpha
2022-12-03 19:05   ` Andreas Schwab
2022-12-03 19:12     ` Alejandro Colomar via Libc-alpha
2022-12-04  5:59     ` Xi Ruoyao via Libc-alpha
2022-12-04 11:14       ` Alejandro Colomar via Libc-alpha
2022-12-04 18:46         ` Florian Weimer via Libc-alpha
2022-12-05 18:53           ` Zack Weinberg via Libc-alpha

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).