bug-gnulib@gnu.org mirror (unofficial)
 help / color / mirror / Atom feed
* *alloc-gnu on Solaris 11
@ 2021-05-14 17:04 Bruno Haible
  2021-05-15 14:04 ` *alloc-gnu on glibc Pádraig Brady
  0 siblings, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-05-14 17:04 UTC (permalink / raw)
  To: bug-gnulib

On Solaris 11.3 (the machine gcc211.fsffrance.org, in 64-bit mode), there
are test failures:
  FAIL: test-calloc-gnu
  FAIL: test-malloc-gnu
  FAIL: test-realloc-gnu
  FAIL: test-reallocarray
Each of these tests takes about 30 seconds before failing, and is not
immediately reactive to Ctrl-C.

$ time ./test-malloc-gnu
errno=11=EAGAIN
../../tests/test-malloc-gnu.c:42: assertion 'errno == ENOMEM' failed
Abort (Speicherabzug geschrieben)

real    0m28,908s
user    0m0,001s
sys     0m0,012s


$ truss ./test-malloc-gnu
...
brk(0x00000000)                                 = 0x100102248
brk(0x100102250)                                = 0x00000000
brk(0x100106250)                                = 0x00000000
brk(0x100106250)                                = 0x00000000
brk(0x8000000100106240)                         Err#11 EAGAIN  ; <== this takes 30 sec
brk(0x100106250)                                = 0x00000000
...

So, two things are wrong here:
  * We should enforce an errno ENOMEM, not EAGAIN.
    1. for compliance with POSIX.
    2. for a reasonable error message.
    3. so that programs don't attempt to repeat the call and thus bring the
       system to its knees.
  * We should avoid running such a resource-consuming test as part of
    "make check".

This patch fixes the first issue and, as a side effect, the second issue as
well.


2021-05-14  Bruno Haible  <bruno@clisp.org>

	malloc-gnu, realloc-gnu, calloc-gnu: Ensure errno gets set to ENOMEM.
	* m4/malloc.m4 (gl_CHECK_MALLOC_POSIX): Set gl_cv_func_malloc_posix to
	'no' also on Solaris.

diff --git a/m4/malloc.m4 b/m4/malloc.m4
index 6fcd4ad..972e808 100644
--- a/m4/malloc.m4
+++ b/m4/malloc.m4
@@ -1,4 +1,4 @@
-# malloc.m4 serial 26
+# malloc.m4 serial 27
 dnl Copyright (C) 2007, 2009-2021 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -113,7 +113,7 @@ AC_DEFUN([gl_FUNC_MALLOC_POSIX],
   fi
 ])
 
-# Test whether malloc, realloc, calloc set errno on failure.
+# Test whether malloc, realloc, calloc set errno to ENOMEM on failure.
 # Set gl_cv_func_malloc_posix to yes or no accordingly.
 AC_DEFUN([gl_CHECK_MALLOC_POSIX],
 [
@@ -129,9 +129,13 @@ AC_DEFUN([gl_CHECK_MALLOC_POSIX],
       case "$host_os" in
         mingw*)
           gl_cv_func_malloc_posix=no ;;
-        irix*)
-          dnl The three functions return NULL with errno unset when the
-          dnl argument is larger than PTRDIFF_MAX. Here is a test program:
+        irix* | solaris*)
+          dnl On IRIX 6.5, the three functions return NULL with errno unset
+          dnl when the argument is larger than PTRDIFF_MAX.
+          dnl On Solaris 11.3, the three functions return NULL with errno set
+          dnl to EAGAIN, not ENOMEM, when the argument is larger than
+          dnl PTRDIFF_MAX.
+          dnl Here is a test program:
 m4_divert_push([KILL])
 #include <errno.h>
 #include <stdio.h>



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: *alloc-gnu on glibc
  2021-05-14 17:04 *alloc-gnu on Solaris 11 Bruno Haible
@ 2021-05-15 14:04 ` Pádraig Brady
  2021-05-15 15:36   ` Paul Eggert
  2021-05-15 15:49   ` Bruno Haible
  0 siblings, 2 replies; 6+ messages in thread
From: Pádraig Brady @ 2021-05-15 14:04 UTC (permalink / raw)
  To: Bruno Haible, bug-gnulib

On glibc-2.31-5.fc32.x86_64 I'm seeing this test failure with coreutils,
since the tests are now checking for a specific errno.
I just checked a later Fedora 34 system which has the same issue.

Specifically test-realloc-gnu is enabled in coreutils
and it's failing as realloc is returning NULL as expected, but errno is 0.
Specifically on glibc realloc(NULL, PTRDIFF_MAX+1) does return 0,errno=ENOMEM,
but realloc(non_null, PTRDIFF_MAX+1) returns 0,errno=0.

I'm wondering should we be more relaxed here as,
there is only one standardised ENOMEM error from realloc,
so code doesn't have to behave differently based on what the errno is.
Especially if it's using xrealloc etc.
Now I accept that error messages may be more accurate,
but one should opt into that I think.

Should we adjust things so depending on realloc-gnu
would not have the more stringent requirement of ENOMEM
being returned from realloc(), and for projects that wanted that,
they could depend on the realloc-posix module instead?

In any case I'd be in favor of relaxing the test,
rather than replacing realloc everywhere.

some notes on solaris below...

On 14/05/2021 18:04, Bruno Haible wrote:
> On Solaris 11.3 (the machine gcc211.fsffrance.org, in 64-bit mode), there
> are test failures:
>    FAIL: test-calloc-gnu
>    FAIL: test-malloc-gnu
>    FAIL: test-realloc-gnu
>    FAIL: test-reallocarray
> Each of these tests takes about 30 seconds before failing, and is not
> immediately reactive to Ctrl-C.

These EAGAIN failure modes sound like the value passed to realloc() etc.
is too small to ensure ENOMEM was return immediately.
I've not looked into how the compiler for the m4 test is invoked,
but I did check the code in isolation on a 64 bit solaris 10 system.
The following shows that it's important to ensure the compiler
is explicitly running in 64 bit mode, as the default is 32 bit,
resulting in 32 bit longs and hence too small of values passed in:

 > gcc malloc-m4.c
 > ./a.out
PTRDIFF_MAX = 2147483647
p=20f90 errno=0
p=0 errno=12
p=0 errno=12

 > gcc -m64 malloc-m4.c
 > ./a.out
PTRDIFF_MAX = 9223372036854775807
p=0 errno=12
p=0 errno=12
p=0 errno=12

Both the above returned immediately for me.
Though I also notice the existing note in malloc.m4 on this
check being too dangerous to run in general due to this issue.

cheers,
Pádraig


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: *alloc-gnu on glibc
  2021-05-15 14:04 ` *alloc-gnu on glibc Pádraig Brady
@ 2021-05-15 15:36   ` Paul Eggert
  2021-05-15 15:49   ` Bruno Haible
  1 sibling, 0 replies; 6+ messages in thread
From: Paul Eggert @ 2021-05-15 15:36 UTC (permalink / raw)
  To: Pádraig Brady; +Cc: bug-gnulib, Bruno Haible

On 5/15/21 9:04 AM, Pádraig Brady wrote:
>
> I'm wondering should we be more relaxed here as,
> there is only one standardised ENOMEM error from realloc,
> so code doesn't have to behave differently based on what the errno is. 

We've already relaxed the errno test for reallocarray, where NetBSD 
fails with EOVERFLOW on ptrdiff_t overflow. (My own feeling is that 
NetBSD is right and glibc is wrong here; perhaps I should file a glibc 
bug report.)

The allocator modules are in some sense backwards. malloc-gnu requires 
malloc-posix because GNU guarantees are a superset of POSIX guarantees. 
And yet GNU programs often don't care about the POSIX guarantee of errno 
being set, whereas some do care about the GNU guarantee that malloc 
returns NULL only on failure.

> Now I accept that error messages may be more accurate,
> but one should opt into that I think. 

That opt-in was what malloc-posix was supposed to be.

I see three options:

1. Make the realloc-gnu test more generous.

2. Fix realloc-posix to check for this glibc bug and replace realloc if 
the test fails.

3. Complicate the set of Gnulib allocator modules so that apps can 
more-accurately express their needs.

Although (3) is in some sense the "best" it's probably too complicated. 
What we have is already too complicated. I'd rather have just malloc-gnu 
and not worry about non-GNU behavior (that's the Gnulib way, right?).

(2) is surely the right way to go in the long run. The glibc behavior is 
simply a bug.

In the short run I can see the temptation of (1).



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: *alloc-gnu on glibc
  2021-05-15 14:04 ` *alloc-gnu on glibc Pádraig Brady
  2021-05-15 15:36   ` Paul Eggert
@ 2021-05-15 15:49   ` Bruno Haible
  2021-05-15 17:08     ` Pádraig Brady
  1 sibling, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-05-15 15:49 UTC (permalink / raw)
  To: Pádraig Brady; +Cc: bug-gnulib

Hi Pádraig,

> On glibc-2.31-5.fc32.x86_64 I'm seeing this test failure with coreutils,
> since the tests are now checking for a specific errno.
> I just checked a later Fedora 34 system which has the same issue.

For me, a testdir created through

   ./gnulib-tool --create-testdir --dir=../testdir --single-configure \
                  malloc-gnu realloc-gnu calloc-gnu reallocarray

passes all tests on Fedora 32 (glibc 2.31) and Fedora 33 (glibc 2.32).

> Specifically test-realloc-gnu is enabled in coreutils
> and it's failing as realloc is returning NULL as expected, but errno is 0.
> Specifically on glibc realloc(NULL, PTRDIFF_MAX+1) does return 0,errno=ENOMEM,
> but realloc(non_null, PTRDIFF_MAX+1) returns 0,errno=0.

POSIX <https://pubs.opengroup.org/onlinepubs/9699919799/functions/realloc.html>
says two things:

  1) By enumerating the error values, it implies that when realloc() fails,
     errno gets set to an error number. See
     <https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_03>
  2) "If there is not enough available memory, realloc() shall return a null
      pointer and set errno to [ENOMEM]."

If realloc() is failing but does not set errno, or sets errno to 0, that's a
POSIX violation. To be reported in the glibc bugzilla.

If realloc() is failing and sets errno to some value != ENOMEM, that is
technically valid (since you can debate which is the "cause" of the failure
and which errno value provides the best information). Personally I think
that
  - for an argument > PTRDIFF_MAX, EOVERFLOW is a reasonable error code,
  - EAGAIN is not a good error code, since "Out of memory" is a better error
    message than "Resource temporarily unavailable".

> In any case I'd be in favor of relaxing the test,

I'm in favour of reporting the POSIX violation in the first place.

> some notes on solaris below...
> ...
> but I did check the code in isolation on a 64 bit solaris 10 system.
> ...
> The following shows that it's important to ensure the compiler
> is explicitly running in 64 bit mode, as the default is 32 bit,

My knowledge about how to invoke the compiler is here:
https://gitlab.com/ghwiki/gnow-how/-/wikis/Platforms/Configuration
If you think some of the table entries are wrong, please say so.

> Both the above returned immediately for me.

Solaris 11 apparently behaves differently than Solaris 10.

Bruno



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: *alloc-gnu on glibc
  2021-05-15 15:49   ` Bruno Haible
@ 2021-05-15 17:08     ` Pádraig Brady
  2021-05-15 18:07       ` Bruno Haible
  0 siblings, 1 reply; 6+ messages in thread
From: Pádraig Brady @ 2021-05-15 17:08 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 820 bytes --]

On 15/05/2021 16:49, Bruno Haible wrote:
> Hi Pádraig,
> 
>> On glibc-2.31-5.fc32.x86_64 I'm seeing this test failure with coreutils,
>> since the tests are now checking for a specific errno.
>> I just checked a later Fedora 34 system which has the same issue.
> 
> For me, a testdir created through
> 
>     ./gnulib-tool --create-testdir --dir=../testdir --single-configure \
>                    malloc-gnu realloc-gnu calloc-gnu reallocarray
> 
> passes all tests on Fedora 32 (glibc 2.31) and Fedora 33 (glibc 2.32).

Oh. Ah this issue is restricted to glibc's MALLOC_CHECK_ implementation.
The attached avoids this part of the test if it's being used.
I've reported the glibc bug at:
https://sourceware.org/bugzilla/show_bug.cgi?id=27870

With this applied, your command above passes all tests.

cheers,
Pádraig

[-- Attachment #2: gnulib-realloc-check.patch --]
[-- Type: text/x-patch, Size: 1796 bytes --]

From defe5bebcd1e2bb75f15eb54ae1135afaae3c477 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <P@draigBrady.com>
Date: Sat, 15 May 2021 17:50:33 +0100
Subject: [PATCH] realloc-gnu: avoid glibc MALLOC_CHECK_ issue

* tests/test-realloc-gnu.c (main): if MALLOC_CHECK_ env var
is set then don't check ENOMEM is returned from realloc().
See https://sourceware.org/bugzilla/show_bug.cgi?id=27870
Note it doesn't suffice to unsetenv() this var within the program,
as the hooks have already been set up at that stage.
---
 ChangeLog                | 9 +++++++++
 tests/test-realloc-gnu.c | 4 +++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index b802233cf..30663cb38 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2021-05-15  Pádraig Brady  <P@draigBrady.com>
+
+	realloc-gnu: avoid glibc MALLOC_CHECK_ issue
+	* tests/test-realloc-gnu.c (main): if MALLOC_CHECK_ env var
+	is set then don't check ENOMEM is returned from realloc().
+	See https://sourceware.org/bugzilla/show_bug.cgi?id=27870
+	Note it doesn't suffice to unsetenv() this var within the program,
+	as the hooks have already been set up at that stage.
+
 2021-05-14  Paul Eggert  <eggert@cs.ucla.edu>
 
 	c-stack: work around Solaris 11 bugs
diff --git a/tests/test-realloc-gnu.c b/tests/test-realloc-gnu.c
index 3a787ed91..5534f2ea1 100644
--- a/tests/test-realloc-gnu.c
+++ b/tests/test-realloc-gnu.c
@@ -38,7 +38,9 @@ main (int argc, char **argv)
       size_t one = argc != 12345;
       p = realloc (p, PTRDIFF_MAX + one);
       ASSERT (p == NULL);
-      ASSERT (errno == ENOMEM);
+      /* Avoid https://sourceware.org/bugzilla/show_bug.cgi?id=27870  */
+      if (!getenv ("MALLOC_CHECK_"))
+        ASSERT (errno == ENOMEM);
     }
 
   free (p);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: *alloc-gnu on glibc
  2021-05-15 17:08     ` Pádraig Brady
@ 2021-05-15 18:07       ` Bruno Haible
  0 siblings, 0 replies; 6+ messages in thread
From: Bruno Haible @ 2021-05-15 18:07 UTC (permalink / raw)
  To: Pádraig Brady; +Cc: bug-gnulib

Pádraig Brady wrote:
> Ah this issue is restricted to glibc's MALLOC_CHECK_ implementation.

Ah, that explains why I got different results than you did.

> I've reported the glibc bug at:
> https://sourceware.org/bugzilla/show_bug.cgi?id=27870

Thanks!

> The attached avoids this part of the test if it's being used.

Applied, with a comment tweak.

Bruno



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-05-15 18:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-14 17:04 *alloc-gnu on Solaris 11 Bruno Haible
2021-05-15 14:04 ` *alloc-gnu on glibc Pádraig Brady
2021-05-15 15:36   ` Paul Eggert
2021-05-15 15:49   ` Bruno Haible
2021-05-15 17:08     ` Pádraig Brady
2021-05-15 18:07       ` Bruno Haible

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).