[PATCH] IBM z/OS + EBCDIC support

From: Daniel Richard G. @ 2015-09-22  2:28 UTC
  To: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 8000 bytes --]

Hello list,

The attached patch, against Git master, addresses numerous
incompatibilities in Gnulib with IBM z/OS (a mainframe operating system)
and the EBCDIC encoding.

With my changes, Gnulib builds successfully, and most of the tests
succeed. The remaining failures are as follows.

These appear to expose bugs in the system implementation and have been
reported to IBM (a few others have already received APAR fixes):

    FAIL: test-fdopendir
    FAIL: test-getopt
    FAIL: test-mbsrtowcs1.sh

A number of floating-point tests appear to be in the same boat. These
failure modes have yet to be evaluated:

    FAIL: test-fma2
    FAIL: test-fmaf2
    FAIL: test-fmodl-ieee
    FAIL: test-isinf
    FAIL: test-isnan
    FAIL: test-isnanl-nolibm
    FAIL: test-isnanl
    FAIL: test-ldexpf
    FAIL: test-remainderl-ieee
    FAIL: test-truncl-ieee

These require more investigation and/or discussion on this list:

    FAIL: test-perror.sh
    FAIL: test-poll
    FAIL: test-select-in.sh
    FAIL: test-select-out.sh
    FAIL: test-sigpipe.sh
    FAIL: test-symlink
    FAIL: test-symlinkat

One more issue for now: In order to build Gnulib on this system, it is
necessary to use a compiler wrapper script, due to the inexplicably
broken way xlc handles #include paths. I recently submitted some changes
to Gawk to work around this (look in the feature/zOS-try2 branch,
m4/arch.m4 file; search for "zos-cc"). It's possible that a similar
workaround will need to be bundled here.


In any event, below is a walk-through of my changes in the patch.
Comments and questions are welcome.


+++ lib/alloca.in.h

* z/OS has the alloca() definitions in stdlib.h.

+++ lib/c-ctype.c

* Implementing ctype functions that support EBCDIC from scratch is not
  feasible, not least because there isn't even one specific EBCDIC
  variant that should be targeted. So I just call through to the system
  routines, while ensuring that the compile-time environment is set
  correctly, and working around the system routines' input-range issues
  with signed chars.

* In EBCDIC, normal chars like 'A' occur in the upper half of the 8-bit
  range. This interferes with the idiom of using "switch (c)" and then
  "case 'A':" et al. because c can have two distinct values (-63 and
  193) that should match to 'A'.

  My fix, then, is a macro which converts the input codepoint to the
  range that will match literal chars, when necessary. (Obviously, in
  ASCII, it's a no-op.) Any takers on a better name for this macro than
  CHAR_LITERAL()?
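
As a rough illustration of the two-values problem (a standalone sketch,
not part of the patch): the two non-ASCII branches of CHAR_LITERAL() are
written below as plain functions so that both can be exercised on any
host. The function and enum names are hypothetical; 0xC1 is the EBCDIC
code for 'A', matching the 193 / -63 pair mentioned above.

```c
/* Sketch only: function forms of the two non-ASCII CHAR_LITERAL()
   branches.  All names here are hypothetical.  */

/* For builds where 'A' < 0 (plain char is signed):
   fold 128..255 down to -128..-1.  */
static int
to_signed_char_range (int c)
{
  return (c >= 128 && c < 256) ? c - 256 : c;
}

/* For builds where plain char is unsigned:
   fold -128..-1 up to 128..255.  */
static int
to_unsigned_char_range (int c)
{
  return (c >= -128 && c < 0) ? c + 256 : c;
}

/* EBCDIC 'A' is 0xC1: 193 as unsigned char, -63 as signed char.  */
enum { EBCDIC_A_UNSIGNED = 0xC1, EBCDIC_A_SIGNED = 0xC1 - 256 };
```

Either function maps both spellings of 'A' onto a single value, so a
"switch" over the result needs only one "case 'A':" label. Values
outside the folded range pass through unchanged.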

+++ lib/c-ctype.h

* Ensure that ASCII optimizations are applied only when building in
  ASCII.

+++ lib/fnmatch.c

* Fixed an error from __GNUC__ not being defined.

+++ lib/get-rusage-as.c

* Added z/OS awareness.

+++ lib/glob.c

* Avoid this #define on z/OS, because...

    $ grep alloca /usr/include/stdlib.h
            #ifndef alloca
              #define alloca(x) __alloca(x)
                #pragma linkage(__alloca,builtin)
                void *__alloca(unsigned int x);

+++ lib/glthread/thread.c

* Added z/OS awareness. pthread_t does not have a .p field on z/OS, but
  the gl_null_thread definition used for PTW32_VERSION otherwise seems
  to apply.

  For what it's worth, this is pthread_t, from /usr/include/sys/types.h:

          typedef struct {
                     char __[0x08];
          } pthread_t;
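
For illustration only: one way to read an identifier out of an opaque
8-byte struct like the one above is memcpy(), which sidesteps the
alignment and aliasing questions that come with casting the char array
to another pointer type. The type opaque_thread_t and the function
thread_id_bits() are hypothetical stand-ins, not z/OS or Gnulib names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the z/OS pthread_t layout quoted above.  */
typedef struct {
  char __[0x08];
} opaque_thread_t;

/* Copy the 8 opaque bytes into an integer for comparison or logging.
   The result is an identifier, not necessarily a valid pointer.  */
static uint64_t
thread_id_bits (opaque_thread_t t)
{
  uint64_t bits;
  memcpy (&bits, t.__, sizeof bits);
  return bits;
}
```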

+++ lib/glthread/thread.h

* Best guess at a gl_thread implementation for z/OS.

+++ lib/math.in.h

* The system defines these functions (ceil, floor, frexp, truncf, and
  trunc) as macros, and the compiler did not like seeing them redefined.

+++ lib/ptsname_r.c

* Likewise.

+++ lib/regex.h

* Ensure that "__string" does not expand to "1" when it is used as a
  formal parameter name.
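
A minimal sketch of the self-referential macro idiom involved, using
the hypothetical name "demo_string" in place of "__string" so that the
example stays clear of reserved identifiers:

```c
#include <stddef.h>
#include <string.h>

/* Simulate a system header that uses an object-like macro as its
   inclusion guard.  "demo_string" is a hypothetical stand-in.  */
#define demo_string 1

/* Without the next two lines, the parameter below would expand to
   "const char *1" and fail to compile.  A macro is not re-expanded
   inside its own replacement, so the name remains defined (for #ifdef
   checks) yet expands to itself.  */
#undef demo_string
#define demo_string demo_string

static size_t
demo_length (const char *demo_string)
{
  return strlen (demo_string);
}
```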

+++ lib/string.in.h

* Likewise.

+++ lib/strtod.c

* The system strtod() sets ERANGE for some reason when parsing "0x".

* It also returns a value of 0.0 for "nan()".
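
As a reference point, here is a small probe (a sketch; the struct and
function names are hypothetical) that records what the system strtod()
does with a given input. Under the C standard's parsing rules, "0x"
with no hexadecimal digits after it should consume only the "0" and
leave errno alone.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse S with the system strtod() and report the
   value, the number of characters consumed, and the resulting errno.  */
struct strtod_probe
{
  double value;
  long consumed;
  int err;
};

static struct strtod_probe
probe_strtod (const char *s)
{
  struct strtod_probe p;
  char *end;

  errno = 0;
  p.value = strtod (s, &end);
  p.err = errno;
  p.consumed = (long) (end - s);
  return p;
}
```

On a conforming implementation, probe_strtod ("0x") should report a
value of 0.0, consumed == 1, and err == 0; the z/OS behavior described
above differs in the err field.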

+++ m4/fclose.m4

* This system has a broken fclose(); without this bit, the test-fclose
  test fails:

    $ ./test-fclose
    /path/to/gltests/test-fclose.c:74: assertion 'lseek (fd, 0, SEEK_CUR) == 3' failed
    CEE5207E The signal SIGABRT was received.
    ABORT instruction

  However, the existing conditions didn't enable it, so I added a
  host-platform check.

+++ m4/strstr.m4

* The IBM runtime sucks; signal delivery is delayed until strstr()
  exits, so this test results in a hang that can only be SIGKILL'ed.

+++ m4/wchar_h.m4

* The linker on this system cares way too much about the object file's
  original name.

  Slightly longer explanation: In 64-bit builds, the toolchain uses the
  XPLINK object format (as opposed to GOFF for 31-bit builds). XPLINK
  has the notion of CSECTs, and these are named. By default, the main
  code CSECT is named after the source-file basename. If the linker
  encounters two CSECTs with the same name, it will consider them to be
  duplicates, and discard one---even if they contain completely
  orthogonal definitions.

  This can be worked around by specifying the CSECT names explicitly
  with -qcsect=foobaz (using different values of "foobaz" for the two
  files), but IMO it is easier just to compile the two source files for
  these tests from differently-named source files in the first place.

+++ tests/infinity.h

* xlc doesn't like constant div-by-zero expressions.
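
The usual workaround, sketched below under the assumption of IEEE
floating-point, is to divide by a variable so that the source contains
no constant division by zero for the compiler to diagnose. The existing
fallback in tests/infinity.h takes the same approach; the function name
here is hypothetical.

```c
#include <math.h>

/* Produce +Infinity at run time.  The divisor lives in a static
   variable, so the expression is not a constant division by zero.
   Assumes IEEE floating-point semantics.  */
static double
runtime_infinity (void)
{
  static double zero = 0.0;
  return 1.0 / zero;
}
```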

+++ tests/nan.h

* z/OS, in addition to supporting IEEE floating-point, also supports an
  older "hexadecimal" format that does not support NaN. Bomb out if this
  is in use.

+++ tests/test-c-ctype.c

* We need the same CHAR_LITERAL() hack here as in c-ctype.c.

+++ tests/test-c-strcasecmp.c

* In EBCDIC-1047, the tests

    ASSERT (c_strcasecmp ("turkish", "TURK\304\260SH") < 0);
    ASSERT (c_strcasecmp ("TURK\304\260SH", "turkish") > 0);

  are actually

    ASSERT (c_strcasecmp ("turkish", "TURKD¬SH") < 0);
    ASSERT (c_strcasecmp ("TURKD¬SH", "turkish") > 0);

  which, of course, fail.
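
To make the byte-level picture concrete (an illustrative sketch with
hypothetical helper names): the octal escapes \304\260 are the bytes
0xC4 0xB0, which form the UTF-8 encoding of U+0130 in an ASCII-based
build. In EBCDIC the same byte 0xC4 is simply the letter 'D', since
'A' through 'I' occupy 0xC1 through 0xC9.

```c
/* Decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx) to a code point.
   Hypothetical helper for illustration; no validation is performed.  */
static unsigned int
decode_utf8_2byte (unsigned char b0, unsigned char b1)
{
  return ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu);
}

/* Map an EBCDIC byte in 0xC1..0xC9 to the letter it encodes, using the
   host's 'A'..'I' (consecutive in both ASCII and EBCDIC).  Returns '?'
   for any other byte.  */
static char
ebcdic_upper_a_to_i (unsigned char b)
{
  return (b >= 0xC1 && b <= 0xC9) ? (char) ('A' + (b - 0xC1)) : '?';
}
```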

+++ tests/test-c-strncasecmp.c

* Likewise.

+++ tests/test-canonicalize-lgpl.c

* Addressed a strange z/OS corner case. This system has
  DOUBLE_SLASH_IS_DISTINCT_ROOT, yet the dev/ino numbers are the same.

+++ tests/test-iconv-utf.c

* When compiling in (normal) EBCDIC mode on z/OS, the compiler
  translates char and string literals to EBCDIC. (Numerical escapes like
  "\346" are not remapped.) This messes up the test, because the input
  strings are supposed to have their literal characters represented in
  ASCII. So I moved all the input strings to the top of the file, added
  an appropriate compiler #pragma to change the conversion behavior, and
  modified the tests to refer to these.

  (Note that a #define would not work for the input strings, because the
  text is converted at the point of use, not the point of definition.)
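
A different technique from the #pragma used in the patch, shown here
only as a sketch: because numeric escapes are not remapped, a string
spelled entirely with them has the same bytes in an ASCII and an EBCDIC
build. The identifiers below are hypothetical.

```c
#include <string.h>

/* "Japanese" spelled as ASCII byte values.  Octal escapes are numeric,
   so these bytes do not depend on the build's literal encoding.  */
static const char japanese_ascii_bytes[] =
  "\112\141\160\141\156\145\163\145";

/* Return 1 if this build encodes plain string literals as ASCII,
   0 otherwise (as would be expected in an EBCDIC build).  */
static int
literals_are_ascii (void)
{
  return memcmp ("Japanese", japanese_ascii_bytes,
                 sizeof japanese_ascii_bytes) == 0;
}
```

The cost is readability, which is presumably why converting the
literals in place is the less intrusive change for a test file.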

+++ tests/test-iconv.c

* The system iconv implementation does not recognize "ISO-8859-1", but
  it does recognize "ISO8859-1".

* Similar issue with converting input strings. (This leaves open the
  possibility that any ASSERT() failures will be reported in ISO 8859-1,
  not EBCDIC, resulting in gibberish on the user's terminal. I kept the
  changes to the minimum needed to get this test to pass, but I can go
  the whole nine yards if desired.)
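
One generic way to cope with differing charset spellings, sketched here
with hypothetical names (this is not what the patch does), is to try
each known spelling until iconv_open() accepts one:

```c
#include <iconv.h>
#include <stddef.h>

/* Try each spelling of a source charset name in turn and return the
   first conversion descriptor that iconv_open() accepts, or
   (iconv_t) -1 if none is accepted.  Hypothetical helper.  */
static iconv_t
open_first_accepted (const char *tocode, const char *const *fromcodes,
                     size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    {
      iconv_t cd = iconv_open (tocode, fromcodes[i]);
      if (cd != (iconv_t) -1)
        return cd;
    }
  return (iconv_t) -1;
}

/* The two spellings mentioned above.  */
static const char *const latin1_names[] = { "ISO-8859-1", "ISO8859-1" };
```

A caller would pass latin1_names with a count of 2 and must still check
the result against (iconv_t) -1 before use, then release it with
iconv_close().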

+++ tests/test-nonblocking-pipe.h

* Added z/OS awareness. (I tested this and found that exact
  boundary value; the test fails with 131072.)

+++ tests/test-nonblocking-reader.h

* Nonblocking read() returns EWOULDBLOCK on this system.

+++ tests/test-nonblocking-writer.h

* Nonblocking write() returns EWOULDBLOCK on this system.
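
POSIX allows EAGAIN and EWOULDBLOCK to be either the same value or two
distinct values, so portable callers generally test for both. A minimal
sketch, with a hypothetical function name:

```c
#include <errno.h>

/* Return nonzero if ERR means "the operation would block".  The two
   constants may or may not be equal, so check both.  */
static int
is_would_block (int err)
{
  return err == EAGAIN || err == EWOULDBLOCK;
}
```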

+++ tests/test-sigpipe.sh

* Fixed an apparent typo.

+++ tests/test-wcwidth.c

* Only run ASCII-specific tests in ASCII mode.


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.

[-- Attachment #2: gnulib-zos-v1.patch --]
[-- Type: text/x-patch; name="gnulib-zos-v1.patch", Size: 37481 bytes --]

diff --git a/lib/alloca.in.h b/lib/alloca.in.h
index d5664b6..6606984 100644
--- a/lib/alloca.in.h
+++ b/lib/alloca.in.h
@@ -51,6 +51,8 @@ extern "C"
 void *_alloca (unsigned short);
 #  pragma intrinsic (_alloca)
 #  define alloca _alloca
+# elif defined __MVS__
+#  include <stdlib.h>
 # else
 #  include <stddef.h>
 #  ifdef  __cplusplus
diff --git a/lib/c-ctype.c b/lib/c-ctype.c
index 6635d34..bbc543f 100644
--- a/lib/c-ctype.c
+++ b/lib/c-ctype.c
@@ -17,16 +17,54 @@ along with this program; if not, see <http://www.gnu.org/licenses/>.  */
 
 #include <config.h>
 
+/* On z/OS with EBCDIC, we punt and just use the system functions.
+   IBM created this mess; let them deal with it.
+
+   Note that if we are not building with -D_ALL_SOURCE, then isascii()
+   interprets its input as an ASCII codepoint, even in an EBCDIC build.
+
+   Also, the z/OS ctype functions do not handle negative-valued chars
+   at all (especially helpful when signed EBCDIC 'A' == -63), so we
+   adjust their arguments accordingly.  */
+#if defined __MVS__ && !C_CTYPE_ASCII
+# ifndef _ALL_SOURCE
+#  error "Please compile me with -D_ALL_SOURCE, or else isascii() will not work correctly with EBCDIC input."
+# endif
+# include <ctype.h>
+# define USE_SYSTEM_CTYPE
+# define SYSTEM_CTYPE_CHAR(C) ((C) & 0xff)
+#endif
+
 /* Specification.  */
 #define NO_C_CTYPE_MACROS
 #include "c-ctype.h"
 
+/* In EBCDIC, literal chars like 'A' may be represented by a signed
+   negative (-63) as well as unsigned positive (193) value. If we are
+   comparing a char integer value to a literal, then we want the
+   former to be on the same side of the "fence" as the latter.  */
+#if C_CTYPE_ASCII
+# define CHAR_LITERAL(C) (C)
+#elif 'A' < 0
+# define CHAR_LITERAL(C) ((C) >= 128 && (C) < 256 ? (C) - 256 : (C))
+#else
+# define CHAR_LITERAL(C) ((C) >= -128 && (C) < 0 ? (C) + 256 : (C))
+#endif
+
 /* The function isascii is not locale dependent. Its use in EBCDIC is
    questionable. */
 bool
 c_isascii (int c)
 {
+#if C_CTYPE_ASCII
   return (c >= 0x00 && c <= 0x7f);
+#elif defined USE_SYSTEM_CTYPE
+  /* On z/OS, the ctype functions return zero or non-zero,
+     not necessarily 0 or 1.  */
+  return isascii (SYSTEM_CTYPE_CHAR (c)) != 0;
+#else
+# error "No suitable implementation for c_isascii()"
+#endif
 }
 
 bool
@@ -42,8 +80,8 @@ c_isalnum (int c)
           || (c >= 'A' && c <= 'Z')
           || (c >= 'a' && c <= 'z'));
 #endif
-#else
-  switch (c)
+#else /* Non-consecutive alphanumerics */
+  switch (CHAR_LITERAL (c))
     {
     case '0': case '1': case '2': case '3': case '4': case '5':
     case '6': case '7': case '8': case '9':
@@ -74,7 +112,7 @@ c_isalpha (int c)
   return ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'));
 #endif
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
     case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -104,8 +142,10 @@ c_iscntrl (int c)
 {
 #if C_CTYPE_ASCII
   return ((c & ~0x1f) == 0 || c == 0x7f);
+#elif defined USE_SYSTEM_CTYPE
+  return iscntrl(SYSTEM_CTYPE_CHAR (c)) != 0;
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case ' ': case '!': case '"': case '#': case '$': case '%':
     case '&': case '\'': case '(': case ')': case '*': case '+':
@@ -137,9 +177,9 @@ bool
 c_isdigit (int c)
 {
 #if C_CTYPE_CONSECUTIVE_DIGITS
-  return (c >= '0' && c <= '9');
+  return (CHAR_LITERAL (c) >= '0' && CHAR_LITERAL (c) <= '9');
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case '0': case '1': case '2': case '3': case '4': case '5':
     case '6': case '7': case '8': case '9':
@@ -156,7 +196,7 @@ c_islower (int c)
 #if C_CTYPE_CONSECUTIVE_LOWERCASE
   return (c >= 'a' && c <= 'z');
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
     case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
@@ -175,8 +215,10 @@ c_isgraph (int c)
 {
 #if C_CTYPE_ASCII
   return (c >= '!' && c <= '~');
+#elif defined USE_SYSTEM_CTYPE
+  return isgraph(SYSTEM_CTYPE_CHAR (c)) != 0;
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case '!': case '"': case '#': case '$': case '%': case '&':
     case '\'': case '(': case ')': case '*': case '+': case ',':
@@ -209,8 +251,10 @@ c_isprint (int c)
 {
 #if C_CTYPE_ASCII
   return (c >= ' ' && c <= '~');
+#elif defined USE_SYSTEM_CTYPE
+  return isprint(SYSTEM_CTYPE_CHAR (c)) != 0;
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case ' ': case '!': case '"': case '#': case '$': case '%':
     case '&': case '\'': case '(': case ')': case '*': case '+':
@@ -245,8 +289,10 @@ c_ispunct (int c)
   return ((c >= '!' && c <= '~')
           && !((c >= '0' && c <= '9')
                || ((c & ~0x20) >= 'A' && (c & ~0x20) <= 'Z')));
+#elif defined USE_SYSTEM_CTYPE
+  return ispunct(SYSTEM_CTYPE_CHAR (c)) != 0;
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case '!': case '"': case '#': case '$': case '%': case '&':
     case '\'': case '(': case ')': case '*': case '+': case ',':
@@ -275,7 +321,7 @@ c_isupper (int c)
 #if C_CTYPE_CONSECUTIVE_UPPERCASE
   return (c >= 'A' && c <= 'Z');
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
     case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -303,7 +349,7 @@ c_isxdigit (int c)
           || (c >= 'a' && c <= 'f'));
 #endif
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case '0': case '1': case '2': case '3': case '4': case '5':
     case '6': case '7': case '8': case '9':
@@ -322,7 +368,7 @@ c_tolower (int c)
 #if C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
   return (c >= 'A' && c <= 'Z' ? c - 'A' + 'a' : c);
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case 'A': return 'a';
     case 'B': return 'b';
@@ -361,7 +407,7 @@ c_toupper (int c)
 #if C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
   return (c >= 'a' && c <= 'z' ? c - 'a' + 'A' : c);
 #else
-  switch (c)
+  switch (CHAR_LITERAL (c))
     {
     case 'a': return 'A';
     case 'b': return 'B';
diff --git a/lib/c-ctype.h b/lib/c-ctype.h
index d622973..d94f526 100644
--- a/lib/c-ctype.h
+++ b/lib/c-ctype.h
@@ -141,11 +141,13 @@ extern int c_toupper (int c) _GL_ATTRIBUTE_CONST;
 
 /* ASCII optimizations. */
 
+#ifdef C_CTYPE_ASCII
 #undef c_isascii
 #define c_isascii(c) \
   ({ int __c = (c); \
      (__c >= 0x00 && __c <= 0x7f); \
    })
+#endif
 
 #if C_CTYPE_CONSECUTIVE_DIGITS \
     && C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
diff --git a/lib/fnmatch.c b/lib/fnmatch.c
index a607672..58754fa 100644
--- a/lib/fnmatch.c
+++ b/lib/fnmatch.c
@@ -22,7 +22,7 @@
 # define _GNU_SOURCE    1
 #endif
 
-#if ! defined __builtin_expect && __GNUC__ < 3
+#if ! defined __builtin_expect && defined __GNUC__ && __GNUC__ < 3
 # define __builtin_expect(expr, expected) (expr)
 #endif
 
diff --git a/lib/get-rusage-as.c b/lib/get-rusage-as.c
index 2bad20a..4db1596 100644
--- a/lib/get-rusage-as.c
+++ b/lib/get-rusage-as.c
@@ -355,7 +355,7 @@ get_rusage_as_via_iterator (void)
 uintptr_t
 get_rusage_as (void)
 {
-#if (defined __APPLE__ && defined __MACH__) || defined _AIX || defined __CYGWIN__ /* Mac OS X, AIX, Cygwin */
+#if (defined __APPLE__ && defined __MACH__) || defined _AIX || defined __CYGWIN__ || defined __MVS__ /* Mac OS X, AIX, Cygwin, z/OS */
   /* get_rusage_as_via_setrlimit() does not work.
      Prefer get_rusage_as_via_iterator().  */
   return get_rusage_as_via_iterator ();
diff --git a/lib/glob.c b/lib/glob.c
index ed49a9d..9fd6482 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -144,7 +144,9 @@
 # define __stat64(fname, buf)   stat (fname, buf)
 # define __fxstatat64(_, d, f, st, flag) fstatat (d, f, st, flag)
 # define struct_stat64          struct stat
-# define __alloca               alloca
+# ifndef __MVS__
+#  define __alloca              alloca
+# endif
 # define __readdir              readdir
 # define __glob_pattern_p       glob_pattern_p
 #endif /* _LIBC */
diff --git a/lib/glthread/thread.c b/lib/glthread/thread.c
index d4e2921..28a2797 100644
--- a/lib/glthread/thread.c
+++ b/lib/glthread/thread.c
@@ -33,7 +33,7 @@
 
 #include <pthread.h>
 
-#ifdef PTW32_VERSION
+#if defined(PTW32_VERSION) || defined(__MVS__)
 
 const gl_thread_t gl_null_thread /* = { .p = NULL } */;
 
diff --git a/lib/glthread/thread.h b/lib/glthread/thread.h
index 2febe34..01ec45b 100644
--- a/lib/glthread/thread.h
+++ b/lib/glthread/thread.h
@@ -172,6 +172,15 @@ typedef pthread_t gl_thread_t;
 #  define gl_thread_self_pointer() \
      (pthread_in_use () ? pthread_self ().p : NULL)
 extern const gl_thread_t gl_null_thread;
+# elif defined(__MVS__)
+   /* On IBM z/OS, pthread_t is a struct with an 8-byte '__' field.
+      The first three bytes of this field appear to uniquely identify a
+      pthread_t, though not necessarily representing a pointer.  */
+#  define gl_thread_self() \
+     (pthread_in_use () ? pthread_self () : gl_null_thread)
+#  define gl_thread_self_pointer() \
+     (pthread_in_use () ? *((void **) pthread_self ().__) : NULL)
+extern const gl_thread_t gl_null_thread;
 # else
 #  define gl_thread_self() \
      (pthread_in_use () ? pthread_self () : (pthread_t) NULL)
diff --git a/lib/math.in.h b/lib/math.in.h
index 62a089a..59293fd 100644
--- a/lib/math.in.h
+++ b/lib/math.in.h
@@ -406,6 +406,7 @@ _GL_WARN_ON_USE (ceilf, "ceilf is unportable - "
 #if @GNULIB_CEIL@
 # if @REPLACE_CEIL@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef ceil
 #   define ceil rpl_ceil
 #  endif
 _GL_FUNCDECL_RPL (ceil, double, (double x));
@@ -753,6 +754,7 @@ _GL_WARN_ON_USE (floorf, "floorf is unportable - "
 #if @GNULIB_FLOOR@
 # if @REPLACE_FLOOR@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef floor
 #   define floor rpl_floor
 #  endif
 _GL_FUNCDECL_RPL (floor, double, (double x));
@@ -973,6 +975,7 @@ _GL_WARN_ON_USE (frexpf, "frexpf is unportable - "
 #if @GNULIB_FREXP@
 # if @REPLACE_FREXP@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef frexp
 #   define frexp rpl_frexp
 #  endif
 _GL_FUNCDECL_RPL (frexp, double, (double x, int *expptr) _GL_ARG_NONNULL ((2)));
@@ -1958,6 +1961,7 @@ _GL_WARN_ON_USE (tanhf, "tanhf is unportable - "
 #if @GNULIB_TRUNCF@
 # if @REPLACE_TRUNCF@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef truncf
 #   define truncf rpl_truncf
 #  endif
 _GL_FUNCDECL_RPL (truncf, float, (float x));
@@ -1980,6 +1984,7 @@ _GL_WARN_ON_USE (truncf, "truncf is unportable - "
 #if @GNULIB_TRUNC@
 # if @REPLACE_TRUNC@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef trunc
 #   define trunc rpl_trunc
 #  endif
 _GL_FUNCDECL_RPL (trunc, double, (double x));
diff --git a/lib/ptsname_r.c b/lib/ptsname_r.c
index faa33fb..809388a 100644
--- a/lib/ptsname_r.c
+++ b/lib/ptsname_r.c
@@ -34,6 +34,11 @@
 #  define _PATH_DEV "/dev/"
 # endif
 
+# undef __set_errno
+# undef __stat
+# undef __ttyname_r
+# undef __ptsname_r
+
 # define __set_errno(e) errno = (e)
 # define __isatty isatty
 # define __stat stat
diff --git a/lib/regex.h b/lib/regex.h
index 6f3bae3..64d7a43 100644
--- a/lib/regex.h
+++ b/lib/regex.h
@@ -23,6 +23,12 @@
 
 #include <sys/types.h>
 
+/* IBM z/OS uses -D__string=1 as an inclusion guard.  */
+#if defined(__MVS__) && defined(__string)
+# undef __string
+# define __string __string
+#endif
+
 /* Allow the use in C++ code.  */
 #ifdef __cplusplus
 extern "C" {
diff --git a/lib/string.in.h b/lib/string.in.h
index b3356bb..6359ea3 100644
--- a/lib/string.in.h
+++ b/lib/string.in.h
@@ -44,6 +44,12 @@
 #ifndef _@GUARD_PREFIX@_STRING_H
 #define _@GUARD_PREFIX@_STRING_H
 
+/* IBM z/OS uses -D__string=1 as an inclusion guard.  */
+#if defined(__MVS__) && defined(__string)
+# undef __string
+# define __string __string
+#endif
+
 /* NetBSD 5.0 mis-defines NULL.  */
 #include <stddef.h>
 
diff --git a/lib/strtod.c b/lib/strtod.c
index 9fd0170..09bc76a 100644
--- a/lib/strtod.c
+++ b/lib/strtod.c
@@ -239,7 +239,13 @@ strtod (const char *nptr, char **endptr)
       if (*s == '0' && c_tolower (s[1]) == 'x')
         {
           if (! c_isxdigit (s[2 + (s[2] == '.')]))
-            end = s + 1;
+            {
+              end = s + 1;
+
+              /* strtod() on z/OS is confused by "0x".  */
+              if (errno == ERANGE)
+                errno = 0;
+            }
           else if (end <= s + 2)
             {
               num = parse_number (s + 2, 16, 2, 4, 'p', &endbuf);
@@ -321,7 +327,7 @@ strtod (const char *nptr, char **endptr)
          better to use the underlying implementation's result, since a
          nice implementation populates the bits of the NaN according
          to interpreting n-char-sequence as a hexadecimal number.  */
-      if (s != end)
+      if (s != end || !isnand(num))
         num = NAN;
       errno = saved_errno;
     }
diff --git a/m4/fclose.m4 b/m4/fclose.m4
index 6bd1ad8..92ba457 100644
--- a/m4/fclose.m4
+++ b/m4/fclose.m4
@@ -17,4 +17,8 @@ AC_DEFUN([gl_FUNC_FCLOSE],
   if test $REPLACE_CLOSE = 1; then
     REPLACE_FCLOSE=1
   fi
+
+  case "$host" in
+    *-ibm-openedition) REPLACE_FCLOSE=1 ;;
+  esac
 ])
diff --git a/m4/strstr.m4 b/m4/strstr.m4
index 040c0b9..e623e28 100644
--- a/m4/strstr.m4
+++ b/m4/strstr.m4
@@ -79,6 +79,11 @@ static void quit (int sig) { exit (sig + 128); }
     char *needle = (char *) malloc (m + 2);
     /* Failure to compile this test due to missing alarm is okay,
        since all such platforms (mingw) also have quadratic strstr.  */
+#ifdef __MVS__
+    /* Except for z/OS, which does not deliver signals while strstr()
+       is running (thanks to restrictions on its LE runtime).  */
+    return 1;
+#endif
     signal (SIGALRM, quit);
     alarm (5);
     /* Check for quadratic performance.  */
diff --git a/m4/wchar_h.m4 b/m4/wchar_h.m4
index 9d1b0f8..35ece60 100644
--- a/m4/wchar_h.m4
+++ b/m4/wchar_h.m4
@@ -81,8 +81,14 @@ AC_DEFUN([gl_WCHAR_H_INLINE_OK],
 extern int zero (void);
 int main () { return zero(); }
 ]])])
+     dnl Do not rename the object file from conftest.$ac_objext to
+     dnl conftest1.$ac_objext, as this will cause the link to fail on
+     dnl z/OS when using the XPLINK object format (due to duplicate
+     dnl CSECT names). Instead, we temporarily redefine $ac_compile so
+     dnl that the object file has the latter name from the start.
+     save_ac_compile="$ac_compile"
+     ac_compile=`echo "$save_ac_compile" | sed s/conftest/conftest1/`
      if AC_TRY_EVAL([ac_compile]); then
-       mv conftest.$ac_objext conftest1.$ac_objext
        AC_LANG_CONFTEST([
          AC_LANG_SOURCE([[#define wcstod renamed_wcstod
 /* Tru64 with Desktop Toolkit C has a bug: <stdio.h> must be included before
@@ -95,8 +101,9 @@ int main () { return zero(); }
 #include <wchar.h>
 int zero (void) { return 0; }
 ]])])
+       dnl See note above about renaming object files.
+       ac_compile=`echo "$save_ac_compile" | sed s/conftest/conftest2/`
        if AC_TRY_EVAL([ac_compile]); then
-         mv conftest.$ac_objext conftest2.$ac_objext
          if $CC -o conftest$ac_exeext $CFLAGS $LDFLAGS conftest1.$ac_objext conftest2.$ac_objext $LIBS >&AS_MESSAGE_LOG_FD 2>&1; then
            :
          else
@@ -104,6 +111,7 @@ int zero (void) { return 0; }
          fi
        fi
      fi
+     ac_compile="$save_ac_compile"
      rm -f conftest1.$ac_objext conftest2.$ac_objext conftest$ac_exeext
     ])
   if test $gl_cv_header_wchar_h_correct_inline = no; then
diff --git a/tests/infinity.h b/tests/infinity.h
index 45c30bd..4e8a755 100644
--- a/tests/infinity.h
+++ b/tests/infinity.h
@@ -17,8 +17,9 @@
 
 /* Infinityf () returns a 'float' +Infinity.  */
 
-/* The Microsoft MSVC 9 compiler chokes on the expression 1.0f / 0.0f.  */
-#if defined _MSC_VER
+/* The Microsoft MSVC 9 compiler chokes on the expression 1.0f / 0.0f.
+   The IBM XL C compiler on z/OS complains.  */
+#if defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static float
 Infinityf ()
 {
@@ -32,8 +33,9 @@ Infinityf ()
 
 /* Infinityd () returns a 'double' +Infinity.  */
 
-/* The Microsoft MSVC 9 compiler chokes on the expression 1.0 / 0.0.  */
-#if defined _MSC_VER
+/* The Microsoft MSVC 9 compiler chokes on the expression 1.0 / 0.0.
+   The IBM XL C compiler on z/OS complains.  */
+#if defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static double
 Infinityd ()
 {
@@ -47,9 +49,10 @@ Infinityd ()
 
 /* Infinityl () returns a 'long double' +Infinity.  */
 
-/* The Microsoft MSVC 9 compiler chokes on the expression 1.0L / 0.0L.  */
-#if defined _MSC_VER
-static double
+/* The Microsoft MSVC 9 compiler chokes on the expression 1.0L / 0.0L.
+   The IBM XL C compiler on z/OS complains.  */
+#if defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
+static long double
 Infinityl ()
 {
   static long double zero = 0.0L;
diff --git a/tests/nan.h b/tests/nan.h
index 9f6819c..10b393e 100644
--- a/tests/nan.h
+++ b/tests/nan.h
@@ -15,11 +15,18 @@
    along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
 
+/* IBM z/OS supports both hexadecimal and IEEE floating-point formats. The
+   former does not support NaN and its isnan() implementation returns zero
+   for all values.  */
+#if defined __MVS__ && defined __IBMC__ && !defined __BFP__
+# error "NaN is not supported with IBM's hexadecimal floating-point format; please re-compile with -qfloat=ieee"
+#endif
+
 /* NaNf () returns a 'float' not-a-number.  */
 
 /* The Compaq (ex-DEC) C 6.4 compiler and the Microsoft MSVC 9 compiler choke
-   on the expression 0.0 / 0.0.  */
-#if defined __DECC || defined _MSC_VER
+   on the expression 0.0 / 0.0.  The IBM XL C compiler on z/OS complains.  */
+#if defined __DECC || defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static float
 NaNf ()
 {
@@ -34,8 +41,8 @@ NaNf ()
 /* NaNd () returns a 'double' not-a-number.  */
 
 /* The Compaq (ex-DEC) C 6.4 compiler and the Microsoft MSVC 9 compiler choke
-   on the expression 0.0 / 0.0.  */
-#if defined __DECC || defined _MSC_VER
+   on the expression 0.0 / 0.0.  The IBM XL C compiler on z/OS complains.  */
+#if defined __DECC || defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static double
 NaNd ()
 {
@@ -51,14 +58,15 @@ NaNd ()
 
 /* On Irix 6.5, gcc 3.4.3 can't compute compile-time NaN, and needs the
    runtime type conversion.
-   The Microsoft MSVC 9 compiler chokes on the expression 0.0L / 0.0L.  */
+   The Microsoft MSVC 9 compiler chokes on the expression 0.0L / 0.0L.
+   The IBM XL C compiler on z/OS complains.  */
 #ifdef __sgi
 static long double NaNl ()
 {
   double zero = 0.0;
   return zero / zero;
 }
-#elif defined _MSC_VER
+#elif defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static long double
 NaNl ()
 {
diff --git a/tests/test-c-ctype.c b/tests/test-c-ctype.c
index 81fe936..f7a2e39 100644
--- a/tests/test-c-ctype.c
+++ b/tests/test-c-ctype.c
@@ -24,6 +24,12 @@
 
 #include "macros.h"
 
+#if 'A' < 0
+# define CHAR_LITERAL(C) ((C) >= 128 && (C) < 256 ? (C) - 256 : (C))
+#else
+# define CHAR_LITERAL(C) ((C) >= -128 && (C) < 0 ? (C) + 256 : (C))
+#endif
+
 static void
 test_all (void)
 {
@@ -31,9 +37,11 @@ test_all (void)
 
   for (c = -0x80; c < 0x100; c++)
     {
+#ifdef C_CTYPE_ASCII
       ASSERT (c_isascii (c) == (c >= 0 && c < 0x80));
+#endif
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
         case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -54,7 +62,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
         case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -73,7 +81,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case '\t': case ' ':
           ASSERT (c_isblank (c) == 1);
@@ -83,9 +91,13 @@ test_all (void)
           break;
         }
 
+#ifdef C_CTYPE_ASCII
       ASSERT (c_iscntrl (c) == ((c >= 0 && c < 0x20) || c == 0x7f));
+#else
+      ASSERT (!! c_iscntrl (c) == !! iscntrl (c & 0xff));
+#endif
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case '0': case '1': case '2': case '3': case '4': case '5':
         case '6': case '7': case '8': case '9':
@@ -96,7 +108,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
         case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
@@ -110,13 +122,27 @@ test_all (void)
           break;
         }
 
+#ifdef C_CTYPE_ASCII
       ASSERT (c_isgraph (c) == ((c >= 0x20 && c < 0x7f) && c != ' '));
+#else
+      ASSERT (!! c_isgraph (c) == !! isgraph (c & 0xff));
+#endif
 
+#ifdef C_CTYPE_ASCII
       ASSERT (c_isprint (c) == (c >= 0x20 && c < 0x7f));
+#else
+      ASSERT (!! c_isprint (c) == !! isprint (c & 0xff));
+#endif
 
+#ifdef C_CTYPE_ASCII
       ASSERT (c_ispunct (c) == (c_isgraph (c) && !c_isalnum (c)));
+#else
+      /* EBCDIC contains characters like accented letters, which fail
+         the above test.  */
+      ASSERT (!! c_ispunct (c) == !! ispunct (c & 0xff));
+#endif
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case ' ': case '\t': case '\n': case '\v': case '\f': case '\r':
           ASSERT (c_isspace (c) == 1);
@@ -126,7 +152,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
         case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -140,7 +166,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case '0': case '1': case '2': case '3': case '4': case '5':
         case '6': case '7': case '8': case '9':
@@ -153,7 +179,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (CHAR_LITERAL (c))
         {
         case 'A':
           ASSERT (c_tolower (c) == 'a');
diff --git a/tests/test-c-strcasecmp.c b/tests/test-c-strcasecmp.c
index f7f6b43..47feac8 100644
--- a/tests/test-c-strcasecmp.c
+++ b/tests/test-c-strcasecmp.c
@@ -19,6 +19,7 @@
 #include <config.h>
 
 #include "c-strcase.h"
+#include "c-ctype.h"
 
 #include <locale.h>
 #include <string.h>
@@ -57,9 +58,11 @@ main (int argc, char *argv[])
   ASSERT (c_strcasecmp ("\303\266zg\303\274r", "\303\226ZG\303\234R") > 0); /* özgür */
   ASSERT (c_strcasecmp ("\303\226ZG\303\234R", "\303\266zg\303\274r") < 0); /* özgür */
 
+#if C_CTYPE_ASCII
   /* This test shows how strings of different size cannot compare equal.  */
   ASSERT (c_strcasecmp ("turkish", "TURK\304\260SH") < 0);
   ASSERT (c_strcasecmp ("TURK\304\260SH", "turkish") > 0);
+#endif
 
   return 0;
 }
diff --git a/tests/test-c-strncasecmp.c b/tests/test-c-strncasecmp.c
index 4027b5b..20c64e3 100644
--- a/tests/test-c-strncasecmp.c
+++ b/tests/test-c-strncasecmp.c
@@ -19,6 +19,7 @@
 #include <config.h>
 
 #include "c-strcase.h"
+#include "c-ctype.h"
 
 #include <locale.h>
 #include <string.h>
@@ -71,9 +72,11 @@ main (int argc, char *argv[])
   ASSERT (c_strncasecmp ("\303\266zg\303\274r", "\303\226ZG\303\234R", 99) > 0); /* özgür */
   ASSERT (c_strncasecmp ("\303\226ZG\303\234R", "\303\266zg\303\274r", 99) < 0); /* özgür */
 
+#if C_CTYPE_ASCII
   /* This test shows how strings of different size cannot compare equal.  */
   ASSERT (c_strncasecmp ("turkish", "TURK\304\260SH", 7) < 0);
   ASSERT (c_strncasecmp ("TURK\304\260SH", "turkish", 7) > 0);
+#endif
 
   return 0;
 }
diff --git a/tests/test-canonicalize-lgpl.c b/tests/test-canonicalize-lgpl.c
index 12d2bb0..49c0221 100644
--- a/tests/test-canonicalize-lgpl.c
+++ b/tests/test-canonicalize-lgpl.c
@@ -191,12 +191,16 @@ main (void)
     ASSERT (result2);
     ASSERT (stat ("/", &st1) == 0);
     ASSERT (stat ("//", &st2) == 0);
+    /* On IBM z/OS, "/" and "//" are distinct, yet they both have
+       st_dev == st_ino == 1.  */
+#ifndef __MVS__
     if (SAME_INODE (st1, st2))
       {
         ASSERT (strcmp (result1, "/") == 0);
         ASSERT (strcmp (result2, "/") == 0);
       }
     else
+#endif
       {
         ASSERT (strcmp (result1, "//") == 0);
         ASSERT (strcmp (result2, "//") == 0);
diff --git a/tests/test-iconv-utf.c b/tests/test-iconv-utf.c
index c1589f6..547e859 100644
--- a/tests/test-iconv-utf.c
+++ b/tests/test-iconv-utf.c
@@ -27,20 +27,39 @@
 
 #include "macros.h"
 
+/* If we're compiling on an EBCDIC-based system, we need the test strings
+   to remain in ASCII.  */
+#if 'A' != 0x41 && defined(__IBMC__)
+# pragma convert("ISO8859-1")
+# define CONVERT_ENABLED
+#endif
+
+/* The text is "Japanese (日本語) [\U0001D50D\U0001D51E\U0001D52D]".  */
+
+const char test_utf8_string[] = "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+
+const char test_utf16be_string[] = "\000J\000a\000p\000a\000n\000e\000s\000e\000 \000(\145\345\147\054\212\236\000)\000 \000[\330\065\335\015\330\065\335\036\330\065\335\055\000]";
+
+const char test_utf16le_string[] = "J\000a\000p\000a\000n\000e\000s\000e\000 \000(\000\345\145\054\147\236\212)\000 \000[\000\065\330\015\335\065\330\036\335\065\330\055\335]\000";
+
+const char test_utf32be_string[] = "\000\000\000J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\145\345\000\000\147\054\000\000\212\236\000\000\000)\000\000\000 \000\000\000[\000\001\325\015\000\001\325\036\000\001\325\055\000\000\000]";
+
+const char test_utf32le_string[] = "J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\000\345\145\000\000\054\147\000\000\236\212\000\000)\000\000\000 \000\000\000[\000\000\000\015\325\001\000\036\325\001\000\055\325\001\000]\000\000\000";
+
+#ifdef CONVERT_ENABLED
+# pragma convert(pop)
+#endif
+
 int
 main ()
 {
 #if HAVE_ICONV
   /* Assume that iconv() supports at least the encoding UTF-8.  */
 
-  /* The text is "Japanese (日本語) [\U0001D50D\U0001D51E\U0001D52D]".  */
-
   /* Test conversion from UTF-8 to UTF-16BE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "\000J\000a\000p\000a\000n\000e\000s\000e\000 \000(\145\345\147\054\212\236\000)\000 \000[\330\065\335\015\330\065\335\036\330\065\335\055\000]";
+#define input    test_utf8_string
+#define expected test_utf16be_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -64,14 +83,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-8 to UTF-16LE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "J\000a\000p\000a\000n\000e\000s\000e\000 \000(\000\345\145\054\147\236\212)\000 \000[\000\065\330\015\335\065\330\036\335\065\330\055\335]\000";
+#define input    test_utf8_string
+#define expected test_utf16le_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -95,14 +115,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-8 to UTF-32BE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "\000\000\000J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\145\345\000\000\147\054\000\000\212\236\000\000\000)\000\000\000 \000\000\000[\000\001\325\015\000\001\325\036\000\001\325\055\000\000\000]";
+#define input    test_utf8_string
+#define expected test_utf32be_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -126,14 +147,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-8 to UTF-32LE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\000\345\145\000\000\054\147\000\000\236\212\000\000)\000\000\000 \000\000\000[\000\000\000\015\325\001\000\036\325\001\000\055\325\001\000]\000\000\000";
+#define input    test_utf8_string
+#define expected test_utf32le_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -157,14 +179,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-16BE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "\000J\000a\000p\000a\000n\000e\000s\000e\000 \000(\145\345\147\054\212\236\000)\000 \000[\330\065\335\015\330\065\335\036\330\065\335\055\000]";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf16be_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -188,14 +211,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-16LE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "J\000a\000p\000a\000n\000e\000s\000e\000 \000(\000\345\145\054\147\236\212)\000 \000[\000\065\330\015\335\065\330\036\335\065\330\055\335]\000";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf16le_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -219,14 +243,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-32BE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "\000\000\000J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\145\345\000\000\147\054\000\000\212\236\000\000\000)\000\000\000 \000\000\000[\000\001\325\015\000\001\325\036\000\001\325\055\000\000\000]";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf32be_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -250,14 +275,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-32LE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\000\345\145\000\000\054\147\000\000\236\212\000\000)\000\000\000 \000\000\000[\000\000\000\015\325\001\000\036\325\001\000\055\325\001\000]\000\000\000";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf32le_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -281,6 +307,9 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 #endif
 
diff --git a/tests/test-iconv.c b/tests/test-iconv.c
index ed715bd..a64c6dd 100644
--- a/tests/test-iconv.c
+++ b/tests/test-iconv.c
@@ -44,8 +44,14 @@ main ()
 #if HAVE_ICONV
   /* Assume that iconv() supports at least the encodings ASCII, ISO-8859-1,
      and UTF-8.  */
-  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO-8859-1");
-  iconv_t cd_utf8_to_88591 = iconv_open ("ISO-8859-1", "UTF-8");
+  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO8859-1");
+  iconv_t cd_utf8_to_88591 = iconv_open ("ISO8859-1", "UTF-8");
+
+#if defined __MVS__ && defined __IBMC__
+  /* String literals below are in ASCII, not EBCDIC.  */
+# pragma convert("ISO8859-1")
+# define CONVERT_ENABLED
+#endif
 
   ASSERT (cd_88591_to_utf8 != (iconv_t)(-1));
   ASSERT (cd_utf8_to_88591 != (iconv_t)(-1));
@@ -142,7 +148,12 @@ main ()
 
   iconv_close (cd_88591_to_utf8);
   iconv_close (cd_utf8_to_88591);
+
+#ifdef CONVERT_ENABLED
+# pragma convert(pop)
 #endif
 
+#endif /* HAVE_ICONV */
+
   return 0;
 }
diff --git a/tests/test-nonblocking-pipe.h b/tests/test-nonblocking-pipe.h
index 5b3646e..01c992c 100644
--- a/tests/test-nonblocking-pipe.h
+++ b/tests/test-nonblocking-pipe.h
@@ -31,10 +31,11 @@
      OSF/1                           >= 262145
      Solaris <= 7                    >= 10241
      Solaris >= 8                    >= 20481
+     z/OS                            >= 131073
      Cygwin                          >= 65537
      native Windows                  >= 4097 (depends on the _pipe argument)
  */
-#if defined __osf__ || (defined __linux__ && (defined __ia64__ || defined __mips__))
+#if defined __MVS__ || defined __osf__ || (defined __linux__ && (defined __ia64__ || defined __mips__))
 # define PIPE_DATA_BLOCK_SIZE 270000
 #elif defined __linux__ && defined __sparc__
 # define PIPE_DATA_BLOCK_SIZE 140000
diff --git a/tests/test-nonblocking-reader.h b/tests/test-nonblocking-reader.h
index 8cba131..d8eaa32 100644
--- a/tests/test-nonblocking-reader.h
+++ b/tests/test-nonblocking-reader.h
@@ -110,7 +110,7 @@ full_read_from_nonblocking_fd (size_t fd, void *buf, size_t count)
       ASSERT (spent_time < 0.5);
       if (ret < 0)
         {
-          ASSERT (saved_errno == EAGAIN);
+          ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
           usleep (SMALL_DELAY);
         }
       else
diff --git a/tests/test-nonblocking-writer.h b/tests/test-nonblocking-writer.h
index 0ecf996..ff148dc 100644
--- a/tests/test-nonblocking-writer.h
+++ b/tests/test-nonblocking-writer.h
@@ -124,7 +124,7 @@ main_writer_loop (int test, size_t data_block_size, int fd,
                         (long) ret, dbgstrerror (ret < 0, saved_errno));
             if (ret < 0 && bytes_written >= data_block_size)
               {
-                ASSERT (saved_errno == EAGAIN);
+                ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
                 ASSERT (spent_time < 0.5);
                 break;
               }
@@ -133,7 +133,7 @@ main_writer_loop (int test, size_t data_block_size, int fd,
             ASSERT (spent_time < 0.5);
             if (ret < 0)
               {
-                ASSERT (saved_errno == EAGAIN);
+                ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
                 usleep (SMALL_DELAY);
               }
             else
@@ -165,7 +165,7 @@ main_writer_loop (int test, size_t data_block_size, int fd,
             ASSERT (spent_time < 0.5);
             if (ret < 0)
               {
-                ASSERT (saved_errno == EAGAIN);
+                ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
                 usleep (SMALL_DELAY);
               }
             else
diff --git a/tests/test-sigpipe.sh b/tests/test-sigpipe.sh
index bc2baf2..6cf3242 100755
--- a/tests/test-sigpipe.sh
+++ b/tests/test-sigpipe.sh
@@ -21,7 +21,7 @@ fi
 
 # Test signal's behaviour when a handler is installed.
 tmpfiles="$tmpfiles t-sigpipeC.tmp"
-./test-sigpipe${EXEEXT} B 2> t-sigpipeC.tmp | head -1 > /dev/null
+./test-sigpipe${EXEEXT} C 2> t-sigpipeC.tmp | head -1 > /dev/null
 if test -s t-sigpipeC.tmp; then
   LC_ALL=C tr -d '\r' < t-sigpipeC.tmp
   rm -fr $tmpfiles; exit 1
diff --git a/tests/test-wcwidth.c b/tests/test-wcwidth.c
index 9fad785..fdbecc3 100644
--- a/tests/test-wcwidth.c
+++ b/tests/test-wcwidth.c
@@ -26,6 +26,7 @@ SIGNATURE_CHECK (wcwidth, int, (wchar_t));
 #include <locale.h>
 #include <string.h>
 
+#include "c-ctype.h"
 #include "localcharset.h"
 #include "macros.h"
 
@@ -34,9 +35,11 @@ main ()
 {
   wchar_t wc;
 
+#ifdef C_CTYPE_ASCII
   /* Test width of ASCII characters.  */
   for (wc = 0x20; wc < 0x7F; wc++)
     ASSERT (wcwidth (wc) == 1);
+#endif
 
   /* Switch to an UTF-8 locale.  */
   if (setlocale (LC_ALL, "fr_FR.UTF-8") != NULL


* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22  2:28 [PATCH] IBM z/OS + EBCDIC support Daniel Richard G.
@ 2015-09-22 15:23 ` Eric Blake
  2015-09-22 19:27   ` Daniel Richard G.
  2015-09-22 19:32 ` Paul Eggert
  2015-09-22 19:50 ` [PATCH] IBM z/OS + EBCDIC support Paul Eggert
  2 siblings, 1 reply; 49+ messages in thread
From: Eric Blake @ 2015-09-22 15:23 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 3950 bytes --]

On 09/21/2015 08:28 PM, Daniel Richard G. wrote:
> Hello list,
> 
> The attached patch, against Git master, addresses numerous
> incompatibilities in Gnulib with IBM z/OS (a mainframe operating system)
> and the EBCDIC encoding.
> 
> With my changes, Gnulib builds successfully, and most of the tests
> succeed. The remaining failures are as follows.

Thanks for the work. Can you please split the patch into a series of
multiple pieces, one patch per issue, so that we can apply the
obviously-correct ones while still discussing the other pieces, rather
than holding the entire large patch hostage to review?

Also, while I see you have copyright assignment on file for Gawk, I
don't see it for gnulib. You'll want to repeat the assignment process
for gnulib before we can take more than the most trivial patches.

Some quick comments, without having reviewed any code:

> 
> * In EBCDIC, normal chars like 'A' occur in the upper half of the 8-bit
>   range. This interferes with the idiom of using "switch (c)" and then
>   "case 'A':" et al. because c can have two distinct values (-63 and
>   193) that should match to 'A'.
> 
>   My fix, then, is a macro which converts the input codepoint to the
>   range that will match literal chars, when necessary. (Obviously, in
>   ASCII, it's a no-op.) Any takers on a better name for this macro than
>   CHAR_LITERAL()?

coreutils uses to_uchar() to force the conversion of a byte to an
unsigned character, useful for cases where sign extension of a byte is
not desired.  Sounds like it does the same thing as what you are doing here.

> +++ lib/math.in.h
> 
> * The system defines these functions as macros, and the compiler did not
>   like seeing them redefined.

No underlying functions with linkage? POSIX generally requires that, so
you may want to submit a bug, but it's certainly not the first time
we've worked around that.

> 
> +++ lib/regex.h
> 
> * Ensure that "__string" does not expand to "1" when it is used as a
>   formal parameter name.

Sounds like we shouldn't be naming our formal parameter __string, since
that's a name reserved to the internal implementation namespace.

> 
> +++ m4/strstr.m4
> 
> * The IBM runtime sucks; signal delivery is delayed until strstr()
>   exits, so this test results in a hang that can only be SIGKILL'ed.

Not a hang, just a reallllllly long execution time; and all because the
libc implementation is O(n^2) instead of O(n).  But they really block
signals during the call?  Ouch.

> +++ tests/nan.h
> 
> * z/OS, in addition to supporting IEEE floating-point, also supports an
>   older "hexadecimal" format that does not support NaN. Bomb out if this
>   is in use.

C, and POSIX, allow for platforms without NaN (in part because of cases
like the z/OS non-IEEE mode).  I'm not surprised if we have baked in
assumptions that don't hold when IEEE is not around.

> +++ tests/test-c-strcasecmp.c
> 
> * In EBCDIC-1047, the tests
> 
>     ASSERT (c_strcasecmp ("turkish", "TURK\304\260SH") < 0);
>     ASSERT (c_strcasecmp ("TURK\304\260SH", "turkish") > 0);
> 
>   are actually
> 
>     ASSERT (c_strcasecmp ("turkish", "TURKD¬SH") < 0);
>     ASSERT (c_strcasecmp ("TURKD¬SH", "turkish") > 0);
> 
>   which, of course, fail.

Basically, EBCDIC lacks the Turkish i, and since it is not a UTF-8
locale, we should probably be skipping the test in that environment.

> +++ tests/test-canonicalize-lgpl.c
> 
> * Addressed a strange z/OS corner case. This system has
>   DOUBLE_SLASH_IS_DISTINCT_ROOT, yet the dev/ino numbers are the same.

What? Does that mean 'ls -a /' and 'ls -a //' see different contents?
If they do, then sharing dev/ino is a bug; if they are identical, then
DOUBLE_SLASH_IS_DISTINCT_ROOT is defined incorrectly.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 15:23 ` Eric Blake
@ 2015-09-22 19:27   ` Daniel Richard G.
  2015-09-22 20:00     ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-22 19:27 UTC (permalink / raw)
  To: Eric Blake, bug-gnulib

Hi Eric,

On Tue, 2015 Sep 22 09:23-0600, Eric Blake wrote:
>
> Thanks for the work. Can you please split the patch into a series of
> multiple pieces, one patch per issue, so that we can apply the obviously-
> correct ones while still discussing the other pieces, rather than
> holding the entire large patch hostage to review?

Wouldn't it be easier to apply everything to a feature branch, and
integrate it bit by bit? I can split up the patch, but my idea of a
sensible partitioning might not agree with yours...

> Also, while I see you have copyright assignment on file for Gawk, I
> don't see it for gnulib. You'll want to repeat the assignment process
> for gnulib before we can take more than the most trivial patches.

Okay, I've sent in the request form.

> coreutils uses to_uchar() to force the conversion of a byte to an
> unsigned character, useful for cases where sign extension of a byte
> is not desired.  Sounds like it does the same thing as what you are
> doing here.

Kind of, except that character literals may be signed, and thus
potentially have a negative value. to_uchar() could be applied to both
sides, but then that wouldn't work in a "switch" block.

What my macro does, then, is "convert to the value that a matching char
literal would have, if it's not there already."
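
The idea of "convert to the value a matching char literal would have" can be sketched roughly as follows. This is only an illustration under stated assumptions: literal_value() and is_high_bit_literal() are hypothetical names, not the CHAR_LITERAL macro from the patch, and the sketch assumes 8-bit chars. It uses '\351' as the test literal because, like 'A' in EBCDIC, it has its high bit set, so the same two-values problem shows up on an ASCII machine.

```c
#include <limits.h>

/* Hypothetical helper for illustration; this is not the CHAR_LITERAL
   macro from the patch.  Map an int that was promoted from either a
   signed or an unsigned char onto the value that a plain character
   literal has on this platform.  Caveat: negative inputs such as EOF
   are treated as promoted signed chars, so EOF is not preserved.  */
static int
literal_value (int c)
{
  enum { nchars = UCHAR_MAX + 1 };

  /* Plain char is signed: fold 128..255 down to -128..-1.  */
  if (CHAR_MIN < 0 && CHAR_MAX < c && c < nchars)
    return c - nchars;

  /* Plain char is unsigned: fold -128..-1 up to 128..255.  */
  if (CHAR_MIN == 0 && -(nchars / 2) <= c && c < 0)
    return c + nchars;

  return c;
}

/* '\351' has its high bit set, just as 'A' (0xC1) does in EBCDIC, so
   both 233 and -23 should reach the same case label.  */
static int
is_high_bit_literal (int c)
{
  switch (literal_value (c))
    {
    case '\351':
      return 1;
    default:
      return 0;
    }
}
```

Because the normalization is applied only to the switch operand, the case labels can stay as ordinary character literals, which is what applying to_uchar() to both sides would not allow.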

> > +++ lib/math.in.h
> >
> > * The system defines these functions as macros, and the compiler did
> >   not like seeing them redefined.
>
> No underlying functions with linkage? POSIX generally requires that,
> so you may want to submit a bug, but it's certainly not the first time
> we've worked around that.

It wasn't that the functions had no linkage (though that may or may not
be the case), just that the compiler borked on the macro redefinition.

> > +++ lib/regex.h
> >
> > * Ensure that "__string" does not expand to "1" when it is used as a
> >   formal parameter name.
>
> Sounds like we shouldn't be naming our formal parameter __string,
> since that's a name reserved to the internal implementation namespace.

That would be a better fix, yes. But doesn't this file come from glibc?
Is it feasible to make such a change happen there?

(The Gawk maintainer was reluctant to make local changes to that
project's regex files.)

> > +++ m4/strstr.m4
> >
> > * The IBM runtime sucks; signal delivery is delayed until strstr()
> >   exits, so this test results in a hang that can only be SIGKILL'ed.
>
> Not a hang, just a reallllllly long execution time; and all because
> the libc implementation is O(n^2) instead of O(n).  But they really
> block signals during the call?  Ouch.

Yes, that was one of many forehead-slapping moments in this work :>

And the silly thing is, if you provide your own implementation of
strstr(), it won't have this problem, because it won't be in the
system runtime where signal delivery is verboten!

> > +++ tests/nan.h
> >
> > * z/OS, in addition to supporting IEEE floating-point, also supports
> >   an older "hexadecimal" format that does not support NaN. Bomb out
> >   if this is in use.
>
> C, and POSIX, allow for platforms without NaN (in part because of
> cases like the z/OS non-IEEE mode).  I'm not surprised if we have
> baked in assumptions that don't hold when IEEE is not around.

If you'd like to disable the NaN stuff cleanly when IEEE is not in use,
I'd be happy to help make that happen. But for my purposes, I want the
same floating-point gestalt as all other platforms of interest have, so
I punted on the hex floats.
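
For what it's worth, one portable way to guard NaN-dependent checks is the NAN macro from <math.h>, which C99 defines only when the implementation supports quiet NaNs for float. The sketch below is an assumption-laden illustration, not what tests/nan.h does: the helper names are hypothetical, and nothing in this thread establishes whether the z/OS headers leave NAN undefined in hexadecimal floating-point mode.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: report whether <math.h> advertises quiet NaNs.  */
static bool
platform_has_quiet_nan (void)
{
#ifdef NAN
  return true;
#else
  return false;
#endif
}

/* Hypothetical helper: return true if the NaN self-comparison check
   passes, or if it had to be skipped for lack of NaN support.  */
static bool
nan_check_passes_or_skipped (void)
{
#ifdef NAN
  volatile double x = NAN;   /* volatile discourages constant folding */
  return x != x;             /* a NaN never compares equal to itself */
#else
  return true;               /* nothing to test without NaN support */
#endif
}
```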

> >     ASSERT (c_strcasecmp ("turkish", "TURKD¬SH") < 0);
> >     ASSERT (c_strcasecmp ("TURKD¬SH", "turkish") > 0);
> > 
> >   which, of course, fail.
>
> Basically, EBCDIC lacks the Turkish i, and since it is not a UTF-8
> locale, we should probably be skipping the test in that environment.

I see no harm in checking for unexpected-UTF-8 behavior; it's just the
fact that this is not ASCII that is throwing things off.

> > +++ tests/test-canonicalize-lgpl.c
> >
> > * Addressed a strange z/OS corner case. This system has
> >   DOUBLE_SLASH_IS_DISTINCT_ROOT, yet the dev/ino numbers are the
> >   same.
>
> What? Does that mean 'ls -a /' and 'ls -a //' see different contents?
> If they do, then sharing dev/ino is a bug; if they are identical, then
> DOUBLE_SLASH_IS_DISTINCT_ROOT is defined incorrectly.

Well... those two "ls" commands give the same output, but the
double-slash is used within z/OS Unix System Services to refer to a
different sort of file space:

    http://www-01.ibm.com/support/knowledgecenter/SSLTBW_2.1.0/com.ibm.zos.v2r1.bpxa500/mvsds.htm?lang=en

Some commands support that syntax, but not all. As for what the
configure test does...

    $ ls -di / //
        1 /       1 //
    $ wc /dev/null
          0       0       0    /dev/null
    $ wc //dev/null
    wc: file "//dev/null": EDC5047I An invalid file name was specified as a function parameter.

And yet...

    $ cd //dev
    $ ls -l null
    crwxrwxrwx   1 BPXROOT  SYS1       4,  0 Sep  3  2013 null

It's not clear exactly how this "alternate root" is implemented---
possibly by intercepting pathnames in open(). Might be worth
special-casing the DOUBLE_SLASH test for this platform...
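
The dev/ino comparison that the test relies on amounts to something like the sketch below. same_dev_ino() is a hypothetical helper written for this message, not a Gnulib function. Going by the "ls -di" output above, it would presumably return 1 for "/" and "//" on z/OS even though the two names are not interchangeable, which is why the comparison alone cannot settle the question on this platform.

```c
#include <sys/stat.h>

/* Hypothetical helper: return 1 if both paths report the same device
   and inode numbers, 0 if they differ, and -1 if either stat call
   fails.  */
static int
same_dev_ino (const char *a, const char *b)
{
  struct stat st_a, st_b;

  if (stat (a, &st_a) != 0 || stat (b, &st_b) != 0)
    return -1;
  return st_a.st_dev == st_b.st_dev && st_a.st_ino == st_b.st_ino;
}
```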


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22  2:28 [PATCH] IBM z/OS + EBCDIC support Daniel Richard G.
  2015-09-22 15:23 ` Eric Blake
@ 2015-09-22 19:32 ` Paul Eggert
  2015-09-22 19:46   ` Paul Eggert
  2015-09-22 20:37   ` Daniel Richard G.
  2015-09-22 19:50 ` [PATCH] IBM z/OS + EBCDIC support Paul Eggert
  2 siblings, 2 replies; 49+ messages in thread
From: Paul Eggert @ 2015-09-22 19:32 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1647 bytes --]

Thanks for looking into this.  I have some questions about the c-ctype 
changes.  It appears that the proposed patch defers to the system 
functions (which use the current locale), but that's not the intent of 
c-ctype: it's supposed to correspond to a stripped down POSIX "C" locale 
regardless of the current locale settings.  Is there something special 
in z/OS that requires using the system functions?  (E.g., does the "C" 
locale behave differently depending on some *other* setting regarding 
character set?)

With the above in mind, it's not clear what c_isascii should do. Should 
it return 1 for bytes in the range 0..127, or for bytes that correspond 
to ASCII bytes if one assumes the standard translation from EBCDIC code 
page 037 to ASCII?  (Is there a standard?)  If the former, the current 
code is OK; if the latter, does the system isascii always return the 
same results regardless of locale and do these results make sense?

Anyway, in looking through the code I see that it's hard to test a port 
to EBCDIC because it uses ifdef rather than if.  I do see some of the 
promotion bugs that you noted, but we can fix these with inline functions 
rather than macros (cleaner and safer nowadays).  There are also a few 
other style glitches (e.g., boolean values, overuse of >=), so I 
installed the attached patch.  This patch assumes EBCDIC control 
characters are either less than ' ' or are all 1 bits, which I think is 
right.  The patch also tightens up the tests a bit.
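
To illustrate the 'if' versus 'ifdef' point, here is a minimal sketch. The names SKETCH_CONSECUTIVE_UPPERCASE and sketch_isupper are made up for this example, and the constant is a simplified stand-in for the real C_CTYPE_CONSECUTIVE_UPPERCASE test (it checks only the distance between 'A' and 'Z', not every letter). Because the condition is an ordinary constant expression rather than a preprocessor test, both branches are parsed and type-checked on every platform, even though only one can be reached.

```c
#include <stdbool.h>

/* Simplified stand-in, assumed for this sketch only: true in ASCII,
   false in EBCDIC, where the uppercase letters are not contiguous.  */
enum { SKETCH_CONSECUTIVE_UPPERCASE = ('Z' - 'A' == 25) };

static bool
sketch_isupper (int c)
{
  /* Fast path, compiled everywhere but taken only when the letters
     are contiguous.  */
  if (SKETCH_CONSECUTIVE_UPPERCASE)
    return 'A' <= c && c <= 'Z';

  /* Fallback, also compiled everywhere, so a typo here is caught even
     on an ASCII host.  */
  switch (c)
    {
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
    case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
    case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
    case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
    case 'Y': case 'Z':
      return true;
    default:
      return false;
    }
}
```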

This patch doesn't address the isascii problem, nor the "something 
special in z/OS" problem, so quite possibly further patches will be 
needed to this module.

[-- Attachment #2: 0001-c-ctype-port-better-to-EBCDIC.patch --]
[-- Type: text/x-patch; name="0001-c-ctype-port-better-to-EBCDIC.patch", Size: 20517 bytes --]

From 1b0f778e32f73c8601e7c517a0b83098996363a9 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 22 Sep 2015 12:17:06 -0700
Subject: [PATCH] c-ctype: port better to EBCDIC

Problems reported by Daniel Richard G. in
http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00020.html
* lib/c-ctype.c: Include <limits.h>, for CHAR_MIN and CHAR_MAX.
Include "verify.h".
(C_CTYPE_ASCII, C_CTYPE_CONSECUTIVE_DIGITS)
(C_CTYPE_CONSECUTIVE_LOWERCASE, C_CTYPE_CONSECUTIVE_UPPERCASE):
Define as enum constants with value false, if not defined, so that
code can use 'if' instead of 'ifdef'.  Using 'if' helps make the
code more portable, as both branches of the 'if' are compiled on
all platforms.
(C_CTYPE_EBCDIC): New constant.
(to_char): New static function.
(c_isalnum, c_isalpha, c_isdigit, c_islower, c_isgraph, c_isprint)
(c_ispunct, c_isupper, c_isxdigit, c_tolower, c_toupper):
Rewrite to use 'if' instead of 'ifdef'.
Use to_char if non-ASCII.  Prefer <= to >=.
Prefer true and false to 1 and 0, for booleans.
(c_iscntrl): Use 'if', not 'ifdef'.  Special case for EBCDIC.
Verify that the character set is either ASCII or EBCDIC.
* tests/test-c-ctype.c: Include <limits.h>, for CHAR_MIN
(to_char): New function.
(test_all): Port to EBCDIC.  Add some more tests, e.g., for c_ispunct.
---
 ChangeLog            |  26 ++++++
 lib/c-ctype.c        | 253 ++++++++++++++++++++++++++-------------------------
 tests/test-c-ctype.c | 106 +++++++++++----------
 3 files changed, 216 insertions(+), 169 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index c552225..8723b38 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,29 @@
+2015-09-22  Paul Eggert  <eggert@cs.ucla.edu>
+
+	c-ctype: port better to EBCDIC
+	Problems reported by Daniel Richard G. in
+	http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00020.html
+	* lib/c-ctype.c: Include <limits.h>, for CHAR_MIN and CHAR_MAX.
+	Include "verify.h".
+	(C_CTYPE_ASCII, C_CTYPE_CONSECUTIVE_DIGITS)
+	(C_CTYPE_CONSECUTIVE_LOWERCASE, C_CTYPE_CONSECUTIVE_UPPERCASE):
+	Define as enum constants with value false, if not defined, so that
+	code can use 'if' instead of 'ifdef'.  Using 'if' helps make the
+	code more portable, as both branches of the 'if' are compiled on
+	all platforms.
+	(C_CTYPE_EBCDIC): New constant.
+	(to_char): New static function.
+	(c_isalnum, c_isalpha, c_isdigit, c_islower, c_isgraph, c_isprint)
+	(c_ispunct, c_isupper, c_isxdigit, c_tolower, c_toupper):
+	Rewrite to use 'if' instead of 'ifdef'.
+	Use to_char if non-ASCII.  Prefer <= to >=.
+	Prefer true and false to 1 and 0, for booleans.
+	(c_iscntrl): Use 'if', not 'ifdef'.  Special case for EBCDIC.
+	Verify that the character set is either ASCII or EBCDIC.
+	* tests/test-c-ctype.c: Include <limits.h>, for CHAR_MIN
+	(to_char): New function.
+	(test_all): Port to EBCDIC.  Add some more tests, e.g., for c_ispunct.
+
 2015-09-21  Pádraig Brady  <P@draigBrady.com>
 
 	nanosleep: fix return code for interrupted replacement
diff --git a/lib/c-ctype.c b/lib/c-ctype.c
index 6635d34..916d46e 100644
--- a/lib/c-ctype.c
+++ b/lib/c-ctype.c
@@ -21,6 +21,34 @@ along with this program; if not, see <http://www.gnu.org/licenses/>.  */
 #define NO_C_CTYPE_MACROS
 #include "c-ctype.h"
 
+#include <limits.h>
+#include "verify.h"
+
+#ifndef C_CTYPE_ASCII
+enum { C_CTYPE_ASCII = false };
+#endif
+#ifndef C_CTYPE_CONSECUTIVE_DIGITS
+enum { C_CTYPE_CONSECUTIVE_DIGITS = false };
+#endif
+#ifndef C_CTYPE_CONSECUTIVE_LOWERCASE
+enum { C_CTYPE_CONSECUTIVE_LOWERCASE = false };
+#endif
+#ifndef C_CTYPE_CONSECUTIVE_UPPERCASE
+enum { C_CTYPE_CONSECUTIVE_UPPERCASE = false };
+#endif
+
+/* Convert an int, which may be promoted from either an unsigned or a
+   signed char, to the corresponding char.  */
+
+static char
+to_char (int c)
+{
+  enum { nchars = CHAR_MAX - CHAR_MIN + 1 };
+  if (CHAR_MIN < 0 && CHAR_MAX < c && c < nchars)
+    return c - nchars;
+  return c;
+}
+
 /* The function isascii is not locale dependent. Its use in EBCDIC is
    questionable. */
 bool
@@ -32,18 +60,20 @@ c_isascii (int c)
 bool
 c_isalnum (int c)
 {
-#if C_CTYPE_CONSECUTIVE_DIGITS \
-    && C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
-#if C_CTYPE_ASCII
-  return ((c >= '0' && c <= '9')
-          || ((c & ~0x20) >= 'A' && (c & ~0x20) <= 'Z'));
-#else
-  return ((c >= '0' && c <= '9')
-          || (c >= 'A' && c <= 'Z')
-          || (c >= 'a' && c <= 'z'));
-#endif
-#else
-  switch (c)
+  if (C_CTYPE_CONSECUTIVE_DIGITS
+      && C_CTYPE_CONSECUTIVE_UPPERCASE
+      && C_CTYPE_CONSECUTIVE_LOWERCASE)
+    {
+      if (C_CTYPE_ASCII)
+        return (('0' <= c && c <= '9')
+                || ('A' <= (c & ~0x20) && (c & ~0x20) <= 'Z'));
+      else
+        return (('0' <= c && c <= '9')
+                || ('A' <= c && c <= 'Z')
+                || ('a' <= c && c <= 'z'));
+    }
+
+  switch (to_char (c))
     {
     case '0': case '1': case '2': case '3': case '4': case '5':
     case '6': case '7': case '8': case '9':
@@ -57,24 +87,24 @@ c_isalnum (int c)
     case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
     case 's': case 't': case 'u': case 'v': case 'w': case 'x':
     case 'y': case 'z':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
 c_isalpha (int c)
 {
-#if C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
-#if C_CTYPE_ASCII
-  return ((c & ~0x20) >= 'A' && (c & ~0x20) <= 'Z');
-#else
-  return ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'));
-#endif
-#else
-  switch (c)
+  if (C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE)
+    {
+      if (C_CTYPE_ASCII)
+        return 'A' <= (c & ~0x20) && (c & ~0x20) <= 'Z';
+      else
+        return ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z');
+    }
+
+  switch (to_char (c))
     {
     case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
     case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -86,11 +116,10 @@ c_isalpha (int c)
     case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
     case 's': case 't': case 'u': case 'v': case 'w': case 'x':
     case 'y': case 'z':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
@@ -102,81 +131,65 @@ c_isblank (int c)
 bool
 c_iscntrl (int c)
 {
-#if C_CTYPE_ASCII
-  return ((c & ~0x1f) == 0 || c == 0x7f);
-#else
-  switch (c)
-    {
-    case ' ': case '!': case '"': case '#': case '$': case '%':
-    case '&': case '\'': case '(': case ')': case '*': case '+':
-    case ',': case '-': case '.': case '/':
-    case '0': case '1': case '2': case '3': case '4': case '5':
-    case '6': case '7': case '8': case '9':
-    case ':': case ';': case '<': case '=': case '>': case '?':
-    case '@':
-    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
-    case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
-    case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
-    case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
-    case 'Y': case 'Z':
-    case '[': case '\\': case ']': case '^': case '_': case '`':
-    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
-    case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
-    case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
-    case 's': case 't': case 'u': case 'v': case 'w': case 'x':
-    case 'y': case 'z':
-    case '{': case '|': case '}': case '~':
-      return 0;
-    default:
-      return 1;
-    }
-#endif
+  enum { C_CTYPE_EBCDIC = (' ' == 64 && '0' == 240
+                           && 'A' == 193 && 'J' == 209 && 'S' == 226
+                           && 'A' == 129 && 'J' == 145 && 'S' == 162) };
+  verify (C_CTYPE_ASCII || C_CTYPE_EBCDIC);
+
+  if (0 <= c && c < ' ')
+    return true;
+  if (C_CTYPE_ASCII)
+    return c == 0x7f;
+  else
+    return c == 0xff || c == -1;
 }
 
 bool
 c_isdigit (int c)
 {
-#if C_CTYPE_CONSECUTIVE_DIGITS
-  return (c >= '0' && c <= '9');
-#else
+  if (C_CTYPE_ASCII)
+    return '0' <= c && c <= '9';
+
+  c = to_char (c);
+  if (C_CTYPE_CONSECUTIVE_DIGITS)
+    return '0' <= c && c <= '9';
+
   switch (c)
     {
     case '0': case '1': case '2': case '3': case '4': case '5':
     case '6': case '7': case '8': case '9':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
 c_islower (int c)
 {
-#if C_CTYPE_CONSECUTIVE_LOWERCASE
-  return (c >= 'a' && c <= 'z');
-#else
-  switch (c)
+  if (C_CTYPE_CONSECUTIVE_LOWERCASE)
+    return 'a' <= c && c <= 'z';
+
+  switch (to_char (c))
     {
     case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
     case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
     case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
     case 's': case 't': case 'u': case 'v': case 'w': case 'x':
     case 'y': case 'z':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
 c_isgraph (int c)
 {
-#if C_CTYPE_ASCII
-  return (c >= '!' && c <= '~');
-#else
-  switch (c)
+  if (C_CTYPE_ASCII)
+    return '!' <= c && c <= '~';
+
+  switch (to_char (c))
     {
     case '!': case '"': case '#': case '$': case '%': case '&':
     case '\'': case '(': case ')': case '*': case '+': case ',':
@@ -197,20 +210,19 @@ c_isgraph (int c)
     case 's': case 't': case 'u': case 'v': case 'w': case 'x':
     case 'y': case 'z':
     case '{': case '|': case '}': case '~':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
 c_isprint (int c)
 {
-#if C_CTYPE_ASCII
-  return (c >= ' ' && c <= '~');
-#else
-  switch (c)
+  if (C_CTYPE_ASCII)
+    return ' ' <= c && c <= '~';
+
+  switch (to_char (c))
     {
     case ' ': case '!': case '"': case '#': case '$': case '%':
     case '&': case '\'': case '(': case ')': case '*': case '+':
@@ -231,22 +243,21 @@ c_isprint (int c)
     case 's': case 't': case 'u': case 'v': case 'w': case 'x':
     case 'y': case 'z':
     case '{': case '|': case '}': case '~':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
 c_ispunct (int c)
 {
-#if C_CTYPE_ASCII
-  return ((c >= '!' && c <= '~')
-          && !((c >= '0' && c <= '9')
-               || ((c & ~0x20) >= 'A' && (c & ~0x20) <= 'Z')));
-#else
-  switch (c)
+  if (C_CTYPE_ASCII)
+    return (('!' <= c && c <= '~')
+            && !(('0' <= c && c <= '9')
+                 || ('A' <= (c & ~0x20) && (c & ~0x20) <= 'Z')));
+
+  switch (to_char (c))
     {
     case '!': case '"': case '#': case '$': case '%': case '&':
     case '\'': case '(': case ')': case '*': case '+': case ',':
@@ -255,11 +266,10 @@ c_ispunct (int c)
     case '@':
     case '[': case '\\': case ']': case '^': case '_': case '`':
     case '{': case '|': case '}': case '~':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
@@ -272,57 +282,56 @@ c_isspace (int c)
 bool
 c_isupper (int c)
 {
-#if C_CTYPE_CONSECUTIVE_UPPERCASE
-  return (c >= 'A' && c <= 'Z');
-#else
-  switch (c)
+  if (C_CTYPE_CONSECUTIVE_UPPERCASE)
+    return 'A' <= c && c <= 'Z';
+
+  switch (to_char (c))
     {
     case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
     case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
     case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
     case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
     case 'Y': case 'Z':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 bool
 c_isxdigit (int c)
 {
-#if C_CTYPE_CONSECUTIVE_DIGITS \
-    && C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
-#if C_CTYPE_ASCII
-  return ((c >= '0' && c <= '9')
-          || ((c & ~0x20) >= 'A' && (c & ~0x20) <= 'F'));
-#else
-  return ((c >= '0' && c <= '9')
-          || (c >= 'A' && c <= 'F')
-          || (c >= 'a' && c <= 'f'));
-#endif
-#else
-  switch (c)
+  if (C_CTYPE_CONSECUTIVE_DIGITS
+      && C_CTYPE_CONSECUTIVE_UPPERCASE
+      && C_CTYPE_CONSECUTIVE_LOWERCASE)
+    {
+      if ('0' <= c && c <= '9')
+        return true;
+      if (C_CTYPE_ASCII)
+        return 'A' <= (c & ~0x20) && (c & ~0x20) <= 'F';
+      return (('A' <= c && c <= 'F')
+              || ('a' <= c && c <= 'f'));
+    }
+
+  switch (to_char (c))
     {
     case '0': case '1': case '2': case '3': case '4': case '5':
     case '6': case '7': case '8': case '9':
     case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
     case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
-      return 1;
+      return true;
     default:
-      return 0;
+      return false;
     }
-#endif
 }
 
 int
 c_tolower (int c)
 {
-#if C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
-  return (c >= 'A' && c <= 'Z' ? c - 'A' + 'a' : c);
-#else
-  switch (c)
+  if (C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE)
+    return c_isupper (c) ? c - 'A' + 'a' : c;
+
+  switch (to_char (c))
     {
     case 'A': return 'a';
     case 'B': return 'b';
@@ -352,16 +361,15 @@ c_tolower (int c)
     case 'Z': return 'z';
     default: return c;
     }
-#endif
 }
 
 int
 c_toupper (int c)
 {
-#if C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE
-  return (c >= 'a' && c <= 'z' ? c - 'a' + 'A' : c);
-#else
-  switch (c)
+  if (C_CTYPE_CONSECUTIVE_UPPERCASE && C_CTYPE_CONSECUTIVE_LOWERCASE)
+    return c_islower (c) ? c - 'a' + 'A' : c;
+
+  switch (to_char (c))
     {
     case 'a': return 'A';
     case 'b': return 'B';
@@ -391,5 +399,4 @@ c_toupper (int c)
     case 'z': return 'Z';
     default: return c;
     }
-#endif
 }
diff --git a/tests/test-c-ctype.c b/tests/test-c-ctype.c
index 81fe936..63d0af9 100644
--- a/tests/test-c-ctype.c
+++ b/tests/test-c-ctype.c
@@ -20,10 +20,19 @@
 
 #include "c-ctype.h"
 
+#include <limits.h>
 #include <locale.h>
 
 #include "macros.h"
 
+static char
+to_char (int c)
+{
+  if (CHAR_MIN < 0 && CHAR_MAX < c)
+    return c - CHAR_MAX - 1 + CHAR_MIN;
+  return c;
+}
+
 static void
 test_all (void)
 {
@@ -31,49 +40,32 @@ test_all (void)
 
   for (c = -0x80; c < 0x100; c++)
     {
-      ASSERT (c_isascii (c) == (c >= 0 && c < 0x80));
-
-      switch (c)
+      if (c < 0)
         {
-        case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
-        case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
-        case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
-        case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
-        case 'Y': case 'Z':
-        case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
-        case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
-        case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
-        case 's': case 't': case 'u': case 'v': case 'w': case 'x':
-        case 'y': case 'z':
-        case '0': case '1': case '2': case '3': case '4': case '5':
-        case '6': case '7': case '8': case '9':
-          ASSERT (c_isalnum (c) == 1);
-          break;
-        default:
-          ASSERT (c_isalnum (c) == 0);
-          break;
+          ASSERT (c_isascii (c) == c_isascii (c + 0x100));
+          ASSERT (c_isalnum (c) == c_isalnum (c + 0x100));
+          ASSERT (c_isalpha (c) == c_isalpha (c + 0x100));
+          ASSERT (c_isblank (c) == c_isblank (c + 0x100));
+          ASSERT (c_iscntrl (c) == c_iscntrl (c + 0x100));
+          ASSERT (c_isdigit (c) == c_isdigit (c + 0x100));
+          ASSERT (c_islower (c) == c_islower (c + 0x100));
+          ASSERT (c_isgraph (c) == c_isgraph (c + 0x100));
+          ASSERT (c_isprint (c) == c_isprint (c + 0x100));
+          ASSERT (c_ispunct (c) == c_ispunct (c + 0x100));
+          ASSERT (c_isspace (c) == c_isspace (c + 0x100));
+          ASSERT (c_isupper (c) == c_isupper (c + 0x100));
+          ASSERT (c_isxdigit (c) == c_isxdigit (c + 0x100));
+          ASSERT (to_char (c_tolower (c)) == to_char (c_tolower (c + 0x100)));
+          ASSERT (to_char (c_toupper (c)) == to_char (c_toupper (c + 0x100)));
         }
 
-      switch (c)
-        {
-        case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
-        case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
-        case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
-        case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
-        case 'Y': case 'Z':
-        case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
-        case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
-        case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
-        case 's': case 't': case 'u': case 'v': case 'w': case 'x':
-        case 'y': case 'z':
-          ASSERT (c_isalpha (c) == 1);
-          break;
-        default:
-          ASSERT (c_isalpha (c) == 0);
-          break;
-        }
+      ASSERT (c_isascii (c) == (c >= 0 && c < 0x80));
+
+      ASSERT (c_isalnum (c) == (c_isalpha (c) || c_isdigit (c)));
+
+      ASSERT (c_isalpha (c) == (c_islower (c) || c_isupper (c)));
 
-      switch (c)
+      switch (to_char (c))
         {
         case '\t': case ' ':
           ASSERT (c_isblank (c) == 1);
@@ -83,9 +75,13 @@ test_all (void)
           break;
         }
 
+#ifdef C_CTYPE_ASCII
       ASSERT (c_iscntrl (c) == ((c >= 0 && c < 0x20) || c == 0x7f));
+#endif
 
-      switch (c)
+      ASSERT (! (c_iscntrl (c) && c_isprint (c)));
+
+      switch (to_char (c))
         {
         case '0': case '1': case '2': case '3': case '4': case '5':
         case '6': case '7': case '8': case '9':
@@ -96,7 +92,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (to_char (c))
         {
         case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
         case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
@@ -110,13 +106,31 @@ test_all (void)
           break;
         }
 
+#ifdef C_CTYPE_ASCII
       ASSERT (c_isgraph (c) == ((c >= 0x20 && c < 0x7f) && c != ' '));
 
       ASSERT (c_isprint (c) == (c >= 0x20 && c < 0x7f));
+#endif
+
+      ASSERT (c_isgraph (c) == (c_isalnum (c) || c_ispunct (c)));
+
+      ASSERT (c_isprint (c) == (c_isgraph (c) || c == ' '));
 
-      ASSERT (c_ispunct (c) == (c_isgraph (c) && !c_isalnum (c)));
+      switch (to_char (c))
+        {
+        case '!': case '"': case '#': case '$': case '%': case '&': case '\'':
+        case '(': case ')': case '*': case '+': case ',': case '-': case '.':
+        case '/': case ':': case ';': case '<': case '=': case '>': case '?':
+        case '@': case '[': case '\\': case ']': case '^': case '_': case '`':
+        case '{': case '|': case '}': case '~':
+          ASSERT (c_ispunct (c) == 1);
+          break;
+        default:
+          ASSERT (c_ispunct (c) == 0);
+          break;
+        }
 
-      switch (c)
+      switch (to_char (c))
         {
         case ' ': case '\t': case '\n': case '\v': case '\f': case '\r':
           ASSERT (c_isspace (c) == 1);
@@ -126,7 +140,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (to_char (c))
         {
         case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
         case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -140,7 +154,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (to_char (c))
         {
         case '0': case '1': case '2': case '3': case '4': case '5':
         case '6': case '7': case '8': case '9':
@@ -153,7 +167,7 @@ test_all (void)
           break;
         }
 
-      switch (c)
+      switch (to_char (c))
         {
         case 'A':
           ASSERT (c_tolower (c) == 'a');
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 19:32 ` Paul Eggert
@ 2015-09-22 19:46   ` Paul Eggert
  2015-09-22 20:37   ` Daniel Richard G.
  1 sibling, 0 replies; 49+ messages in thread
From: Paul Eggert @ 2015-09-22 19:46 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 80 bytes --]

Oops, I forgot to add a dependency.  I installed the attached followup 
patch.

[-- Attachment #2: 0001-modules-c-ctype-Depends-on-Add-verify.patch --]
[-- Type: text/x-patch, Size: 1002 bytes --]

From 07ed58f3c5b54fe3935ce522dbc1c1a716185e67 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 22 Sep 2015 12:44:25 -0700
Subject: [PATCH] * modules/c-ctype (Depends-on): Add verify.

---
 ChangeLog       | 1 +
 modules/c-ctype | 1 +
 2 files changed, 2 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 8723b38..4ae3a57 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -20,6 +20,7 @@
 	Prefer true and false to 1 and 0, for booleans.
 	(c_iscntrl): Use 'if', not 'ifdef'.  Special case for EBCDIC.
 	Verify that the character set is either ASCII or EBCDIC.
+	* modules/c-ctype (Depends-on): Add verify.
 	* tests/test-c-ctype.c: Include <limits.h>, for CHAR_MIN
 	(to_char): New function.
 	(test_all): Port to EBCDIC.  Add some more tests, e.g., for c_ispunct.
diff --git a/modules/c-ctype b/modules/c-ctype
index 7be209f..b172d13 100644
--- a/modules/c-ctype
+++ b/modules/c-ctype
@@ -7,6 +7,7 @@ lib/c-ctype.c
 
 Depends-on:
 stdbool
+verify
 
 configure.ac:
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22  2:28 [PATCH] IBM z/OS + EBCDIC support Daniel Richard G.
  2015-09-22 15:23 ` Eric Blake
  2015-09-22 19:32 ` Paul Eggert
@ 2015-09-22 19:50 ` Paul Eggert
  2015-09-22 20:47   ` Daniel Richard G.
  2 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2015-09-22 19:50 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

A few non-ctype-related comments:

Omit parens around arguments of 'defined', e.g., say "defined __MVS__" 
not "defined (__MVS__)".

I agree with Eric that we should rename "__string" rather than fiddle 
with #undefing it.  It's just a placeholder name.  I suggest renaming it 
to "__str".  We can backport this to glibc eventually.

In strtod.c, don't bother with "if (errno == ERANGE) errno = 0;". Just 
do "errno = 0;".

Also in strtod.c, don't assume isnand exists.  That is, replace 
"!isnand(num)" with "num == num".

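The two strtod.c remarks above can be sketched as follows.  This is only
an illustration under stated assumptions: is_number and parse_double are
hypothetical names, not gnulib functions, and the real strtod.c code is
not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true if X is not a NaN.  A NaN is the only
   floating-point value that compares unequal to itself, so neither
   isnand nor isnan is needed.  */
static bool
is_number (double x)
{
  return x == x;
}

/* Hypothetical wrapper: parse S as a double.  Clear errno
   unconditionally beforehand, so that a stale value from an earlier
   call cannot be mistaken for a range error from this one.  */
static double
parse_double (char const *s, bool *range_error)
{
  assert (s != NULL && range_error != NULL);
  errno = 0;
  double num = strtod (s, NULL);
  *range_error = errno == ERANGE;
  return num;
}
```

The self-comparison relies on IEEE-style NaN semantics and can be
defeated by aggressive options such as GCC's -ffast-math.
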
Update serial numbers in changed .m4 files.

In m4/fclose.m4, gl_FUNC_FCLOSE should AC_REQUIRE([AC_CANONICAL_HOST]).  
Also, it should test $host_os rather than $host.

In m4/strstr.m4, the __MVS__ failure should be at compile-time, with 
#error, rather than at run-time.  That's better if cross-compiling.

In comments, prefer imperatives, e.g., "Instead, temporarily redefine 
..." rather than "Instead, we temporarily redefine ...". This is 
standard GNU style and is shorter.

The get_rusage_as code has duplications.  Simpler would be:

   uintptr_t
   get_rusage_as (void)
   {
     /* On Mac OS X, AIX, Cygwin, and z/OS, get_rusage_as_via_setrlimit
        exists but does not work.  */
   #if (! ((defined __APPLE__ && defined __MACH__)                         \
           || defined _AIX || defined __CYGWIN__ || defined __MVS__)       \
        && HAVE_SETRLIMIT && defined RLIMIT_AS && HAVE_SYS_MMAN_H          \
        && HAVE_MPROTECT)
     /* Prefer get_rusage_as_via_setrlimit() if it succeeds,
        because the caller may want to use the result with setrlimit().  */
     uintptr_t result = get_rusage_as_via_setrlimit ();
     if (result != 0)
       return result;
   #endif
     return get_rusage_as_via_iterator ();
   }

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 19:27   ` Daniel Richard G.
@ 2015-09-22 20:00     ` Paul Eggert
  2015-09-22 20:08       ` Eric Blake
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2015-09-22 20:00 UTC (permalink / raw)
  To: Daniel Richard G., Eric Blake, bug-gnulib

On 09/22/2015 12:27 PM, Daniel Richard G. wrote:
> Wouldn't it be easier to apply everything to a feature branch, and 
> integrate it bit by bit?

You can do that in your own repository if it makes things simpler, but 
for something this small I'd rather just see patches via email.

> wc: file "//dev/null": EDC5047I An invalid file name was specified as 
> a function parameter

How about if we add //dev/null to the configure-time test as to whether 
/ and // are the same?  If //dev/null doesn't work, then / and // are 
not the same.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 20:00     ` Paul Eggert
@ 2015-09-22 20:08       ` Eric Blake
  2015-09-22 20:51         ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Eric Blake @ 2015-09-22 20:08 UTC (permalink / raw)
  To: Paul Eggert, Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 722 bytes --]

On 09/22/2015 02:00 PM, Paul Eggert wrote:
>> wc: file "//dev/null": EDC5047I An invalid file name was specified as
>> a function parameter
> 
> How about if we add //dev/null to the configure-time test as to whether
> / and // are the same?  If //dev/null doesn't work, then / and // are
> not the same.

Rather, it sounds like configure is already correct, and we are
correctly deducing that // is different; but that the difference is odd
on this platform in that it is not distinguishable via dev/ino (every
other platform with distinct // at least has the decency to give a
distinct dev/ino).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 19:32 ` Paul Eggert
  2015-09-22 19:46   ` Paul Eggert
@ 2015-09-22 20:37   ` Daniel Richard G.
  2015-09-22 22:03     ` Paul Eggert
  1 sibling, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-22 20:37 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

Hi Paul,

On Tue, 2015 Sep 22 12:32-0700, Paul Eggert wrote:
> Thanks for looking into this.  I have some questions about the c-ctype
> changes.  It appears that the proposed patch defers to the system
> functions (which use the current locale), but that's not the intent of
> c-ctype: it's supposed to correspond to a stripped down POSIX "C"
> locale regardless of the current locale settings.  Is there something
> special in z/OS that requires using the system functions?  (E.g., does
> the "C" locale behave differently depending on some *other* setting
> regarding character set?)

Mainly, it was the attempt to answer the question "so what specific
variant of EBCDIC are we going to target here?" that led me to use
the system functions. EBCDIC-1047 is favored in z/OS, but EBCDIC-037
is also popular, and then there are the Russian/Japanese/etc. code
pages that some far-flung users might want. However, unlike "normal"
8-bit encodings like ISO 8859-#, KOI8-R et al., there is no agreement
in the 7-bit range, and even ASCII characters like "[" and "]" are
not consistently encoded between EBCDIC variants. We don't have the
option of saying, "Okay, screw all that, we'll just limit ourselves
to this common subset," unless said subset excludes things like
punctuation marks.
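
To make the disagreement concrete, here is a small sketch comparing a
few code points in two variants.  The code page 1047 values for "[" and
"]" match what the follow-up c_iscntrl patch in this thread verifies; the
code page 037 values and the "^" entries are quoted from memory of the
published tables and should be checked before being relied on.  All names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct code_point_pair
{
  char const *name;        /* the character, spelled as a string */
  unsigned char cp1047;    /* its byte value in EBCDIC 1047 */
  unsigned char cp037;     /* its byte value in EBCDIC 037 (assumed) */
};

static struct code_point_pair const samples[] = {
  { "[", 0xAD, 0xBA },
  { "]", 0xBD, 0xBB },
  { "^", 0x5F, 0xB0 },
  { "A", 0xC1, 0xC1 },
  { "0", 0xF0, 0xF0 },
};

/* True if NAME is in the table and both variants encode it alike.  */
static bool
encodings_agree (char const *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (strcmp (samples[i].name, name) == 0)
      return samples[i].cp1047 == samples[i].cp037;
  return false;
}
```

Letters and digits agree, while the brackets do not, which is why a
"common subset" approach would have to leave out some punctuation.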

My view is, it's not worth the hassle. Yes, c-ctype is not supposed to
be locale-dependent. It's going to be a lot more work, and a lot more
code to maintain to overcome that, and it's not likely the users of
these systems will see a corresponding benefit. I think it would be
better to have this for now---it's better than nothing---and if a clear
need arises in the future for locale-independent behavior on z/OS
(possibly by selecting an EBCDIC variant at compile time), we can cross
that bridge then.

> With the above in mind, it's not clear what c_isascii should do.
> Should it return 1 for bytes in the range 0..127, or for bytes that
> correspond to ASCII bytes if one assumes the standard translation
> from EBCDIC code page 037 to ASCII?  (Is there a standard?)  If the
> former, the current code is OK; if the latter, does the system
> isascii always return the same results regardless of locale and do
> these results make sense?

The latter behavior is the right one, IMO. If the former, there wouldn't
even be a point to having an isascii() function at all; you would just
do a range check.

Yes, there's a standard... a whole smorgasbord to choose from ^_^

The system isascii() function is locale-dependent. With "[" and "]"
depending on that, I don't see a way to get around this, unless you
deliberately support one EBCDIC variant at the expense of all others.

    http://www-01.ibm.com/support/knowledgecenter/SSLTBW_2.1.0/com.ibm.zos.v2r1.bpxbd00/risasc.htm?lang=en

> Anyway, in looking through the code I see that it's hard to test a port 
> to EBCDIC because it uses ifdef rather than if, and I do see some 
> promotion bugs that you noted but we can fix these with inline functions 
> rather than macros (cleaner and safer nowadays), and there are a few 
> other style glitches (e.g., boolean values, overuse of >=) so I 
> installed the attached patch.  This patch assumes EBCDIC control 
> characters are either less than ' ' or are all 1 bits, which I think is 
> right.  The patch also tightens up the tests a bit.

Yes, all control characters appear to be in [\x00-\x3F], but not
everything in that range is a control character. (I remember 0x04 was
not.) I tried making c_iscntrl() a simple range check at first, but that
did not agree with the system iscntrl().
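
The mismatch can be sketched as below, using the set of code page 1047
control bytes listed in the follow-up c_iscntrl patch in this thread.
The function names are hypothetical, and in_low_range is only one guess
at what a "simple range check" might have been.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes treated as control characters for EBCDIC 1047, copied from
   the switch statement in the follow-up patch.  */
static unsigned char const cntrl_1047[] = {
  0x00, 0x01, 0x02, 0x03, 0x05, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
  0x10, 0x11, 0x12, 0x13, 0x15, 0x16, 0x18, 0x19, 0x1c, 0x1d,
  0x1e, 0x1f, 0x26, 0x27, 0x2d, 0x2e, 0x2f, 0x32, 0x37, 0x3c,
  0x3d, 0x3f, 0xff
};

/* Hypothetical range check: accepts everything in 0x00..0x3f.  */
static bool
in_low_range (int byte)
{
  assert (0 <= byte && byte <= 0xff);
  return byte <= 0x3f;
}

/* Membership test against the explicit list above.  */
static bool
is_cntrl_1047 (int byte)
{
  assert (0 <= byte && byte <= 0xff);
  for (size_t i = 0; i < sizeof cntrl_1047; i++)
    if (cntrl_1047[i] == byte)
      return true;
  return false;
}
```

The two functions disagree at 0x04 (in range, but not in the list) and
at 0xff (in the list, but outside the range).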

> This patch doesn't address the isascii problem, nor the "something 
> special in z/OS" problem, so quite possibly further patches will be 
> needed to this module.
> Email had 1 attachment:
> + 0001-c-ctype-port-better-to-EBCDIC.patch
>   21k (text/x-patch)

I'll be happy to test your [revised] patch this evening.

--Daniel

-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 19:50 ` [PATCH] IBM z/OS + EBCDIC support Paul Eggert
@ 2015-09-22 20:47   ` Daniel Richard G.
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-22 20:47 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

On Tue, 2015 Sep 22 12:50-0700, Paul Eggert wrote:
> A few non-ctype-related comments:
> 
> Omit parens around arguments of 'defined', e.g., say "defined __MVS__"
> not "defined (__MVS__)".

Understood.

> I agree with Eric that we should rename "__string" rather than fiddle
> with #undefing it.  It's just a placeholder name.  I suggest renaming
> it to "__str".  We can backport this to glibc eventually.

Might it be better to forgo the double-underscore prefix? (Why not
just use "s"?)

> In strtod.c, don't bother with "if (errno == ERANGE) errno = 0;". Just
> do "errno = 0;".

Understood. I was after a narrow fix to this issue.

> Also in strtod.c, don't assume isnand exists.  That is, replace
> "!isnand(num)" with "num == num".
>
> Update serial numbers in changed .m4 files.
>
> In m4/fclose.m4, gl_FUNC_FCLOSE should
> AC_REQUIRE([AC_CANONICAL_HOST]). Also, it should test $host_os rather
> than $host.
>
> In m4/strstr.m4, the __MVS__ failure should be at compile-time, with
> #error, rather than at run-time.  That's better if cross-compiling.
>
> In comments, prefer imperatives, e.g., "Instead, temporarily redefine
> ..." rather than "Instead, we temporarily redefine ...". This is
> standard GNU style and is shorter.

Roger all that.

> The get_rusage_as code has duplications.  Simpler would be:

Agreed, but that's in the "general clean-up" category :)


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 20:08       ` Eric Blake
@ 2015-09-22 20:51         ` Daniel Richard G.
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-22 20:51 UTC (permalink / raw)
  To: Eric Blake, Paul Eggert, bug-gnulib

On Tue, 2015 Sep 22 14:08-0600, Eric Blake wrote:
> On 09/22/2015 02:00 PM, Paul Eggert wrote:
> >>
> >> wc: file "//dev/null": EDC5047I An invalid file name was specified
> >> as a function parameter
> >
> > How about if we add //dev/null to the configure-time test as to
> > whether / and // are the same?  If //dev/null doesn't work, then /
> > and // are not the same.
>
> Rather, it sounds like configure is already correct, and we are
> correctly deducing that // is different; but that the difference is
> odd on this platform in that it is not distinguishable via dev/ino
> (every other platform with distinct // at least has the decency to
> give a distinct dev/ino).

Yes, my intent was to show why the DOUBLE_SLASH_IS_DISTINCT_ROOT test
was giving the result that it had.

There's certainly much about this platform that is... odd.


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 20:37   ` Daniel Richard G.
@ 2015-09-22 22:03     ` Paul Eggert
  2015-09-22 23:44       ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2015-09-22 22:03 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1445 bytes --]

Thanks for explaining.  I still see a problem with the proposed patch, 
though, in that (if I'm understanding it correctly) it would cause 
c_isalpha (120) to succeed, even though EBCDIC 120 corresponds to U+00CC 
LATIN CAPITAL LETTER I WITH GRAVE, and that is not supposed to be an 
alphabetic character in the stripped-down C locale.  Code that uses 
c-ctype wants only ASCII letters, and departing from this would likely 
break things.

Worse, the C expression "c_ispunct ('[')" might return false, as the 
library may be in a locale that's incompatible with the mode the 
compiler was in when it compiled the '['.

Looking at the web page you mentioned, it appears that one approach is 
to assume EBCDIC 1047 (this seems to be the default and typical setting 
for C programs) at both compile-time and run-time.  We can check the 
compile-time assumption without any code overhead.  The proposed patch 
does that.  If someone really wants to use a different code page, either 
at compile-time or at run-time, more code will need to be written (most 
likely by the poor soul who actually needs that feature).
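
As a rough illustration of a compile-time check with no run-time cost,
the sketch below uses C11 static_assert in place of gnulib's verify.
The enum names and charset_is_ascii are hypothetical; the character
values mirror the ones tested in the attached patch, trimmed to a few
representative characters.

```c
#include <assert.h>
#include <stdbool.h>

enum
{
  /* Nonzero if the compiler's execution character set looks like ASCII.  */
  CHARSET_ASCII = (' ' == 32 && '0' == 48 && 'A' == 65 && 'a' == 97
                   && '[' == 91 && ']' == 93),
  /* Nonzero if it looks like EBCDIC code page 1047.  Comparing against
     '\x..' constants keeps the test valid whether char is signed or
     unsigned.  */
  CHARSET_EBCDIC_1047 = (' ' == '\x40' && '0' == '\xf0' && 'A' == '\xc1'
                         && 'a' == '\x81' && '[' == '\xad' && ']' == '\xbd')
};

/* Fail the build, rather than misbehave at run time, on anything else.  */
static_assert (CHARSET_ASCII || CHARSET_EBCDIC_1047,
               "only ASCII and EBCDIC 1047 are handled in this sketch");

static bool
charset_is_ascii (void)
{
  return CHARSET_ASCII;
}
```

Because both enumerators are integer constant expressions, a compiler
can discard the branch that does not apply.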

> Yes, all control characters appear to be in [\x00-\x3F], but not 
> everything in that range is a control character. (I remember 0x04 was 
> not.) I tried making c_iscntrl() a simple range check at first, but 
> that did not agree with the system iscntrl().

Thanks, this should be fixed in the attached patch, which I've installed.

[-- Attachment #2: 0001-c-ctype-assume-EBCDIC-1047-for-c_iscntrl.patch --]
[-- Type: text/x-patch, Size: 2531 bytes --]

From a92ab221b5cad8a5c1a5ca1fc1823d1f3fe4a24b Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 22 Sep 2015 14:47:06 -0700
Subject: [PATCH] c-ctype: assume EBCDIC 1047 for c_iscntrl

* lib/c-ctype.c (c_iscntrl): When EBCDIC, assume code page 1047 at
both compile-time and at run-time.  Check it at compile-time.  We can
worry about other code pages later, if the topic ever comes up.
Fix typo in C_CTYPE_EBCDIC.
---
 lib/c-ctype.c | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/lib/c-ctype.c b/lib/c-ctype.c
index 916d46e..558c4af 100644
--- a/lib/c-ctype.c
+++ b/lib/c-ctype.c
@@ -131,17 +131,37 @@ c_isblank (int c)
 bool
 c_iscntrl (int c)
 {
-  enum { C_CTYPE_EBCDIC = (' ' == 64 && '0' == 240
-                           && 'A' == 193 && 'J' == 209 && 'S' == 226
-                           && 'A' == 129 && 'J' == 145 && 'S' == 162) };
-  verify (C_CTYPE_ASCII || C_CTYPE_EBCDIC);
-
-  if (0 <= c && c < ' ')
-    return true;
+  enum { C_CTYPE_EBCDIC = (' ' == '\x40' && '0' == '\xf0'
+                           && 'A' == '\xc1' && 'J' == '\xd1' && 'S' == '\xe2'
+                           && 'a' == '\x81' && 'j' == '\x91' && 's' == '\xa2') };
   if (C_CTYPE_ASCII)
-    return c == 0x7f;
+    return (0 <= c && c < ' ') || c == 0x7f;
   else
-    return c == 0xff || c == -1;
+    {
+      /* Return true if C corresponds to an ASCII control character.
+         Assume EBCDIC code page 1047, and verify that the compiler
+         agrees with this.  */
+      verify (C_CTYPE_ASCII
+              || (C_CTYPE_EBCDIC
+                  && '!' == '\x5a' && '#' == '\x7b' && '$' == '\x5b'
+                  && '@' == '\x7c' && '[' == '\xad' && '\\' == '\xe0'
+                  && ']' == '\xbd' && '^' == '\x5f' && '_' == '\x6d'
+                  && '`' == '\x79'));
+      switch (c)
+        {
+        case '\x00': case '\x01': case '\x02': case '\x03': case '\x05':
+        case '\x0b': case '\x0c': case '\x0d': case '\x0e': case '\x0f':
+        case '\x10': case '\x11': case '\x12': case '\x13': case '\x15':
+        case '\x16': case '\x18': case '\x19': case '\x1c': case '\x1d':
+        case '\x1e': case '\x1f': case '\x26': case '\x27': case '\x2d':
+        case '\x2e': case '\x2f': case '\x32': case '\x37': case '\x3c':
+        case '\x3d': case '\x3f': case '\xff':
+        case '\xff' < 0 ? 0xff : -1:
+          return true;
+        default:
+          return false;
+        }
+    }
 }
 
 bool
-- 
2.1.0



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 22:03     ` Paul Eggert
@ 2015-09-22 23:44       ` Daniel Richard G.
  2015-09-23  2:02         ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-22 23:44 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 2660 bytes --]

On Tue, 2015 Sep 22 15:03-0700, Paul Eggert wrote:
> Thanks for explaining.  I still see a problem with the proposed patch,
> though, in that (if I'm understanding it correctly) it would cause
> c_isalpha (120) to succeed, even though EBCDIC 120 corresponds to
> U+00CC LATIN CAPITAL LETTER I WITH GRAVE, and that is not supposed to
> be an alphabetic character in the stripped-down C locale.  Code that
> uses c-ctype wants only ASCII letters, and departing from this would
> likely break things.

How would that match occur? c_isalpha() was/is using a "switch"
statement for EBCDIC.

> Worse, the C expression "c_ispunct ('[')" might return false, as the
> library may be in a locale that's incompatible with the mode the
> compiler was in when it compiled the '['.

If the user builds in one locale and runs in another, they're going to
have bigger problems (e.g. garbled program messages). As far as I've
seen, this is considered "out of bounds" in z/OS usage.

> Looking at the web page you mentioned, it appears that one approach is
> to assume EBCDIC 1047 (this seems to be the default and typical
> setting for C programs) at both compile-time and run-time.  We can
> check the compile-time assumption without any code overhead.  The
> proposed patch does that.  If someone really wants to use a different
> code page, either at compile-time or at run-time, more code will need
> to be written (most likely by the poor soul who actually needs that
> feature).

A different code page at run time, I think, is not feasible. But
international users will at least want a different code page at
compile time.

A simple program could generate tables for all the isxxxxx() functions
(see below) at compile time. Would you be inclined to do it that way?
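
For illustration only, the table-generator idea might look roughly like the
sketch below. It is not part of any patch in this thread; fill_table,
print_table and NCHARS are hypothetical names, and the sketch assumes the
generator runs on the build host so that the host's own <ctype.h> reflects
the code page in use.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

enum { NCHARS = UCHAR_MAX + 1 };

/* Fill TABLE with 1 where CLASSIFY reports true and 0 elsewhere.
   fill_table and print_table are hypothetical names for this sketch.  */
static void
fill_table (unsigned char table[NCHARS], int (*classify) (int))
{
  int c;

  for (c = 0; c < NCHARS; c++)
    table[c] = classify (c) != 0;
}

/* Write TABLE to OUT as a C array definition named NAME.  */
static void
print_table (FILE *out, const char *name, const unsigned char table[NCHARS])
{
  int c;

  fprintf (out, "static const unsigned char %s[%d] =\n  {", name, NCHARS);
  for (c = 0; c < NCHARS; c++)
    fprintf (out, "%s%d,", c % 16 == 0 ? "\n    " : " ", table[c]);
  fputs ("\n  };\n", out);
}
```

Calling fill_table (t, iscntrl) and then print_table (stdout, "gen_iscntrl", t)
would emit a table that matches whatever the build host's libc reports, which
is the trade-off under discussion: it follows the local code page, but it adds
a build step.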

> > Yes, all control characters appear to be in [\x00-\x3F], but not
> > everything in that range is a control character. (I remember 0x04
> > was not.) I tried making c_iscntrl() a simple range check at first,
> > but that did not agree with the system iscntrl().
>
> Thanks, this should be fixed in the attached patch, which I've
> installed.
> Email had 1 attachment:
> + 0001-c-ctype-assume-EBCDIC-1047-for-c_iscntrl.patch
>   3k (text/x-patch)

I'll try that out. I wasn't expecting you to all but rewrite c-ctype!

Just to help inform the discussion, I've attached a small program that
shows the output of the various isxxxxx() functions for all values in
[0, 255], and its output on z/OS with EBCDIC-1047 and -D_ALL_SOURCE.

It goes to show: where mainframes are concerned, nothing's easy :]


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.

[-- Attachment #2: ctype.c --]
[-- Type: text/x-csrc; name="ctype.c", Size: 878 bytes --]

#include <stdio.h>
#include <ctype.h>

int main(void)
{
  int c;

  puts("A. isalnum");
  puts("B. isalpha");
  puts("C. isascii");
  puts("D. isblank");
  puts("E. iscntrl");
  puts("F. isdigit");
  puts("G. isgraph");
  puts("H. islower");
  puts("I. isprint");
  puts("J. ispunct");
  puts("K. isspace");
  puts("L. isupper");
  puts("M. isxdigit");

  puts(" char\tA B C D E F G H I J K L M");
  
  for (c = 0; c < 256; c++)
    printf("%3d %c\t%d %d %d %d %d %d %d %d %d %d %d %d %d\n",
      c, isprint(c) ? c : ' ',
      isalnum(c) ? 1 : 0,
      isalpha(c) ? 1 : 0,
      isascii(c) ? 1 : 0,
      isblank(c) ? 1 : 0,
      iscntrl(c) ? 1 : 0,
      isdigit(c) ? 1 : 0,
      isgraph(c) ? 1 : 0,
      islower(c) ? 1 : 0,
      isprint(c) ? 1 : 0,
      ispunct(c) ? 1 : 0,
      isspace(c) ? 1 : 0,
      isupper(c) ? 1 : 0,
      isxdigit(c) ? 1 : 0);

  return 0;
}

[-- Attachment #3: ctype-ebcdic1047.txt --]
[-- Type: text/plain, Size: 8368 bytes --]

A. isalnum
B. isalpha
C. isascii
D. isblank
E. iscntrl
F. isdigit
G. isgraph
H. islower
I. isprint
J. ispunct
K. isspace
L. isupper
M. isxdigit
 char	A B C D E F G H I J K L M
  0  	0 0 1 0 1 0 0 0 0 0 0 0 0
  1  	0 0 1 0 1 0 0 0 0 0 0 0 0
  2  	0 0 1 0 1 0 0 0 0 0 0 0 0
  3  	0 0 1 0 1 0 0 0 0 0 0 0 0
  4  	0 0 0 0 0 0 0 0 0 0 0 0 0
  5  	0 0 1 1 1 0 0 0 0 0 1 0 0
  6  	0 0 0 0 0 0 0 0 0 0 0 0 0
  7  	0 0 1 0 1 0 0 0 0 0 0 0 0
  8  	0 0 0 0 0 0 0 0 0 0 0 0 0
  9  	0 0 0 0 0 0 0 0 0 0 0 0 0
 10  	0 0 0 0 0 0 0 0 0 0 0 0 0
 11  	0 0 1 0 1 0 0 0 0 0 1 0 0
 12  	0 0 1 0 1 0 0 0 0 0 1 0 0
 13  	0 0 1 0 1 0 0 0 0 0 1 0 0
 14  	0 0 1 0 1 0 0 0 0 0 0 0 0
 15  	0 0 1 0 1 0 0 0 0 0 0 0 0
 16  	0 0 1 0 1 0 0 0 0 0 0 0 0
 17  	0 0 1 0 1 0 0 0 0 0 0 0 0
 18  	0 0 1 0 1 0 0 0 0 0 0 0 0
 19  	0 0 1 0 1 0 0 0 0 0 0 0 0
 20  	0 0 0 0 0 0 0 0 0 0 0 0 0
 21  	0 0 1 0 1 0 0 0 0 0 1 0 0
 22  	0 0 1 0 1 0 0 0 0 0 0 0 0
 23  	0 0 0 0 0 0 0 0 0 0 0 0 0
 24  	0 0 1 0 1 0 0 0 0 0 0 0 0
 25  	0 0 1 0 1 0 0 0 0 0 0 0 0
 26  	0 0 0 0 0 0 0 0 0 0 0 0 0
 27  	0 0 0 0 0 0 0 0 0 0 0 0 0
 28  	0 0 1 0 1 0 0 0 0 0 0 0 0
 29  	0 0 1 0 1 0 0 0 0 0 0 0 0
 30  	0 0 1 0 1 0 0 0 0 0 0 0 0
 31  	0 0 1 0 1 0 0 0 0 0 0 0 0
 32  	0 0 0 0 0 0 0 0 0 0 0 0 0
 33  	0 0 0 0 0 0 0 0 0 0 0 0 0
 34  	0 0 0 0 0 0 0 0 0 0 0 0 0
 35  	0 0 0 0 0 0 0 0 0 0 0 0 0
 36  	0 0 0 0 0 0 0 0 0 0 0 0 0
 37  	0 0 0 0 0 0 0 0 0 0 0 0 0
 38  	0 0 1 0 1 0 0 0 0 0 0 0 0
 39  	0 0 1 0 1 0 0 0 0 0 0 0 0
 40  	0 0 0 0 0 0 0 0 0 0 0 0 0
 41  	0 0 0 0 0 0 0 0 0 0 0 0 0
 42  	0 0 0 0 0 0 0 0 0 0 0 0 0
 43  	0 0 0 0 0 0 0 0 0 0 0 0 0
 44  	0 0 0 0 0 0 0 0 0 0 0 0 0
 45  	0 0 1 0 1 0 0 0 0 0 0 0 0
 46  	0 0 1 0 1 0 0 0 0 0 0 0 0
 47  	0 0 1 0 1 0 0 0 0 0 0 0 0
 48  	0 0 0 0 0 0 0 0 0 0 0 0 0
 49  	0 0 0 0 0 0 0 0 0 0 0 0 0
 50  	0 0 1 0 1 0 0 0 0 0 0 0 0
 51  	0 0 0 0 0 0 0 0 0 0 0 0 0
 52  	0 0 0 0 0 0 0 0 0 0 0 0 0
 53  	0 0 0 0 0 0 0 0 0 0 0 0 0
 54  	0 0 0 0 0 0 0 0 0 0 0 0 0
 55  	0 0 1 0 1 0 0 0 0 0 0 0 0
 56  	0 0 0 0 0 0 0 0 0 0 0 0 0
 57  	0 0 0 0 0 0 0 0 0 0 0 0 0
 58  	0 0 0 0 0 0 0 0 0 0 0 0 0
 59  	0 0 0 0 0 0 0 0 0 0 0 0 0
 60  	0 0 1 0 1 0 0 0 0 0 0 0 0
 61  	0 0 1 0 1 0 0 0 0 0 0 0 0
 62  	0 0 0 0 0 0 0 0 0 0 0 0 0
 63  	0 0 1 0 1 0 0 0 0 0 0 0 0
 64  	0 0 1 1 0 0 0 0 1 0 1 0 0
 65  	0 0 0 0 0 0 0 0 0 0 0 0 0
 66  	0 0 0 0 0 0 0 0 0 0 0 0 0
 67  	0 0 0 0 0 0 0 0 0 0 0 0 0
 68  	0 0 0 0 0 0 0 0 0 0 0 0 0
 69  	0 0 0 0 0 0 0 0 0 0 0 0 0
 70  	0 0 0 0 0 0 0 0 0 0 0 0 0
 71  	0 0 0 0 0 0 0 0 0 0 0 0 0
 72  	0 0 0 0 0 0 0 0 0 0 0 0 0
 73  	0 0 0 0 0 0 0 0 0 0 0 0 0
 74  	0 0 0 0 0 0 0 0 0 0 0 0 0
 75 .	0 0 1 0 0 0 1 0 1 1 0 0 0
 76 <	0 0 1 0 0 0 1 0 1 1 0 0 0
 77 (	0 0 1 0 0 0 1 0 1 1 0 0 0
 78 +	0 0 1 0 0 0 1 0 1 1 0 0 0
 79 |	0 0 1 0 0 0 1 0 1 1 0 0 0
 80 &	0 0 1 0 0 0 1 0 1 1 0 0 0
 81  	0 0 0 0 0 0 0 0 0 0 0 0 0
 82  	0 0 0 0 0 0 0 0 0 0 0 0 0
 83  	0 0 0 0 0 0 0 0 0 0 0 0 0
 84  	0 0 0 0 0 0 0 0 0 0 0 0 0
 85  	0 0 0 0 0 0 0 0 0 0 0 0 0
 86  	0 0 0 0 0 0 0 0 0 0 0 0 0
 87  	0 0 0 0 0 0 0 0 0 0 0 0 0
 88  	0 0 0 0 0 0 0 0 0 0 0 0 0
 89  	0 0 0 0 0 0 0 0 0 0 0 0 0
 90 !	0 0 1 0 0 0 1 0 1 1 0 0 0
 91 $	0 0 1 0 0 0 1 0 1 1 0 0 0
 92 *	0 0 1 0 0 0 1 0 1 1 0 0 0
 93 )	0 0 1 0 0 0 1 0 1 1 0 0 0
 94 ;	0 0 1 0 0 0 1 0 1 1 0 0 0
 95 ^	0 0 1 0 0 0 1 0 1 1 0 0 0
 96 -	0 0 1 0 0 0 1 0 1 1 0 0 0
 97 /	0 0 1 0 0 0 1 0 1 1 0 0 0
 98  	0 0 0 0 0 0 0 0 0 0 0 0 0
 99  	0 0 0 0 0 0 0 0 0 0 0 0 0
100  	0 0 0 0 0 0 0 0 0 0 0 0 0
101  	0 0 0 0 0 0 0 0 0 0 0 0 0
102  	0 0 0 0 0 0 0 0 0 0 0 0 0
103  	0 0 0 0 0 0 0 0 0 0 0 0 0
104  	0 0 0 0 0 0 0 0 0 0 0 0 0
105  	0 0 0 0 0 0 0 0 0 0 0 0 0
106  	0 0 0 0 0 0 0 0 0 0 0 0 0
107 ,	0 0 1 0 0 0 1 0 1 1 0 0 0
108 %	0 0 1 0 0 0 1 0 1 1 0 0 0
109 _	0 0 1 0 0 0 1 0 1 1 0 0 0
110 >	0 0 1 0 0 0 1 0 1 1 0 0 0
111 ?	0 0 1 0 0 0 1 0 1 1 0 0 0
112  	0 0 0 0 0 0 0 0 0 0 0 0 0
113  	0 0 0 0 0 0 0 0 0 0 0 0 0
114  	0 0 0 0 0 0 0 0 0 0 0 0 0
115  	0 0 0 0 0 0 0 0 0 0 0 0 0
116  	0 0 0 0 0 0 0 0 0 0 0 0 0
117  	0 0 0 0 0 0 0 0 0 0 0 0 0
118  	0 0 0 0 0 0 0 0 0 0 0 0 0
119  	0 0 0 0 0 0 0 0 0 0 0 0 0
120  	0 0 0 0 0 0 0 0 0 0 0 0 0
121 `	0 0 1 0 0 0 1 0 1 1 0 0 0
122 :	0 0 1 0 0 0 1 0 1 1 0 0 0
123 #	0 0 1 0 0 0 1 0 1 1 0 0 0
124 @	0 0 1 0 0 0 1 0 1 1 0 0 0
125 '	0 0 1 0 0 0 1 0 1 1 0 0 0
126 =	0 0 1 0 0 0 1 0 1 1 0 0 0
127 "	0 0 1 0 0 0 1 0 1 1 0 0 0
128  	0 0 0 0 0 0 0 0 0 0 0 0 0
129 a	1 1 1 0 0 0 1 1 1 0 0 0 1
130 b	1 1 1 0 0 0 1 1 1 0 0 0 1
131 c	1 1 1 0 0 0 1 1 1 0 0 0 1
132 d	1 1 1 0 0 0 1 1 1 0 0 0 1
133 e	1 1 1 0 0 0 1 1 1 0 0 0 1
134 f	1 1 1 0 0 0 1 1 1 0 0 0 1
135 g	1 1 1 0 0 0 1 1 1 0 0 0 0
136 h	1 1 1 0 0 0 1 1 1 0 0 0 0
137 i	1 1 1 0 0 0 1 1 1 0 0 0 0
138  	0 0 0 0 0 0 0 0 0 0 0 0 0
139  	0 0 0 0 0 0 0 0 0 0 0 0 0
140  	0 0 0 0 0 0 0 0 0 0 0 0 0
141  	0 0 0 0 0 0 0 0 0 0 0 0 0
142  	0 0 0 0 0 0 0 0 0 0 0 0 0
143  	0 0 0 0 0 0 0 0 0 0 0 0 0
144  	0 0 0 0 0 0 0 0 0 0 0 0 0
145 j	1 1 1 0 0 0 1 1 1 0 0 0 0
146 k	1 1 1 0 0 0 1 1 1 0 0 0 0
147 l	1 1 1 0 0 0 1 1 1 0 0 0 0
148 m	1 1 1 0 0 0 1 1 1 0 0 0 0
149 n	1 1 1 0 0 0 1 1 1 0 0 0 0
150 o	1 1 1 0 0 0 1 1 1 0 0 0 0
151 p	1 1 1 0 0 0 1 1 1 0 0 0 0
152 q	1 1 1 0 0 0 1 1 1 0 0 0 0
153 r	1 1 1 0 0 0 1 1 1 0 0 0 0
154  	0 0 0 0 0 0 0 0 0 0 0 0 0
155  	0 0 0 0 0 0 0 0 0 0 0 0 0
156  	0 0 0 0 0 0 0 0 0 0 0 0 0
157  	0 0 0 0 0 0 0 0 0 0 0 0 0
158  	0 0 0 0 0 0 0 0 0 0 0 0 0
159  	0 0 0 0 0 0 0 0 0 0 0 0 0
160  	0 0 0 0 0 0 0 0 0 0 0 0 0
161 ~	0 0 1 0 0 0 1 0 1 1 0 0 0
162 s	1 1 1 0 0 0 1 1 1 0 0 0 0
163 t	1 1 1 0 0 0 1 1 1 0 0 0 0
164 u	1 1 1 0 0 0 1 1 1 0 0 0 0
165 v	1 1 1 0 0 0 1 1 1 0 0 0 0
166 w	1 1 1 0 0 0 1 1 1 0 0 0 0
167 x	1 1 1 0 0 0 1 1 1 0 0 0 0
168 y	1 1 1 0 0 0 1 1 1 0 0 0 0
169 z	1 1 1 0 0 0 1 1 1 0 0 0 0
170  	0 0 0 0 0 0 0 0 0 0 0 0 0
171  	0 0 0 0 0 0 0 0 0 0 0 0 0
172  	0 0 0 0 0 0 0 0 0 0 0 0 0
173 [	0 0 1 0 0 0 1 0 1 1 0 0 0
174  	0 0 0 0 0 0 0 0 0 0 0 0 0
175  	0 0 0 0 0 0 0 0 0 0 0 0 0
176  	0 0 0 0 0 0 0 0 0 0 0 0 0
177  	0 0 0 0 0 0 0 0 0 0 0 0 0
178  	0 0 0 0 0 0 0 0 0 0 0 0 0
179  	0 0 0 0 0 0 0 0 0 0 0 0 0
180  	0 0 0 0 0 0 0 0 0 0 0 0 0
181  	0 0 0 0 0 0 0 0 0 0 0 0 0
182  	0 0 0 0 0 0 0 0 0 0 0 0 0
183  	0 0 0 0 0 0 0 0 0 0 0 0 0
184  	0 0 0 0 0 0 0 0 0 0 0 0 0
185  	0 0 0 0 0 0 0 0 0 0 0 0 0
186  	0 0 0 0 0 0 0 0 0 0 0 0 0
187  	0 0 0 0 0 0 0 0 0 0 0 0 0
188  	0 0 0 0 0 0 0 0 0 0 0 0 0
189 ]	0 0 1 0 0 0 1 0 1 1 0 0 0
190  	0 0 0 0 0 0 0 0 0 0 0 0 0
191  	0 0 0 0 0 0 0 0 0 0 0 0 0
192 {	0 0 1 0 0 0 1 0 1 1 0 0 0
193 A	1 1 1 0 0 0 1 0 1 0 0 1 1
194 B	1 1 1 0 0 0 1 0 1 0 0 1 1
195 C	1 1 1 0 0 0 1 0 1 0 0 1 1
196 D	1 1 1 0 0 0 1 0 1 0 0 1 1
197 E	1 1 1 0 0 0 1 0 1 0 0 1 1
198 F	1 1 1 0 0 0 1 0 1 0 0 1 1
199 G	1 1 1 0 0 0 1 0 1 0 0 1 0
200 H	1 1 1 0 0 0 1 0 1 0 0 1 0
201 I	1 1 1 0 0 0 1 0 1 0 0 1 0
202  	0 0 0 0 0 0 0 0 0 0 0 0 0
203  	0 0 0 0 0 0 0 0 0 0 0 0 0
204  	0 0 0 0 0 0 0 0 0 0 0 0 0
205  	0 0 0 0 0 0 0 0 0 0 0 0 0
206  	0 0 0 0 0 0 0 0 0 0 0 0 0
207  	0 0 0 0 0 0 0 0 0 0 0 0 0
208 }	0 0 1 0 0 0 1 0 1 1 0 0 0
209 J	1 1 1 0 0 0 1 0 1 0 0 1 0
210 K	1 1 1 0 0 0 1 0 1 0 0 1 0
211 L	1 1 1 0 0 0 1 0 1 0 0 1 0
212 M	1 1 1 0 0 0 1 0 1 0 0 1 0
213 N	1 1 1 0 0 0 1 0 1 0 0 1 0
214 O	1 1 1 0 0 0 1 0 1 0 0 1 0
215 P	1 1 1 0 0 0 1 0 1 0 0 1 0
216 Q	1 1 1 0 0 0 1 0 1 0 0 1 0
217 R	1 1 1 0 0 0 1 0 1 0 0 1 0
218  	0 0 0 0 0 0 0 0 0 0 0 0 0
219  	0 0 0 0 0 0 0 0 0 0 0 0 0
220  	0 0 0 0 0 0 0 0 0 0 0 0 0
221  	0 0 0 0 0 0 0 0 0 0 0 0 0
222  	0 0 0 0 0 0 0 0 0 0 0 0 0
223  	0 0 0 0 0 0 0 0 0 0 0 0 0
224 \	0 0 1 0 0 0 1 0 1 1 0 0 0
225  	0 0 0 0 0 0 0 0 0 0 0 0 0
226 S	1 1 1 0 0 0 1 0 1 0 0 1 0
227 T	1 1 1 0 0 0 1 0 1 0 0 1 0
228 U	1 1 1 0 0 0 1 0 1 0 0 1 0
229 V	1 1 1 0 0 0 1 0 1 0 0 1 0
230 W	1 1 1 0 0 0 1 0 1 0 0 1 0
231 X	1 1 1 0 0 0 1 0 1 0 0 1 0
232 Y	1 1 1 0 0 0 1 0 1 0 0 1 0
233 Z	1 1 1 0 0 0 1 0 1 0 0 1 0
234  	0 0 0 0 0 0 0 0 0 0 0 0 0
235  	0 0 0 0 0 0 0 0 0 0 0 0 0
236  	0 0 0 0 0 0 0 0 0 0 0 0 0
237  	0 0 0 0 0 0 0 0 0 0 0 0 0
238  	0 0 0 0 0 0 0 0 0 0 0 0 0
239  	0 0 0 0 0 0 0 0 0 0 0 0 0
240 0	1 0 1 0 0 1 1 0 1 0 0 0 1
241 1	1 0 1 0 0 1 1 0 1 0 0 0 1
242 2	1 0 1 0 0 1 1 0 1 0 0 0 1
243 3	1 0 1 0 0 1 1 0 1 0 0 0 1
244 4	1 0 1 0 0 1 1 0 1 0 0 0 1
245 5	1 0 1 0 0 1 1 0 1 0 0 0 1
246 6	1 0 1 0 0 1 1 0 1 0 0 0 1
247 7	1 0 1 0 0 1 1 0 1 0 0 0 1
248 8	1 0 1 0 0 1 1 0 1 0 0 0 1
249 9	1 0 1 0 0 1 1 0 1 0 0 0 1
250  	0 0 0 0 0 0 0 0 0 0 0 0 0
251  	0 0 0 0 0 0 0 0 0 0 0 0 0
252  	0 0 0 0 0 0 0 0 0 0 0 0 0
253  	0 0 0 0 0 0 0 0 0 0 0 0 0
254  	0 0 0 0 0 0 0 0 0 0 0 0 0
255  	0 0 0 0 0 0 0 0 0 0 0 0 0


* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-22 23:44       ` Daniel Richard G.
@ 2015-09-23  2:02         ` Paul Eggert
  2015-09-23  6:58           ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2015-09-23  2:02 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]

>> Code that
>> uses c-ctype wants only ASCII letters, and departing from this would
>> likely break things.
>
> How would that match occur? c_isalpha() was/is using a "switch"
> statement for EBCDIC.

Oh, sorry, I was assuming that the substitution was being proposed for all the 
functions; but it's being proposed only for c_isascii, c_iscntrl, c_isgraph, 
c_isprint, and c_ispunct.  These functions are so rarely used that it probably 
doesn't matter that much what we do....

> If the user builds in one locale and runs in another, they're going to
> have bigger problems (e.g. garbled program messages). As far as I've
> seen, this is considered "out of bounds" in z/OS usage.

Excellent; that simplifies things.

> A different code page at run time, I think, is not feasible. But
> international users will at least want a different code page at
> compile time.
>
> A simple program could generate tables for all the isxxxxx() functions
> (see below) at compile time. Would you be inclined to do it that way?

I think we can do it without that kind of compile-time hassle, if we can assume 
that the compile-time locale is the same as the run-time.  I installed the 
attached patch, which makes that assumption, and which I hope does the right thing.
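
As a rough sketch of what "checking the assumption at compile time" can look
like outside Gnulib: the HOST_IS_* names and host_charset_name below are
hypothetical, the EBCDIC probe copies only part of C_CTYPE_EBCDIC, and C11's
_Static_assert stands in for Gnulib's verify ().

```c
/* Compile-time guesses about the execution character set.  The
   HOST_IS_* names are hypothetical; the EBCDIC probe mirrors part of
   C_CTYPE_EBCDIC from the patch.  */
enum
  {
    HOST_IS_ASCII = (' ' == 32 && '0' == 48 && 'A' == 65 && 'a' == 97),
    HOST_IS_EBCDIC = (' ' == '\x40' && '0' == '\xf0'
                      && 'A' == '\xc1' && 'a' == '\x81')
  };

/* Fail the build, rather than misbehave at run time, on anything else.
   _Static_assert needs C11; Gnulib's verify () plays this role.  */
_Static_assert (HOST_IS_ASCII || HOST_IS_EBCDIC,
                "unsupported execution character set");

/* Return a label for the character set detected above.  */
static const char *
host_charset_name (void)
{
  return HOST_IS_ASCII ? "ASCII" : "EBCDIC";
}
```

Because every operand is a character constant, the whole check folds to a
constant and costs nothing at run time. It only says what the compiler
assumed, though, which is why it relies on the compile-time and run-time
code pages being the same.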


[-- Attachment #2: 0001-c-ctype-support-EBCDIC-style-c_isascii.patch --]
[-- Type: text/plain, Size: 5407 bytes --]

From a5ce2c8c0b604a86fd575c6f80384e3189703546 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 22 Sep 2015 18:59:28 -0700
Subject: [PATCH] c-ctype: support EBCDIC-style c_isascii

* lib/c-ctype.c (C_CTYPE_EBCDIC): Move to top level.
(c_isascii, c_iscntrl): Assume EBCDIC code page 1047 for control
characters, if EBCDIC.
---
 lib/c-ctype.c | 93 +++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 65 insertions(+), 28 deletions(-)

diff --git a/lib/c-ctype.c b/lib/c-ctype.c
index 558c4af..a3913a1 100644
--- a/lib/c-ctype.c
+++ b/lib/c-ctype.c
@@ -37,6 +37,17 @@ enum { C_CTYPE_CONSECUTIVE_LOWERCASE = false };
 enum { C_CTYPE_CONSECUTIVE_UPPERCASE = false };
 #endif
 
+enum
+  {
+    /* True if this appears to be a host using EBCDIC.  */
+    C_CTYPE_EBCDIC = (' ' == '\x40' && '0' == '\xf0'
+                      && 'A' == '\xc1' && 'J' == '\xd1' && 'S' == '\xe2'
+                      && 'a' == '\x81' && 'j' == '\x91' && 's' == '\xa2')
+  };
+
+/* The implementation currently supports ASCII and EBCDIC.  */
+verify (C_CTYPE_ASCII || C_CTYPE_EBCDIC);
+
 /* Convert an int, which may be promoted from either an unsigned or a
    signed char, to the corresponding char.  */
 
@@ -54,7 +65,45 @@ to_char (int c)
 bool
 c_isascii (int c)
 {
-  return (c >= 0x00 && c <= 0x7f);
+  if (C_CTYPE_ASCII)
+    return 0 <= c && c <= 0x7f;
+
+  /* Use EBCDIC code page 1047's assignments for ASCII control chars;
+     assume all EBCDIC code pages agree about these assignments.  */
+  switch (to_char (c))
+    {
+    case '\x00': case '\x01': case '\x02': case '\x03': case '\x05':
+    case '\x0b': case '\x0c': case '\x0d': case '\x0e': case '\x0f':
+    case '\x10': case '\x11': case '\x12': case '\x13': case '\x15':
+    case '\x16': case '\x18': case '\x19': case '\x1c': case '\x1d':
+    case '\x1e': case '\x1f': case '\x26': case '\x27': case '\x2d':
+    case '\x2e': case '\x2f': case '\x32': case '\x37': case '\x3c':
+    case '\x3d': case '\x3f': case '\xff':
+    case '\xff' < 0 ? 0xff : -1:
+
+    case ' ': case '!': case '"': case '#': case '$': case '%':
+    case '&': case '\'': case '(': case ')': case '*': case '+':
+    case ',': case '-': case '.': case '/':
+    case '0': case '1': case '2': case '3': case '4': case '5':
+    case '6': case '7': case '8': case '9':
+    case ':': case ';': case '<': case '=': case '>': case '?':
+    case '@':
+    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
+    case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
+    case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
+    case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
+    case 'Y': case 'Z':
+    case '[': case '\\': case ']': case '^': case '_': case '`':
+    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
+    case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
+    case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
+    case 's': case 't': case 'u': case 'v': case 'w': case 'x':
+    case 'y': case 'z':
+    case '{': case '|': case '}': case '~':
+      return true;
+    default:
+      return false;
+    }
 }
 
 bool
@@ -131,36 +180,24 @@ c_isblank (int c)
 bool
 c_iscntrl (int c)
 {
-  enum { C_CTYPE_EBCDIC = (' ' == '\x40' && '0' == '\xf0'
-                           && 'A' == '\xc1' && 'J' == '\xd1' && 'S' == '\xe2'
-                           && 'a' == '\x81' && 'j' == '\x91' && 's' == '\xa2') };
   if (C_CTYPE_ASCII)
     return (0 <= c && c < ' ') || c == 0x7f;
-  else
+
+  /* Use EBCDIC code page 1047's assignments for ASCII control chars;
+     assume all EBCDIC code pages agree about these assignments.  */
+  switch (c)
     {
-      /* Return true if C corresponds to an ASCII control character.
-         Assume EBCDIC code page 1047, and verify that the compiler
-         agrees with this.  */
-      verify (C_CTYPE_ASCII
-              || (C_CTYPE_EBCDIC
-                  && '!' == '\x5a' && '#' == '\x7b' && '$' == '\x5b'
-                  && '@' == '\x7c' && '[' == '\xad' && '\\' == '\xe0'
-                  && ']' == '\xbd' && '^' == '\x5f' && '_' == '\x6d'
-                  && '`' == '\x79'));
-      switch (c)
-        {
-        case '\x00': case '\x01': case '\x02': case '\x03': case '\x05':
-        case '\x0b': case '\x0c': case '\x0d': case '\x0e': case '\x0f':
-        case '\x10': case '\x11': case '\x12': case '\x13': case '\x15':
-        case '\x16': case '\x18': case '\x19': case '\x1c': case '\x1d':
-        case '\x1e': case '\x1f': case '\x26': case '\x27': case '\x2d':
-        case '\x2e': case '\x2f': case '\x32': case '\x37': case '\x3c':
-        case '\x3d': case '\x3f': case '\xff':
-        case '\xff' < 0 ? 0xff : -1:
-          return true;
-        default:
-          return false;
-        }
+    case '\x00': case '\x01': case '\x02': case '\x03': case '\x05':
+    case '\x0b': case '\x0c': case '\x0d': case '\x0e': case '\x0f':
+    case '\x10': case '\x11': case '\x12': case '\x13': case '\x15':
+    case '\x16': case '\x18': case '\x19': case '\x1c': case '\x1d':
+    case '\x1e': case '\x1f': case '\x26': case '\x27': case '\x2d':
+    case '\x2e': case '\x2f': case '\x32': case '\x37': case '\x3c':
+    case '\x3d': case '\x3f': case '\xff':
+    case '\xff' < 0 ? 0xff : -1:
+      return true;
+    default:
+      return false;
     }
 }
 
-- 
2.1.0



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-23  2:02         ` Paul Eggert
@ 2015-09-23  6:58           ` Daniel Richard G.
  2015-09-23 19:05             ` Paul Eggert
  2015-09-23 19:29             ` Paul Eggert
  0 siblings, 2 replies; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-23  6:58 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

Okay, I tested your latest changes (git 4d83e798). There is one
assertion that needs to be #ifdef'ed out for EBCDIC:

    $ ./test-c-ctype
    /path/to/gltests/test-c-ctype.c:62: assertion 'c_isascii (c) == (c >= 0 && c < 0x80)' failed
    CEE5207E The signal SIGABRT was received.
    ABORT instruction

With that change, test-c-ctype passes.

I also tried a run with signed characters (-qchars=signed), and while
test-c-ctype passed, a number of other things broke. I'll be preparing
and submitting patches for those as well.

On Tue, 2015 Sep 22 19:02-0700, Paul Eggert wrote:
>
> > How would that match occur? c_isalpha() was/is using a "switch"
> > statement for EBCDIC.
>
> Oh, sorry, I was assuming that the substitution was being proposed
> for all the functions; but it's being proposed only for c_isascii,
> c_iscntrl, c_isgraph, c_isprint, and c_ispunct.  These functions
> are so rarely used that it probably doesn't matter that much what
> we do....

Okay, I understand. The functions that already had a complete "switch"
implementation, I left alone; that approach will work pretty much
regardless of encoding.
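
To make the point concrete, here is a minimal sketch of the "switch" style
(sketch_isxdigit is a hypothetical name, not a Gnulib function). The compiler
translates each character constant into the execution character set, so the
same source gives the right answer under ASCII and EBCDIC, provided the
argument carries the same value a plain char would.

```c
#include <stdbool.h>

/* Return true if C is a hexadecimal digit.  No numeric code points
   appear here, so nothing depends on the encoding's layout.  */
static bool
sketch_isxdigit (int c)
{
  switch (c)
    {
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
      return true;
    default:
      return false;
    }
}
```

A range test such as 'a' <= c && c <= 'f' would also happen to work for this
particular set, but not for the whole alphabet, since EBCDIC letters are not
contiguous (see the gaps after i and r in the table earlier in the thread).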

> > A simple program could generate tables for all the isxxxxx()
> > functions (see below) at compile time. Would you be inclined to do
> > it that way?
>
> I think we can do it without that kind of compile-time hassle, if we
> can assume that the compile-time locale is the same as the run-time.
> I installed the attached patch, which makes that assumption, and which
> I hope does the right thing.

I'm a bit uneasy about hard-coding the list of control characters for
c_iscntrl() like that.

What about having a check in test-c-ctype that compares c_iscntrl() with
its system counterpart? If the assumption is that alternate EBCDIC
encodings used with Gnulib will agree with EBCDIC-1047 on these
characters, then that should be checked.

Also, perhaps, that any character for which c_iscntrl() is true should
return false from most of the other functions...
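
The kind of cross-check I have in mind could be sketched as below. The
helper names are hypothetical, and the sketch only compares predicates over
[0, UCHAR_MAX]; a real test would pass a c-ctype function (wrapped to return
int) as one argument and its <ctype.h> counterpart as the other.

```c
#include <ctype.h>
#include <limits.h>

/* Count the values in [0, UCHAR_MAX] on which predicates F and G give
   different truth values.  */
static int
count_disagreements (int (*f) (int), int (*g) (int))
{
  int c, n = 0;

  for (c = 0; c <= UCHAR_MAX; c++)
    n += (f (c) != 0) != (g (c) != 0);
  return n;
}

/* Count the values in [0, UCHAR_MAX] for which both F and G hold.  */
static int
count_both_true (int (*f) (int), int (*g) (int))
{
  int c, n = 0;

  for (c = 0; c <= UCHAR_MAX; c++)
    n += f (c) != 0 && g (c) != 0;
  return n;
}
```

For example, count_both_true (iscntrl, isprint) should be zero on a
conforming host, and a nonzero count_disagreements between a hard-coded
control-character list and iscntrl would flag a code page that departs
from EBCDIC-1047.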

--Daniel

-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-23  6:58           ` Daniel Richard G.
@ 2015-09-23 19:05             ` Paul Eggert
  2015-09-23 19:29             ` Paul Eggert
  1 sibling, 0 replies; 49+ messages in thread
From: Paul Eggert @ 2015-09-23 19:05 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 223 bytes --]

On 09/22/2015 11:58 PM, Daniel Richard G. wrote:
> There is one
> assertion that needs to be #ifdef'ed out for EBCDIC:

Better than that, let's improve the assertion so that it works for 
EBCDIC.  I installed the attached.

[-- Attachment #2: 0001-c-ctype-improve-c_isascii-testing.patch --]
[-- Type: text/x-patch, Size: 1625 bytes --]

From a7a072a14945dfbe5fdd207926846b2c286b4b83 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 23 Sep 2015 12:02:35 -0700
Subject: [PATCH] c-ctype: improve c_isascii testing

* tests/test-c-ctype.c (test_all): Port c_isascii test to EBCDIC.
Add a test to count the number of ASCII characters.
---
 ChangeLog            | 6 ++++++
 tests/test-c-ctype.c | 8 +++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 7f7910b..493c915 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2015-09-23  Paul Eggert  <eggert@cs.ucla.edu>
+
+	c-ctype: improve c_isascii testing
+	* tests/test-c-ctype.c (test_all): Port c_isascii test to EBCDIC.
+	Add a test to count the number of ASCII characters.
+
 2015-09-22  Paul Eggert  <eggert@cs.ucla.edu>
 
 	savewd: remove SAVEWD_CHDIR_READABLE
diff --git a/tests/test-c-ctype.c b/tests/test-c-ctype.c
index 63d0af9..80eb69d 100644
--- a/tests/test-c-ctype.c
+++ b/tests/test-c-ctype.c
@@ -37,6 +37,7 @@ static void
 test_all (void)
 {
   int c;
+  int n_isascii = 0;
 
   for (c = -0x80; c < 0x100; c++)
     {
@@ -59,7 +60,10 @@ test_all (void)
           ASSERT (to_char (c_toupper (c)) == to_char (c_toupper (c + 0x100)));
         }
 
-      ASSERT (c_isascii (c) == (c >= 0 && c < 0x80));
+      if (0 <= c)
+        n_isascii += c_isascii (c);
+
+      ASSERT (c_isascii (c) == (c_isprint (c) || c_iscntrl (c)));
 
       ASSERT (c_isalnum (c) == (c_isalpha (c) || c_isdigit (c)));
 
@@ -383,6 +387,8 @@ test_all (void)
           break;
         }
     }
+
+  ASSERT (n_isascii == 128);
 }
 
 int
-- 
2.1.0



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-23  6:58           ` Daniel Richard G.
  2015-09-23 19:05             ` Paul Eggert
@ 2015-09-23 19:29             ` Paul Eggert
  2015-09-23 21:57               ` Daniel Richard G.
  1 sibling, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2015-09-23 19:29 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 543 bytes --]

On 09/22/2015 11:58 PM, Daniel Richard G. wrote:
> What about having a check in test-c-ctype that compares c_iscntrl() with
> its system counterpart? If the assumption is that alternate EBCDIC
> encodings used with Gnulib will agree with EBCDIC-1047 on these
> characters, then that should be checked.

Good idea.  Done in the attached patch.

> Also, perhaps, that any character for which c_iscntrl() is true should
> return false from most of the other functions...

That's already tested by "ASSERT (! (c_iscntrl (c) && c_isprint (c)));".


[-- Attachment #2: 0001-Test-that-c_iscntrl-agrees-with-iscntrl-etc.patch --]
[-- Type: text/x-patch, Size: 5384 bytes --]

From 54237dcc0ce907673513e4812a1bb270d54737fb Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 23 Sep 2015 12:26:38 -0700
Subject: [PATCH] Test that c_iscntrl agrees with iscntrl, etc.

Suggested by Daniel Richard G. in:
http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00034.html
* modules/c-ctype-tests (Depends-on): Add ctype.
* tests/test-c-ctype.c: Include <ctype.h>.
(NCHARS): New constant.
(test_agree_with_C_locale): New function.
(main): Use it.
(test_all): Use named constants.
---
 ChangeLog             | 10 ++++++++
 modules/c-ctype-tests |  2 +-
 tests/test-c-ctype.c  | 65 ++++++++++++++++++++++++++++++++++++++-------------
 3 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 493c915..7eca2c4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,15 @@
 2015-09-23  Paul Eggert  <eggert@cs.ucla.edu>
 
+	Test that c_iscntrl agrees with iscntrl, etc.
+	Suggested by Daniel Richard G. in:
+	http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00034.html
+	* modules/c-ctype-tests (Depends-on): Add ctype.
+	* tests/test-c-ctype.c: Include <ctype.h>.
+	(NCHARS): New constant.
+	(test_agree_with_C_locale): New function.
+	(main): Use it.
+	(test_all): Use named constants.
+
 	c-ctype: improve c_isascii testing
 	* tests/test-c-ctype.c (test_all): Port c_isascii test to EBCDIC.
 	Add a test to count the number of ASCII characters.
diff --git a/modules/c-ctype-tests b/modules/c-ctype-tests
index 196f529..cb65ee3 100644
--- a/modules/c-ctype-tests
+++ b/modules/c-ctype-tests
@@ -3,10 +3,10 @@ tests/test-c-ctype.c
 tests/macros.h
 
 Depends-on:
+ctype
 
 configure.ac:
 
 Makefile.am:
 TESTS += test-c-ctype
 check_PROGRAMS += test-c-ctype
-
diff --git a/tests/test-c-ctype.c b/tests/test-c-ctype.c
index 80eb69d..481cbbb 100644
--- a/tests/test-c-ctype.c
+++ b/tests/test-c-ctype.c
@@ -20,11 +20,14 @@
 
 #include "c-ctype.h"
 
+#include <ctype.h>
 #include <limits.h>
 #include <locale.h>
 
 #include "macros.h"
 
+enum { NCHARS = UCHAR_MAX + 1 };
+
 static char
 to_char (int c)
 {
@@ -34,30 +37,58 @@ to_char (int c)
 }
 
 static void
+test_agree_with_C_locale (void)
+{
+  int c;
+
+  for (c = 0; c <= UCHAR_MAX; c++)
+    {
+      ASSERT (c_isascii (c) == (isascii (c) != 0));
+      if (c_isascii (c))
+        {
+          ASSERT (c_isalnum (c) == (isalnum (c) != 0));
+          ASSERT (c_isalpha (c) == (isalpha (c) != 0));
+          ASSERT (c_isblank (c) == (isblank (c) != 0));
+          ASSERT (c_iscntrl (c) == (iscntrl (c) != 0));
+          ASSERT (c_isdigit (c) == (isdigit (c) != 0));
+          ASSERT (c_islower (c) == (islower (c) != 0));
+          ASSERT (c_isgraph (c) == (isgraph (c) != 0));
+          ASSERT (c_isprint (c) == (isprint (c) != 0));
+          ASSERT (c_ispunct (c) == (ispunct (c) != 0));
+          ASSERT (c_isspace (c) == (isspace (c) != 0));
+          ASSERT (c_isupper (c) == (isupper (c) != 0));
+          ASSERT (c_isxdigit (c) == (isxdigit (c) != 0));
+          ASSERT (c_tolower (c) == tolower (c));
+          ASSERT (c_toupper (c) == toupper (c));
+        }
+    }
+}
+
+static void
 test_all (void)
 {
   int c;
   int n_isascii = 0;
 
-  for (c = -0x80; c < 0x100; c++)
+  for (c = SCHAR_MIN; c <= UCHAR_MAX; c++)
     {
       if (c < 0)
         {
-          ASSERT (c_isascii (c) == c_isascii (c + 0x100));
-          ASSERT (c_isalnum (c) == c_isalnum (c + 0x100));
-          ASSERT (c_isalpha (c) == c_isalpha (c + 0x100));
-          ASSERT (c_isblank (c) == c_isblank (c + 0x100));
-          ASSERT (c_iscntrl (c) == c_iscntrl (c + 0x100));
-          ASSERT (c_isdigit (c) == c_isdigit (c + 0x100));
-          ASSERT (c_islower (c) == c_islower (c + 0x100));
-          ASSERT (c_isgraph (c) == c_isgraph (c + 0x100));
-          ASSERT (c_isprint (c) == c_isprint (c + 0x100));
-          ASSERT (c_ispunct (c) == c_ispunct (c + 0x100));
-          ASSERT (c_isspace (c) == c_isspace (c + 0x100));
-          ASSERT (c_isupper (c) == c_isupper (c + 0x100));
-          ASSERT (c_isxdigit (c) == c_isxdigit (c + 0x100));
-          ASSERT (to_char (c_tolower (c)) == to_char (c_tolower (c + 0x100)));
-          ASSERT (to_char (c_toupper (c)) == to_char (c_toupper (c + 0x100)));
+          ASSERT (c_isascii (c) == c_isascii (c + NCHARS));
+          ASSERT (c_isalnum (c) == c_isalnum (c + NCHARS));
+          ASSERT (c_isalpha (c) == c_isalpha (c + NCHARS));
+          ASSERT (c_isblank (c) == c_isblank (c + NCHARS));
+          ASSERT (c_iscntrl (c) == c_iscntrl (c + NCHARS));
+          ASSERT (c_isdigit (c) == c_isdigit (c + NCHARS));
+          ASSERT (c_islower (c) == c_islower (c + NCHARS));
+          ASSERT (c_isgraph (c) == c_isgraph (c + NCHARS));
+          ASSERT (c_isprint (c) == c_isprint (c + NCHARS));
+          ASSERT (c_ispunct (c) == c_ispunct (c + NCHARS));
+          ASSERT (c_isspace (c) == c_isspace (c + NCHARS));
+          ASSERT (c_isupper (c) == c_isupper (c + NCHARS));
+          ASSERT (c_isxdigit (c) == c_isxdigit (c + NCHARS));
+          ASSERT (to_char (c_tolower (c)) == to_char (c_tolower (c + NCHARS)));
+          ASSERT (to_char (c_toupper (c)) == to_char (c_toupper (c + NCHARS)));
         }
 
       if (0 <= c)
@@ -394,6 +425,8 @@ test_all (void)
 int
 main ()
 {
+  test_agree_with_C_locale ();
+
   test_all ();
 
   setlocale (LC_ALL, "de_DE");
-- 
2.1.0



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-23 19:29             ` Paul Eggert
@ 2015-09-23 21:57               ` Daniel Richard G.
  2015-09-25  7:29                 ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-23 21:57 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

Hi Paul,

I tested your changes in git a406de9c. A handful of fixes are needed:

* c_isascii(): Add \x07 (DEL) as an ASCII character.

* c_isascii(): Drop \xFF (EO), as this is not ASCII.

* c_iscntrl(): Add \x07 (DEL) as a control character.

* c_iscntrl(): Drop \xFF (EO), as apparently this is not a control
  character.

* c_tolower(): In order to agree with tolower(), it needs to return the
  _unsigned_ promoted form of the character. (Returning identity in the
  default: case seems fine.)

* c_toupper(): Likewise.

* test-c-ctype.c: test_all(): As a result of the preceding two changes,
  this is needed:

  -          ASSERT (c_tolower (c) == 'a');
  +          ASSERT (to_char (c_tolower (c)) == 'a');

  -          ASSERT (c_toupper (c) == 'A');
  +          ASSERT (to_char (c_toupper (c)) == 'A');

Signed characters are a real PITA in EBCDIC. Even something like

    wchar_t buf[] = { 'a', 'b', 'c', '\0' };

doesn't work properly in that case.
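
A minimal sketch of why that initializer goes wrong, assuming 'a' has the
EBCDIC code 0x81 and char is signed. The code point is hard-coded so the
sketch behaves the same on an ASCII host; the names EBCDIC_LOWER_A,
widen_as_signed_char and widen_as_unsigned_char are made up for
illustration and are not part of Gnulib:

```c
#include <wchar.h>

/* Code for 'a' in EBCDIC (0x81 == 129), hard-coded for illustration.  */
enum { EBCDIC_LOWER_A = 0x81 };

/* Models what a character constant yields when char is signed and the
   top bit is set: the value goes through a signed char first, so it is
   negative on two's-complement machines before it is widened.  */
wchar_t
widen_as_signed_char (int code)
{
  signed char sc = (signed char) code;
  return (wchar_t) sc;
}

/* Models the unsigned-char case: the code point survives widening.  */
wchar_t
widen_as_unsigned_char (int code)
{
  return (wchar_t) (unsigned char) code;
}
```

With unsigned chars the wide character keeps the value 0x81; with signed
chars it ends up as some other value, so the wchar_t array no longer holds
the intended code points.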

--Daniel

On Wed, 2015 Sep 23 12:29-0700, Paul Eggert wrote:
> On 09/22/2015 11:58 PM, Daniel Richard G. wrote:
> >
> > What about having a check in test-c-ctype that compares c_iscntrl()
> > with its system counterpart? If the assumption is that alternate
> > EBCDIC encodings used with Gnulib will agree with EBCDIC-1047 on
> > these characters, then that should be checked.
> 
> Good idea.  Done in the attached patch.

-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-23 21:57               ` Daniel Richard G.
@ 2015-09-25  7:29                 ` Paul Eggert
  2015-09-26  0:25                   ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2015-09-25  7:29 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

Thanks for checking it.  On further thought, I'd rather that we went to inline 
functions, as that would have made ironing out all these glitches easier, and 
anyway inline functions are typically the way to go for this sort of thing 
nowadays.  I installed a further patch to do that (see URL below); it should 
also fix the c-ctype bugs you mentioned.

http://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=43a090ce05f7046457be302ae4a17e83351968b0


* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-25  7:29                 ` Paul Eggert
@ 2015-09-26  0:25                   ` Daniel Richard G.
  2015-09-26  2:49                     ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-26  0:25 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

Hi Paul,

On Fri, 2015 Sep 25 00:29-0700, Paul Eggert wrote:
> Thanks for checking it.  On further thought, I'd rather that we went
> to inline functions, as that would have made ironing out all these
> glitches easier, and anyway inline functions are typically the way to
> go for this sort of thing nowadays.  I installed a further patch to do
> that (see URL below); it should also fix the c-ctype bugs you
> mentioned.
> 
> http://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=43a090ce05f7046457be302ae4a17e83351968b0

When I run test-c-ctype with unsigned chars, a number of assertions trip
starting at c == -127. (-127 + NCHARS == 129 == 'a'). Here is the
complete list for that value, after removing the abort() from ASSERT():

    .../test-c-ctype.c:82: assertion 'c_isascii (c) == c_isascii (c + NCHARS)' failed
    .../test-c-ctype.c:83: assertion 'c_isalnum (c) == c_isalnum (c + NCHARS)' failed
    .../test-c-ctype.c:84: assertion 'c_isalpha (c) == c_isalpha (c + NCHARS)' failed
    .../test-c-ctype.c:88: assertion 'c_islower (c) == c_islower (c + NCHARS)' failed
    .../test-c-ctype.c:89: assertion 'c_isgraph (c) == c_isgraph (c + NCHARS)' failed
    .../test-c-ctype.c:90: assertion 'c_isprint (c) == c_isprint (c + NCHARS)' failed
    .../test-c-ctype.c:94: assertion 'c_isxdigit (c) == c_isxdigit (c + NCHARS)' failed
    .../test-c-ctype.c:96: assertion 'to_char (c_toupper (c)) == to_char (c_toupper (c + NCHARS))' failed
    .../test-c-ctype.c:142: assertion 'c_islower (c) == 1' failed
    .../test-c-ctype.c:203: assertion 'c_isxdigit (c) == 1' failed
    .../test-c-ctype.c:243: assertion 'to_char (c_toupper (c)) == 'A'' failed

    (line numbers will have minor deltas due to printf() debugging)

The way the c_isxxxxx() functions are written now makes it a little
difficult for me to determine what's going on, but it should be
clearer to you.
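
For reference, the arithmetic behind c == -127 can be sketched as follows,
assuming 8-bit chars (UCHAR_MAX == 255). NCHARS is the constant from
test-c-ctype.c; to_unsigned_code is a hypothetical helper used only for
this illustration:

```c
#include <limits.h>

/* Same definition as in tests/test-c-ctype.c at this point in the thread.  */
enum { NCHARS = UCHAR_MAX + 1 };

/* Hypothetical helper: map a negative char value to the unsigned char
   code that the "c + NCHARS" side of each assertion refers to.  */
int
to_unsigned_code (int c)
{
  return c < 0 ? c + NCHARS : c;
}
```

With 8-bit chars, to_unsigned_code (-127) is 129, i.e. 0x81, which is
'a' in EBCDIC.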

When I run the test with signed chars, there are only a couple failures,
and they represent an odd corner case of EBCDIC.

So in z/OS, '\n' == 0x15, and that is the normal end-of-line marker:

    $ echo x | od -t x1
    0000000000    A7  15
    0000000002

According to

    https://www-304.ibm.com/support/knowledgecenter/SSLTBW_2.1.0/com.ibm.zos.v2r1.bpxbd00/risasc.htm?lang=en

ISO 8859-1 codepoint 0x0A (LF) corresponds to IBM-1047 codepoint
0x15 (NL/newline).

IBM-1047 does contain LF, at 0x25. But per IBM, that does not map to
anything in ISO 8859-1.

(IANA disagrees, of course: EBCDIC 0x15 == U+0085 and
EBCDIC 0x25 == U+000A. But that does you little good in z/OS.)

What's more, all the system isxxxxx() functions---including isascii(),
iscntrl() and isspace()---return false for 0x25.

There is probably some ancient history behind the NL<->LF mapping,
seeing as EBCDIC has both characters and ASCII only has the latter. My
hypothesis is that UNIX decided to "emulate" NL using LF, and as UNIX
became popular and linefeeds became standardized as an end-of-line
marker, IBM figured it made more sense to map it to NL (as a functional
equivalent) than to LF (as a pedantically-correct translation).

EBCDIC LF being classified as neither control nor space looks dodgy. But as
it appears that all control and space characters are also isascii()
characters, I suspect IBM for whatever reason did not want to have a
codepoint that would be an exception to that rule.

So to make a long story short: After I add \x15 and remove \x25 to/from
_C_CTYPE_CNTRL for EBCDIC, the test passes in the signed-char case.

--Daniel

-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-26  0:25                   ` Daniel Richard G.
@ 2015-09-26  2:49                     ` Paul Eggert
  2015-09-26  4:39                       ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2015-09-26  2:49 UTC (permalink / raw)
  To: Daniel Richard G., bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 432 bytes --]

Daniel Richard G. wrote:
> So to make a long story short: After I add \x15 and remove \x25 to/from
> _C_CTYPE_CNTRL for EBCDIC, the test passes in the signed-char case.

Thanks.  Given all that history, let's rewrite it so that the compiler can decide 
what '\n' maps to; that way it'll work even in EBCDIC environments that agree 
with IANA instead of IBM.  I installed the attached patch, which should fix the 
bugs you mentioned.
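
The idea of letting the compiler pick the code points can be sketched
roughly like this. It is a simplified stand-in for the _C_CTYPE_CNTRL
cases in the attached patch, not the actual Gnulib code, and the name
sketch_is_escape_cntrl is hypothetical:

```c
#include <stdbool.h>

/* Return true if C is one of the control characters that have a
   backslash-letter escape in C.  The compiler substitutes the code
   points of its own execution character set, so no ASCII or EBCDIC
   numbers are hard-coded here.  */
bool
sketch_is_escape_cntrl (int c)
{
  switch (c)
    {
    case '\a': case '\b': case '\f': case '\n':
    case '\r': case '\t': case '\v':
      return true;
    default:
      return false;
    }
}
```

Whether '\n' is 0x0A, 0x15 or 0x25 then depends only on the compiler;
the remaining control characters still need a per-encoding list.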


[-- Attachment #2: 0001-c-ctype-port-better-to-z-OS-EBCDIC.patch --]
[-- Type: text/plain, Size: 4416 bytes --]

From b3807b62cc5e4e06a74c69665cb171ef51b40567 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 25 Sep 2015 19:45:59 -0700
Subject: [PATCH] c-ctype: port better to z/OS EBCDIC

Problems reported by Daniel Richard G. in:
http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00050.html
* lib/c-ctype.h (_C_CTYPE_CNTRL): Rewrite in terms of
the C standard escapes and _C_CTYPE_OTHER_CNTRL.
(_C_CTYPE_OTHER_CNTRL): New macro.
* tests/test-c-ctype.c (test_all): Test from CHAR_MIN, not
from SCHAR_MIN, as the functions are defined only from values
promoted from char or from unsigned char, not necessarily from
signed char.
---
 ChangeLog            | 13 +++++++++++++
 lib/c-ctype.h        | 41 +++++++++++++++++++++++------------------
 tests/test-c-ctype.c |  2 +-
 3 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 3b3f101..a347908 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,16 @@
+2015-09-25  Paul Eggert  <eggert@cs.ucla.edu>
+
+	c-ctype: port better to z/OS EBCDIC
+	Problems reported by Daniel Richard G. in:
+	http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00050.html
+	* lib/c-ctype.h (_C_CTYPE_CNTRL): Rewrite in terms of
+	the C standard escapes and _C_CTYPE_OTHER_CNTRL.
+	(_C_CTYPE_OTHER_CNTRL): New macro.
+	* tests/test-c-ctype.c (test_all): Test from CHAR_MIN, not
+	from SCHAR_MIN, as the functions are defined only from values
+	promoted from char or from unsigned char, not necessarily from
+	signed char.
+
 2015-09-25  Pavel Raiskup  <praiskup@redhat.com>
 
 	gnulib-common.m4: fix gl_PROG_AR_RANLIB/AM_PROG_AR clash
diff --git a/lib/c-ctype.h b/lib/c-ctype.h
index 1292fc8..88e001f 100644
--- a/lib/c-ctype.h
+++ b/lib/c-ctype.h
@@ -80,30 +80,35 @@ extern "C" {
 
 #define _C_CTYPE_SIGNED_EBCDIC ('A' < 0)
 
+/* Cases for control characters.  */
+
+#define _C_CTYPE_CNTRL \
+   case '\a': case '\b': case '\f': case '\n': \
+   case '\r': case '\t': case '\v': \
+   _C_CTYPE_OTHER_CNTRL
+
+/* ASCII control characters other than those with \-letter escapes.  */
+
 #if C_CTYPE_ASCII
-# define _C_CTYPE_CNTRL \
+# define _C_CTYPE_OTHER_CNTRL \
     case '\x00': case '\x01': case '\x02': case '\x03': \
-    case '\x04': case '\x05': case '\x06': case '\x07': \
-    case '\x08': case '\x09': case '\x0a': case '\x0b': \
-    case '\x0c': case '\x0d': case '\x0e': case '\x0f': \
-    case '\x10': case '\x11': case '\x12': case '\x13': \
-    case '\x14': case '\x15': case '\x16': case '\x17': \
-    case '\x18': case '\x19': case '\x1a': case '\x1b': \
-    case '\x1c': case '\x1d': case '\x1e': case '\x1f': \
-    case '\x7f'
+    case '\x04': case '\x05': case '\x06': case '\x0e': \
+    case '\x0f': case '\x10': case '\x11': case '\x12': \
+    case '\x13': case '\x14': case '\x15': case '\x16': \
+    case '\x17': case '\x18': case '\x19': case '\x1a': \
+    case '\x1b': case '\x1c': case '\x1d': case '\x1e': \
+    case '\x1f': case '\x7f'
 #else
    /* Use EBCDIC code page 1047's assignments for ASCII control chars;
       assume all EBCDIC code pages agree about these assignments.  */
-# define _C_CTYPE_CNTRL \
+# define _C_CTYPE_OTHER_CNTRL \
     case '\x00': case '\x01': case '\x02': case '\x03': \
-    case '\x05': case '\x07': case '\x0b': case '\x0c': \
-    case '\x0d': case '\x0e': case '\x0f': case '\x10': \
-    case '\x11': case '\x12': case '\x13': case '\x16': \
-    case '\x18': case '\x19': case '\x1c': case '\x1d': \
-    case '\x1e': case '\x1f': case '\x25': case '\x26': \
-    case '\x27': case '\x2d': case '\x2e': case '\x2f': \
-    case '\x32': case '\x37': case '\x3c': case '\x3d': \
-    case '\x3f'
+    case '\x07': case '\x0e': case '\x0f': case '\x10': \
+    case '\x11': case '\x12': case '\x13': case '\x18': \
+    case '\x19': case '\x1c': case '\x1d': case '\x1e': \
+    case '\x1f': case '\x26': case '\x27': case '\x2d': \
+    case '\x2e': case '\x32': case '\x37': case '\x3c': \
+    case '\x3d': case '\x3f'
 #endif
 
 /* Cases for hex letter digits, digits, lower, and upper, offset by N.  */
diff --git a/tests/test-c-ctype.c b/tests/test-c-ctype.c
index d25dc03..544adeb 100644
--- a/tests/test-c-ctype.c
+++ b/tests/test-c-ctype.c
@@ -70,7 +70,7 @@ test_all (void)
   int c;
   int n_isascii = 0;
 
-  for (c = SCHAR_MIN; c <= UCHAR_MAX; c++)
+  for (c = CHAR_MIN; c <= UCHAR_MAX; c++)
     {
       if (c < 0)
         {
-- 
2.1.0



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-26  2:49                     ` Paul Eggert
@ 2015-09-26  4:39                       ` Daniel Richard G.
  2015-09-26 16:08                         ` Ben Pfaff
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-26  4:39 UTC (permalink / raw)
  To: Paul Eggert, bug-gnulib

On Fri, 2015 Sep 25 19:49-0700, Paul Eggert wrote:
>
> Thanks. Given all that history, let's rewrite it so that the compiler
> can decide what '\n' maps to; that way it'll work even in EBCDIC
> environments that agree with IANA instead of IBM.  I installed the
> attached patch, which should fix the bugs you mentioned.

I'm happy to report that test-c-ctype in Git ff1ef114 now passes with
both signed and unsigned EBCDIC chars on z/OS. Thank you for chasing
this down!

I will investigate further some of the issues that have been uncovered
in this thread, and return to this list with my findings in a few days.


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-26  4:39                       ` Daniel Richard G.
@ 2015-09-26 16:08                         ` Ben Pfaff
  2015-09-27  6:31                           ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Ben Pfaff @ 2015-09-26 16:08 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: Paul Eggert, bug-gnulib

On Sat, Sep 26, 2015 at 12:39:52AM -0400, Daniel Richard G. wrote:
> I'm happy to report that test-c-ctype in Git ff1ef114 now passes with
> both signed and unsigned EBCDIC chars on z/OS. Thank you for chasing
> this down!

A "char" configured as signed in EBCDIC violates the ANSI C standard,
which says:

     If a member of the basic execution character set is stored in a
     char object, its value is guaranteed to be positive.

where the "basic execution character set" is defined as:

     Both the basic source and basic execution character sets shall have
     the following members: the 26 uppercase letters of the Latin
     alphabet

             A B C D E F G H I J K L M
             N O P Q R S T U V W X Y Z
     the 26 lowercase letters of the Latin alphabet
             a b c d e f g h i j k l m
             n o p q r s t u v w x y z
     the 10 decimal digits
             0 1 2 3 4 5 6 7 8 9
     the following 29 graphic characters
             ! " # % & ' ( ) * + , - . / :
             ; < = > ? [ \ ] ^ _ { | } ~
     the space character, and control characters representing horizontal
     tab, vertical tab, and form feed.

Do people actually use signed "char" with EBCDIC?



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-26 16:08                         ` Ben Pfaff
@ 2015-09-27  6:31                           ` Daniel Richard G.
  2015-09-27  6:59                             ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-27  6:31 UTC (permalink / raw)
  To: Ben Pfaff; +Cc: Paul Eggert, bug-gnulib

On Sat, 2015 Sep 26 09:08-0700, Ben Pfaff wrote:
> 
> A "char" configured as signed in EBCDIC violates the ANSI C standard,
> which says:
>
>      If a member of the basic execution character set is stored in a
>      char object, its value is guaranteed to be positive.

Now _that's_ a welcome bit of clarity.

While (IMO) it is reasonable to support the oddball case of negative
basic chars where the logic can be centralized, there are numerous
instances where problems arise in common C idioms that are less cleanly
addressable.

Examples that I've found so far in Gnulib include

    if (getc(f) == 'x') { ... }

    wchar_t buf[] = { 'a', 'b', 'c', '\0' };
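
A rough sketch of the getc() case, assuming 'x' has the EBCDIC code
0xA7 (as in the od output earlier in this thread) and char is signed.
The code point is hard-coded so the sketch can be tried on an ASCII
host, and all names below are hypothetical:

```c
#include <stdbool.h>

/* Code for 'x' in EBCDIC (0xA7 == 167), hard-coded for illustration.  */
enum { EBCDIC_LOWER_X = 0xA7 };

/* getc returns the byte as an unsigned char converted to int, so the
   result is never negative (apart from EOF).  */
int
as_getc_result (int byte)
{
  return (unsigned char) byte;
}

/* Models a character constant when char is signed: the value is that of
   a char converted to int, so it is negative when the top bit is set
   (on two's-complement machines).  */
int
as_signed_char_constant (int byte)
{
  return (signed char) byte;
}

/* Models "getc (f) == 'x'" for a given code point.  */
bool
getc_matches_constant (int byte)
{
  return as_getc_result (byte) == as_signed_char_constant (byte);
}
```

For 0xA7 the two sides are 167 and -89, so the comparison is false even
though the stream really contains an 'x'. For code points below 0x80
both sides agree.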

> Do people actually use signed "char" with EBCDIC?

It's certainly not the default, but given the sort of history and
longevity that surround many mainframe installations, I wouldn't
be surprised if some folks do. Not that the xlc man page gives
any hint why:

     -qchars={signed|unsigned}
            Determines whether all variables of type char are
            treated as either signed or unsigned.
            The default is -qchars=unsigned.


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-27  6:31                           ` Daniel Richard G.
@ 2015-09-27  6:59                             ` Paul Eggert
  2015-09-28  2:09                               ` Daniel Richard G.
  2015-10-15  4:49                               ` Daniel Richard G.
  0 siblings, 2 replies; 49+ messages in thread
From: Paul Eggert @ 2015-09-27  6:59 UTC (permalink / raw)
  To: Daniel Richard G., Ben Pfaff; +Cc: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 691 bytes --]

Daniel Richard G. wrote:
> It's certainly not the default, but given the sort of history and
> longevity that surround many mainframe installations, I wouldn't
> be surprised if some folks do.

Given all the problems mentioned (including some in the proposed patches), let's 
give up on trying to support any such folks.  If they want to build gnulib-using 
software on z/OS, they'll have to build with the default configuration in which 
char is unsigned.  It wouldn't be practical for us to try to support char being 
signed when standard chars have the top bit set.  With that in mind I installed 
the attached further patch, which simplifies the recent changes to c-ctype quite 
a bit.
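
The attached patch does the check with "#if 'A' < 0". As a
complementary sketch only (not what Gnulib does), the same assumption
could be checked on execution-time values with a C11 static assertion
plus a run-time loop over the basic character set quoted by Ben. This
sidesteps the point that the value of a character constant inside #if
is implementation-defined and need not match the execution character
set. The names basic_chars and basic_chars_are_nonnegative are
hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Compile-time spot check on a few basic characters (requires C11).  */
_Static_assert ('A' >= 0 && 'a' >= 0 && '0' >= 0 && ' ' >= 0,
                "basic execution characters must be nonnegative");

/* Letters, digits, the 29 graphic characters, and the space, tab,
   vertical tab, and form feed listed in the standard.  */
static const char basic_chars[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  "abcdefghijklmnopqrstuvwxyz"
  "0123456789"
  "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~"
  " \t\v\f";

/* Return true if every basic character is nonnegative when stored in
   a plain char on this implementation.  */
bool
basic_chars_are_nonnegative (void)
{
  for (size_t i = 0; i < sizeof basic_chars - 1; i++)
    if (basic_chars[i] < 0)
      return false;
  return true;
}
```

On an EBCDIC build with signed chars the static assertion would be
expected to fail at compile time, which is the same outcome the #error
in the patch aims for.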


[-- Attachment #2: 0001-c-ctype-do-not-worry-about-EBCDIC-char-signed.patch --]
[-- Type: text/plain, Size: 22483 bytes --]

From d25768c6961d9d94492c328035a35ceb140dec6f Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 26 Sep 2015 23:55:07 -0700
Subject: [PATCH] c-ctype: do not worry about EBCDIC + char signed

Drop support for EBCDIC with char being signed, as this breaks too
many programs.  Problem reported by Ben Pfaff in:
http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00053.html
* lib/c-ctype.h: Verify that we are not using EBCDIC with
char being signed.
(_C_CTYPE_LOWER_A_THRU_F_N): New macro.
(_C_CTYPE_LOWER_N, _C_CTYPE_A_THRU_F): Use it.
(_C_CTYPE_DIGIT, _C_CTYPE_LOWER, _C_CTYPE_PUNCT, _C_CTYPE_UPPER):
(c_isascii, c_isgraph, c_isprint, c_ispunct, c_tolower, c_toupper):
* tests/test-c-ctype.c (test_all):
Simplify by assuming standard char values cannot be negative.
* tests/test-c-ctype.c (NCHARS, to_char): Remove; all uses removed.
---
 ChangeLog            |  16 ++
 lib/c-ctype.h        | 490 ++++-----------------------------------------------
 tests/test-c-ctype.c | 137 +++++---------
 3 files changed, 88 insertions(+), 555 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index a347908..1584e29 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,19 @@
+2015-09-26  Paul Eggert  <eggert@cs.ucla.edu>
+
+	c-ctype: do not worry about EBCDIC + char signed
+	Drop support for EBCDIC with char being signed, as this breaks too
+	many programs.  Problem reported by Ben Pfaff in:
+	http://lists.gnu.org/archive/html/bug-gnulib/2015-09/msg00053.html
+	* lib/c-ctype.h: Verify that we are not using EBCDIC with
+	char being signed.
+	(_C_CTYPE_LOWER_A_THRU_F_N): New macro.
+	(_C_CTYPE_LOWER_N, _C_CTYPE_A_THRU_F): Use it.
+	(_C_CTYPE_DIGIT, _C_CTYPE_LOWER, _C_CTYPE_PUNCT, _C_CTYPE_UPPER):
+	(c_isascii, c_isgraph, c_isprint, c_ispunct, c_tolower, c_toupper):
+	* tests/test-c-ctype.c (test_all):
+	Simplify by assuming standard char values cannot be negative.
+	* tests/test-c-ctype.c (NCHARS, to_char): Remove; all uses removed.
+
 2015-09-25  Paul Eggert  <eggert@cs.ucla.edu>
 
 	c-ctype: port better to z/OS EBCDIC
diff --git a/lib/c-ctype.h b/lib/c-ctype.h
index 88e001f..907e1e2 100644
--- a/lib/c-ctype.h
+++ b/lib/c-ctype.h
@@ -78,7 +78,9 @@ extern "C" {
 # error "Only ASCII and EBCDIC are supported"
 #endif
 
-#define _C_CTYPE_SIGNED_EBCDIC ('A' < 0)
+#if 'A' < 0
+# error "EBCDIC and char is signed -- not supported"
+#endif
 
 /* Cases for control characters.  */
 
@@ -111,54 +113,30 @@ extern "C" {
     case '\x3d': case '\x3f'
 #endif
 
-/* Cases for hex letter digits, digits, lower, and upper, offset by N.  */
+/* Cases for lowercase hex letters, and lowercase letters, all offset by N.  */
 
-#define _C_CTYPE_A_THRU_F_N(n) \
+#define _C_CTYPE_LOWER_A_THRU_F_N(n) \
    case 'a' + (n): case 'b' + (n): case 'c' + (n): case 'd' + (n): \
-   case 'e' + (n): case 'f' + (n): \
-   case 'A' + (n): case 'B' + (n): case 'C' + (n): case 'D' + (n): \
-   case 'E' + (n): case 'F' + (n)
-#define _C_CTYPE_DIGIT_N(n) \
-   case '0' + (n): case '1' + (n): case '2' + (n): case '3' + (n): \
-   case '4' + (n): case '5' + (n): case '6' + (n): case '7' + (n): \
-   case '8' + (n): case '9' + (n)
+   case 'e' + (n): case 'f' + (n)
 #define _C_CTYPE_LOWER_N(n) \
-   case 'a' + (n): case 'b' + (n): case 'c' + (n): case 'd' + (n): \
-   case 'e' + (n): case 'f' + (n): case 'g' + (n): case 'h' + (n): \
-   case 'i' + (n): case 'j' + (n): case 'k' + (n): case 'l' + (n): \
-   case 'm' + (n): case 'n' + (n): case 'o' + (n): case 'p' + (n): \
-   case 'q' + (n): case 'r' + (n): case 's' + (n): case 't' + (n): \
-   case 'u' + (n): case 'v' + (n): case 'w' + (n): case 'x' + (n): \
-   case 'y' + (n): case 'z' + (n)
-#define _C_CTYPE_UPPER_N(n) \
-   case 'A' + (n): case 'B' + (n): case 'C' + (n): case 'D' + (n): \
-   case 'E' + (n): case 'F' + (n): case 'G' + (n): case 'H' + (n): \
-   case 'I' + (n): case 'J' + (n): case 'K' + (n): case 'L' + (n): \
-   case 'M' + (n): case 'N' + (n): case 'O' + (n): case 'P' + (n): \
-   case 'Q' + (n): case 'R' + (n): case 'S' + (n): case 'T' + (n): \
-   case 'U' + (n): case 'V' + (n): case 'W' + (n): case 'X' + (n): \
-   case 'Y' + (n): case 'Z' + (n)
-
-/* Given MACRO_N, expand to all the cases for the corresponding class.  */
-#if _C_CTYPE_SIGNED_EBCDIC
-# define _C_CTYPE_CASES(macro_n) macro_n (0): macro_n (256)
-#else
-# define _C_CTYPE_CASES(macro_n) macro_n (0)
-#endif
-
-/* Cases for hex letter digits, digits, lower, and upper, with another
-   case for unsigned char if the original char is negative.  */
-
-#define _C_CTYPE_A_THRU_F _C_CTYPE_CASES (_C_CTYPE_A_THRU_F_N)
-#define _C_CTYPE_DIGIT _C_CTYPE_CASES (_C_CTYPE_DIGIT_N)
-#define _C_CTYPE_LOWER _C_CTYPE_CASES (_C_CTYPE_LOWER_N)
-#define _C_CTYPE_UPPER _C_CTYPE_CASES (_C_CTYPE_UPPER_N)
-
-/* The punct class differs because some punctuation characters may be
-   negative while others are nonnegative.  Instead of attempting to
-   define _C_CTYPE_PUNCT, define just the plain chars here, and do any
-   cases-plus-256 by hand after using this macro.  */
-#define _C_CTYPE_PUNCT_PLAIN \
+   _C_CTYPE_LOWER_A_THRU_F_N(n): \
+   case 'g' + (n): case 'h' + (n): case 'i' + (n): case 'j' + (n): \
+   case 'k' + (n): case 'l' + (n): case 'm' + (n): case 'n' + (n): \
+   case 'o' + (n): case 'p' + (n): case 'q' + (n): case 'r' + (n): \
+   case 's' + (n): case 't' + (n): case 'u' + (n): case 'v' + (n): \
+   case 'w' + (n): case 'x' + (n): case 'y' + (n): case 'z' + (n)
+
+/* Cases for hex letters, digits, lower, punct, and upper.  */
+
+#define _C_CTYPE_A_THRU_F \
+   _C_CTYPE_LOWER_A_THRU_F_N (0): \
+   _C_CTYPE_LOWER_A_THRU_F_N ('A' - 'a')
+#define _C_CTYPE_DIGIT                     \
+   case '0': case '1': case '2': case '3': \
+   case '4': case '5': case '6': case '7': \
+   case '8': case '9'
+#define _C_CTYPE_LOWER _C_CTYPE_LOWER_N (0)
+#define _C_CTYPE_PUNCT \
    case '!': case '"': case '#': case '$':  \
    case '%': case '&': case '\'': case '(': \
    case ')': case '*': case '+': case ',':  \
@@ -167,6 +145,8 @@ extern "C" {
    case '?': case '@': case '[': case '\\': \
    case ']': case '^': case '_': case '`':  \
    case '{': case '|': case '}': case '~'
+#define _C_CTYPE_UPPER _C_CTYPE_LOWER_N ('A' - 'a')
+
 
 /* Function definitions.  */
 
@@ -194,7 +174,6 @@ c_isalnum (int c)
     _C_CTYPE_LOWER:
     _C_CTYPE_UPPER:
       return true;
-
     default:
       return false;
     }
@@ -208,7 +187,6 @@ c_isalpha (int c)
     _C_CTYPE_LOWER:
     _C_CTYPE_UPPER:
       return true;
-
     default:
       return false;
     }
@@ -225,107 +203,9 @@ c_isascii (int c)
     _C_CTYPE_CNTRL:
     _C_CTYPE_DIGIT:
     _C_CTYPE_LOWER:
+    _C_CTYPE_PUNCT:
     _C_CTYPE_UPPER:
-
-    _C_CTYPE_PUNCT_PLAIN:
-#if '!' < 0
-    case '!' + 256:
-#endif
-#if '"' < 0
-    case '"' + 256:
-#endif
-#if '#' < 0
-    case '#' + 256:
-#endif
-#if '$' < 0
-    case '$' + 256:
-#endif
-#if '%' < 0
-    case '%' + 256:
-#endif
-#if '&' < 0
-    case '&' + 256:
-#endif
-#if '\'' < 0
-    case '\'' + 256:
-#endif
-#if '(' < 0
-    case '(' + 256:
-#endif
-#if ')' < 0
-    case ')' + 256:
-#endif
-#if '*' < 0
-    case '*' + 256:
-#endif
-#if '+' < 0
-    case '+' + 256:
-#endif
-#if ',' < 0
-    case ',' + 256:
-#endif
-#if '-' < 0
-    case '-' + 256:
-#endif
-#if '.' < 0
-    case '.' + 256:
-#endif
-#if '/' < 0
-    case '/' + 256:
-#endif
-#if ':' < 0
-    case ':' + 256:
-#endif
-#if ';' < 0
-    case ';' + 256:
-#endif
-#if '<' < 0
-    case '<' + 256:
-#endif
-#if '=' < 0
-    case '=' + 256:
-#endif
-#if '>' < 0
-    case '>' + 256:
-#endif
-#if '?' < 0
-    case '?' + 256:
-#endif
-#if '@' < 0
-    case '@' + 256:
-#endif
-#if '[' < 0
-    case '[' + 256:
-#endif
-#if '\\' < 0
-    case '\\' + 256:
-#endif
-#if ']' < 0
-    case ']' + 256:
-#endif
-#if '^' < 0
-    case '^' + 256:
-#endif
-#if '_' < 0
-    case '_' + 256:
-#endif
-#if '`' < 0
-    case '`' + 256:
-#endif
-#if '{' < 0
-    case '{' + 256:
-#endif
-#if '|' < 0
-    case '|' + 256:
-#endif
-#if '}' < 0
-    case '}' + 256:
-#endif
-#if '~' < 0
-    case '~' + 256:
-#endif
       return true;
-
     default:
       return false;
     }
@@ -368,107 +248,9 @@ c_isgraph (int c)
     {
     _C_CTYPE_DIGIT:
     _C_CTYPE_LOWER:
+    _C_CTYPE_PUNCT:
     _C_CTYPE_UPPER:
-
-    _C_CTYPE_PUNCT_PLAIN:
-#if '!' < 0
-    case '!' + 256:
-#endif
-#if '"' < 0
-    case '"' + 256:
-#endif
-#if '#' < 0
-    case '#' + 256:
-#endif
-#if '$' < 0
-    case '$' + 256:
-#endif
-#if '%' < 0
-    case '%' + 256:
-#endif
-#if '&' < 0
-    case '&' + 256:
-#endif
-#if '\'' < 0
-    case '\'' + 256:
-#endif
-#if '(' < 0
-    case '(' + 256:
-#endif
-#if ')' < 0
-    case ')' + 256:
-#endif
-#if '*' < 0
-    case '*' + 256:
-#endif
-#if '+' < 0
-    case '+' + 256:
-#endif
-#if ',' < 0
-    case ',' + 256:
-#endif
-#if '-' < 0
-    case '-' + 256:
-#endif
-#if '.' < 0
-    case '.' + 256:
-#endif
-#if '/' < 0
-    case '/' + 256:
-#endif
-#if ':' < 0
-    case ':' + 256:
-#endif
-#if ';' < 0
-    case ';' + 256:
-#endif
-#if '<' < 0
-    case '<' + 256:
-#endif
-#if '=' < 0
-    case '=' + 256:
-#endif
-#if '>' < 0
-    case '>' + 256:
-#endif
-#if '?' < 0
-    case '?' + 256:
-#endif
-#if '@' < 0
-    case '@' + 256:
-#endif
-#if '[' < 0
-    case '[' + 256:
-#endif
-#if '\\' < 0
-    case '\\' + 256:
-#endif
-#if ']' < 0
-    case ']' + 256:
-#endif
-#if '^' < 0
-    case '^' + 256:
-#endif
-#if '_' < 0
-    case '_' + 256:
-#endif
-#if '`' < 0
-    case '`' + 256:
-#endif
-#if '{' < 0
-    case '{' + 256:
-#endif
-#if '|' < 0
-    case '|' + 256:
-#endif
-#if '}' < 0
-    case '}' + 256:
-#endif
-#if '~' < 0
-    case '~' + 256:
-#endif
       return true;
-
     default:
       return false;
     }
@@ -494,107 +276,9 @@ c_isprint (int c)
     case ' ':
     _C_CTYPE_DIGIT:
     _C_CTYPE_LOWER:
+    _C_CTYPE_PUNCT:
     _C_CTYPE_UPPER:
-
-    _C_CTYPE_PUNCT_PLAIN:
-#if '!' < 0
-    case '!' + 256:
-#endif
-#if '"' < 0
-    case '"' + 256:
-#endif
-#if '#' < 0
-    case '#' + 256:
-#endif
-#if '$' < 0
-    case '$' + 256:
-#endif
-#if '%' < 0
-    case '%' + 256:
-#endif
-#if '&' < 0
-    case '&' + 256:
-#endif
-#if '\'' < 0
-    case '\'' + 256:
-#endif
-#if '(' < 0
-    case '(' + 256:
-#endif
-#if ')' < 0
-    case ')' + 256:
-#endif
-#if '*' < 0
-    case '*' + 256:
-#endif
-#if '+' < 0
-    case '+' + 256:
-#endif
-#if ',' < 0
-    case ',' + 256:
-#endif
-#if '-' < 0
-    case '-' + 256:
-#endif
-#if '.' < 0
-    case '.' + 256:
-#endif
-#if '/' < 0
-    case '/' + 256:
-#endif
-#if ':' < 0
-    case ':' + 256:
-#endif
-#if ';' < 0
-    case ';' + 256:
-#endif
-#if '<' < 0
-    case '<' + 256:
-#endif
-#if '=' < 0
-    case '=' + 256:
-#endif
-#if '>' < 0
-    case '>' + 256:
-#endif
-#if '?' < 0
-    case '?' + 256:
-#endif
-#if '@' < 0
-    case '@' + 256:
-#endif
-#if '[' < 0
-    case '[' + 256:
-#endif
-#if '\\' < 0
-    case '\\' + 256:
-#endif
-#if ']' < 0
-    case ']' + 256:
-#endif
-#if '^' < 0
-    case '^' + 256:
-#endif
-#if '_' < 0
-    case '_' + 256:
-#endif
-#if '`' < 0
-    case '`' + 256:
-#endif
-#if '{' < 0
-    case '{' + 256:
-#endif
-#if '|' < 0
-    case '|' + 256:
-#endif
-#if '}' < 0
-    case '}' + 256:
-#endif
-#if '~' < 0
-    case '~' + 256:
-#endif
       return true;
-
     default:
       return false;
     }
@@ -605,105 +289,8 @@ c_ispunct (int c)
 {
   switch (c)
     {
-    _C_CTYPE_PUNCT_PLAIN:
-#if '!' < 0
-    case '!' + 256:
-#endif
-#if '"' < 0
-    case '"' + 256:
-#endif
-#if '#' < 0
-    case '#' + 256:
-#endif
-#if '$' < 0
-    case '$' + 256:
-#endif
-#if '%' < 0
-    case '%' + 256:
-#endif
-#if '&' < 0
-    case '&' + 256:
-#endif
-#if '\'' < 0
-    case '\'' + 256:
-#endif
-#if '(' < 0
-    case '(' + 256:
-#endif
-#if ')' < 0
-    case ')' + 256:
-#endif
-#if '*' < 0
-    case '*' + 256:
-#endif
-#if '+' < 0
-    case '+' + 256:
-#endif
-#if ',' < 0
-    case ',' + 256:
-#endif
-#if '-' < 0
-    case '-' + 256:
-#endif
-#if '.' < 0
-    case '.' + 256:
-#endif
-#if '/' < 0
-    case '/' + 256:
-#endif
-#if ':' < 0
-    case ':' + 256:
-#endif
-#if ';' < 0
-    case ';' + 256:
-#endif
-#if '<' < 0
-    case '<' + 256:
-#endif
-#if '=' < 0
-    case '=' + 256:
-#endif
-#if '>' < 0
-    case '>' + 256:
-#endif
-#if '?' < 0
-    case '?' + 256:
-#endif
-#if '@' < 0
-    case '@' + 256:
-#endif
-#if '[' < 0
-    case '[' + 256:
-#endif
-#if '\\' < 0
-    case '\\' + 256:
-#endif
-#if ']' < 0
-    case ']' + 256:
-#endif
-#if '^' < 0
-    case '^' + 256:
-#endif
-#if '_' < 0
-    case '_' + 256:
-#endif
-#if '`' < 0
-    case '`' + 256:
-#endif
-#if '{' < 0
-    case '{' + 256:
-#endif
-#if '|' < 0
-    case '|' + 256:
-#endif
-#if '}' < 0
-    case '}' + 256:
-#endif
-#if '~' < 0
-    case '~' + 256:
-#endif
+    _C_CTYPE_PUNCT:
       return true;
-
     default:
       return false;
     }
@@ -741,7 +328,6 @@ c_isxdigit (int c)
     _C_CTYPE_DIGIT:
     _C_CTYPE_A_THRU_F:
       return true;
-
     default:
       return false;
     }
@@ -752,14 +338,8 @@ c_tolower (int c)
 {
   switch (c)
     {
-    _C_CTYPE_UPPER_N (0):
-#if _C_CTYPE_SIGNED_EBCDIC
-      c += 256;
-      /* Fall through.  */
-    _C_CTYPE_UPPER_N (256):
-#endif
+    _C_CTYPE_UPPER:
       return c - 'A' + 'a';
-
     default:
       return c;
     }
@@ -770,14 +350,8 @@ c_toupper (int c)
 {
   switch (c)
     {
-    _C_CTYPE_LOWER_N (0):
-#if _C_CTYPE_SIGNED_EBCDIC
-      c += 256;
-      /* Fall through.  */
-    _C_CTYPE_LOWER_N (256):
-#endif
+    _C_CTYPE_LOWER:
       return c - 'a' + 'A';
-
     default:
       return c;
     }
diff --git a/tests/test-c-ctype.c b/tests/test-c-ctype.c
index 544adeb..9780554 100644
--- a/tests/test-c-ctype.c
+++ b/tests/test-c-ctype.c
@@ -26,16 +26,6 @@
 
 #include "macros.h"
 
-enum { NCHARS = UCHAR_MAX + 1 };
-
-static char
-to_char (int c)
-{
-  if (CHAR_MIN < 0 && CHAR_MAX < c)
-    return c - CHAR_MAX - 1 + CHAR_MIN;
-  return c;
-}
-
 static void
 test_agree_with_C_locale (void)
 {
@@ -72,27 +62,30 @@ test_all (void)
 
   for (c = CHAR_MIN; c <= UCHAR_MAX; c++)
     {
-      if (c < 0)
+      if (! (0 <= c && c <= CHAR_MAX))
         {
-          ASSERT (c_isascii (c) == c_isascii (c + NCHARS));
-          ASSERT (c_isalnum (c) == c_isalnum (c + NCHARS));
-          ASSERT (c_isalpha (c) == c_isalpha (c + NCHARS));
-          ASSERT (c_isblank (c) == c_isblank (c + NCHARS));
-          ASSERT (c_iscntrl (c) == c_iscntrl (c + NCHARS));
-          ASSERT (c_isdigit (c) == c_isdigit (c + NCHARS));
-          ASSERT (c_islower (c) == c_islower (c + NCHARS));
-          ASSERT (c_isgraph (c) == c_isgraph (c + NCHARS));
-          ASSERT (c_isprint (c) == c_isprint (c + NCHARS));
-          ASSERT (c_ispunct (c) == c_ispunct (c + NCHARS));
-          ASSERT (c_isspace (c) == c_isspace (c + NCHARS));
-          ASSERT (c_isupper (c) == c_isupper (c + NCHARS));
-          ASSERT (c_isxdigit (c) == c_isxdigit (c + NCHARS));
-          ASSERT (to_char (c_tolower (c)) == to_char (c_tolower (c + NCHARS)));
-          ASSERT (to_char (c_toupper (c)) == to_char (c_toupper (c + NCHARS)));
+          ASSERT (! c_isascii (c));
+          ASSERT (! c_isalnum (c));
+          ASSERT (! c_isalpha (c));
+          ASSERT (! c_isblank (c));
+          ASSERT (! c_iscntrl (c));
+          ASSERT (! c_isdigit (c));
+          ASSERT (! c_islower (c));
+          ASSERT (! c_isgraph (c));
+          ASSERT (! c_isprint (c));
+          ASSERT (! c_ispunct (c));
+          ASSERT (! c_isspace (c));
+          ASSERT (! c_isupper (c));
+          ASSERT (! c_isxdigit (c));
+          ASSERT (c_tolower (c) == c);
+          ASSERT (c_toupper (c) == c);
         }
 
-      if (0 <= c)
-        n_isascii += c_isascii (c);
+      n_isascii += c_isascii (c);
+
+#ifdef C_CTYPE_ASCII
+      ASSERT (c_isascii (c) == (0 <= c && c <= 0x7f));
+#endif
 
       ASSERT (c_isascii (c) == (c_isprint (c) || c_iscntrl (c)));
 
@@ -100,7 +93,7 @@ test_all (void)
 
       ASSERT (c_isalpha (c) == (c_islower (c) || c_isupper (c)));
 
-      switch (to_char (c))
+      switch (c)
         {
         case '\t': case ' ':
           ASSERT (c_isblank (c) == 1);
@@ -114,9 +107,17 @@ test_all (void)
       ASSERT (c_iscntrl (c) == ((c >= 0 && c < 0x20) || c == 0x7f));
 #endif
 
+      switch (c)
+        {
+        case '\a': case '\b': case '\f': case '\n':
+        case '\r': case '\t': case '\v':
+          ASSERT (c_iscntrl (c));
+          break;
+        }
+
       ASSERT (! (c_iscntrl (c) && c_isprint (c)));
 
-      switch (to_char (c))
+      switch (c)
         {
         case '0': case '1': case '2': case '3': case '4': case '5':
         case '6': case '7': case '8': case '9':
@@ -127,7 +128,7 @@ test_all (void)
           break;
         }
 
-      switch (to_char (c))
+      switch (c)
         {
         case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
         case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
@@ -135,9 +136,11 @@ test_all (void)
         case 's': case 't': case 'u': case 'v': case 'w': case 'x':
         case 'y': case 'z':
           ASSERT (c_islower (c) == 1);
+          ASSERT (c_toupper (c) == c - 'a' + 'A');
           break;
         default:
           ASSERT (c_islower (c) == 0);
+          ASSERT (c_toupper (c) == c);
           break;
         }
 
@@ -151,7 +154,7 @@ test_all (void)
 
       ASSERT (c_isprint (c) == (c_isgraph (c) || c == ' '));
 
-      switch (to_char (c))
+      switch (c)
         {
         case '!': case '"': case '#': case '$': case '%': case '&': case '\'':
         case '(': case ')': case '*': case '+': case ',': case '-': case '.':
@@ -165,7 +168,7 @@ test_all (void)
           break;
         }
 
-      switch (to_char (c))
+      switch (c)
         {
         case ' ': case '\t': case '\n': case '\v': case '\f': case '\r':
           ASSERT (c_isspace (c) == 1);
@@ -175,7 +178,7 @@ test_all (void)
           break;
         }
 
-      switch (to_char (c))
+      switch (c)
         {
         case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
         case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
@@ -183,13 +186,15 @@ test_all (void)
         case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
         case 'Y': case 'Z':
           ASSERT (c_isupper (c) == 1);
+          ASSERT (c_tolower (c) == c - 'A' + 'a');
           break;
         default:
           ASSERT (c_isupper (c) == 0);
+          ASSERT (c_tolower (c) == c);
           break;
         }
 
-      switch (to_char (c))
+      switch (c)
         {
         case '0': case '1': case '2': case '3': case '4': case '5':
         case '6': case '7': case '8': case '9':
@@ -201,68 +206,6 @@ test_all (void)
           ASSERT (c_isxdigit (c) == 0);
           break;
         }
-
-      switch (to_char (c))
-        {
-        case 'A': ASSERT (to_char (c_tolower (c)) == 'a'); break;
-        case 'B': ASSERT (to_char (c_tolower (c)) == 'b'); break;
-        case 'C': ASSERT (to_char (c_tolower (c)) == 'c'); break;
-        case 'D': ASSERT (to_char (c_tolower (c)) == 'd'); break;
-        case 'E': ASSERT (to_char (c_tolower (c)) == 'e'); break;
-        case 'F': ASSERT (to_char (c_tolower (c)) == 'f'); break;
-        case 'G': ASSERT (to_char (c_tolower (c)) == 'g'); break;
-        case 'H': ASSERT (to_char (c_tolower (c)) == 'h'); break;
-        case 'I': ASSERT (to_char (c_tolower (c)) == 'i'); break;
-        case 'J': ASSERT (to_char (c_tolower (c)) == 'j'); break;
-        case 'K': ASSERT (to_char (c_tolower (c)) == 'k'); break;
-        case 'L': ASSERT (to_char (c_tolower (c)) == 'l'); break;
-        case 'M': ASSERT (to_char (c_tolower (c)) == 'm'); break;
-        case 'N': ASSERT (to_char (c_tolower (c)) == 'n'); break;
-        case 'O': ASSERT (to_char (c_tolower (c)) == 'o'); break;
-        case 'P': ASSERT (to_char (c_tolower (c)) == 'p'); break;
-        case 'Q': ASSERT (to_char (c_tolower (c)) == 'q'); break;
-        case 'R': ASSERT (to_char (c_tolower (c)) == 'r'); break;
-        case 'S': ASSERT (to_char (c_tolower (c)) == 's'); break;
-        case 'T': ASSERT (to_char (c_tolower (c)) == 't'); break;
-        case 'U': ASSERT (to_char (c_tolower (c)) == 'u'); break;
-        case 'V': ASSERT (to_char (c_tolower (c)) == 'v'); break;
-        case 'W': ASSERT (to_char (c_tolower (c)) == 'w'); break;
-        case 'X': ASSERT (to_char (c_tolower (c)) == 'x'); break;
-        case 'Y': ASSERT (to_char (c_tolower (c)) == 'y'); break;
-        case 'Z': ASSERT (to_char (c_tolower (c)) == 'z'); break;
-        default: ASSERT (c_tolower (c) == c); break;
-        }
-
-      switch (to_char (c))
-        {
-        case 'a': ASSERT (to_char (c_toupper (c)) == 'A'); break;
-        case 'b': ASSERT (to_char (c_toupper (c)) == 'B'); break;
-        case 'c': ASSERT (to_char (c_toupper (c)) == 'C'); break;
-        case 'd': ASSERT (to_char (c_toupper (c)) == 'D'); break;
-        case 'e': ASSERT (to_char (c_toupper (c)) == 'E'); break;
-        case 'f': ASSERT (to_char (c_toupper (c)) == 'F'); break;
-        case 'g': ASSERT (to_char (c_toupper (c)) == 'G'); break;
-        case 'h': ASSERT (to_char (c_toupper (c)) == 'H'); break;
-        case 'i': ASSERT (to_char (c_toupper (c)) == 'I'); break;
-        case 'j': ASSERT (to_char (c_toupper (c)) == 'J'); break;
-        case 'k': ASSERT (to_char (c_toupper (c)) == 'K'); break;
-        case 'l': ASSERT (to_char (c_toupper (c)) == 'L'); break;
-        case 'm': ASSERT (to_char (c_toupper (c)) == 'M'); break;
-        case 'n': ASSERT (to_char (c_toupper (c)) == 'N'); break;
-        case 'o': ASSERT (to_char (c_toupper (c)) == 'O'); break;
-        case 'p': ASSERT (to_char (c_toupper (c)) == 'P'); break;
-        case 'q': ASSERT (to_char (c_toupper (c)) == 'Q'); break;
-        case 'r': ASSERT (to_char (c_toupper (c)) == 'R'); break;
-        case 's': ASSERT (to_char (c_toupper (c)) == 'S'); break;
-        case 't': ASSERT (to_char (c_toupper (c)) == 'T'); break;
-        case 'u': ASSERT (to_char (c_toupper (c)) == 'U'); break;
-        case 'v': ASSERT (to_char (c_toupper (c)) == 'V'); break;
-        case 'w': ASSERT (to_char (c_toupper (c)) == 'W'); break;
-        case 'x': ASSERT (to_char (c_toupper (c)) == 'X'); break;
-        case 'y': ASSERT (to_char (c_toupper (c)) == 'Y'); break;
-        case 'z': ASSERT (to_char (c_toupper (c)) == 'Z'); break;
-        default: ASSERT (c_toupper (c) == c); break;
-        }
     }
 
   ASSERT (n_isascii == 128);
-- 
2.1.0



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-27  6:59                             ` Paul Eggert
@ 2015-09-28  2:09                               ` Daniel Richard G.
  2015-10-15  4:49                               ` Daniel Richard G.
  1 sibling, 0 replies; 49+ messages in thread
From: Daniel Richard G. @ 2015-09-28  2:09 UTC (permalink / raw)
  To: Paul Eggert, Ben Pfaff; +Cc: bug-gnulib

On Sat, 2015 Sep 26 23:59-0700, Paul Eggert wrote:
>
> Given all the problems mentioned (including some in the proposed
> patches), let's give up on trying to support any such folks.  If they
> want to build gnulib-using software on z/OS, they'll have to build
> with the default configuration in which char is unsigned.  It wouldn't
> be practical for us to try to support char being signed when standard
> chars have the top bit set.

I wasn't quite sure where to draw the "not worth the trouble" line,
but I think this is defensible.

> With that in mind I installed the attached further patch, which
> simplifies the recent changes to c-ctype quite a bit.

It's a shame; I was impressed by the work you had done to support
signed chars. In any event, test-c-ctype in Git d2de2a91 passes in an
unsigned-char EBCDIC build on z/OS, and that will hopefully remain the
case for a good long while.


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-09-27  6:59                             ` Paul Eggert
  2015-09-28  2:09                               ` Daniel Richard G.
@ 2015-10-15  4:49                               ` Daniel Richard G.
  2016-08-18  0:47                                 ` Paul Eggert
                                                   ` (2 more replies)
  1 sibling, 3 replies; 49+ messages in thread
From: Daniel Richard G. @ 2015-10-15  4:49 UTC (permalink / raw)
  To: Paul Eggert; +Cc: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1633 bytes --]

Okay, I've split my changes into a set of patches, attached. These
patches are orthogonal and may be applied in any order:

gnulib-zos-ascii.patch: When in a non-ASCII environment, disable tests
that assume ASCII.

gnulib-zos-charset.patch: Add suitably conditional #pragmas so that the
test strings in test-iconv-utf.c are interpreted as ASCII rather than
EBCDIC (i.e. 'J' == 0x4A and not 0xD1). This issue could be addressed
more portably by rewriting all the ASCII literal characters as octal
escapes, but then you would lose the partial readability that the
strings have now. Also, iconv_open() on z/OS does not recognize
"ISO-8859-1", but "ISO8859-1" works.

gnulib-zos-configure.patch: Changes to the Autoconf M4 code to support
z/OS. Note that fclose() is broken in a different way on z/OS than it is
on other systems, hence the special case in fclose.m4.

gnulib-zos-cpp.patch: General preprocessor-level changes to
support z/OS.

gnulib-zos-errno.patch: Accommodate z/OS errno code preferences. (I
believe this should still be within spec; IBM is good at following the
letter if not the spirit of such things.)

gnulib-zos-pthread.patch: Rudimentary gl_thread support for z/OS.

gnulib-zos-regex-argname.patch: "__string" is not a good name to use as
an identifier on this system. A better fix would be to use a different
name (why not just "s"?), provided this can be pushed to upstream glibc.

gnulib-zos-strtod.patch: Address a couple of quirks in the z/OS
implementation of strtod().


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gnulib-zos-ascii.patch --]
[-- Type: text/x-patch; name="gnulib-zos-ascii.patch", Size: 1363 bytes --]

diff --git a/tests/test-c-strcasecmp.c b/tests/test-c-strcasecmp.c
index f7f6b43..47feac8 100644
--- a/tests/test-c-strcasecmp.c
+++ b/tests/test-c-strcasecmp.c
@@ -19,6 +19,7 @@
 #include <config.h>
 
 #include "c-strcase.h"
+#include "c-ctype.h"
 
 #include <locale.h>
 #include <string.h>
@@ -57,9 +58,11 @@ main (int argc, char *argv[])
   ASSERT (c_strcasecmp ("\303\266zg\303\274r", "\303\226ZG\303\234R") > 0); /* özgür */
   ASSERT (c_strcasecmp ("\303\226ZG\303\234R", "\303\266zg\303\274r") < 0); /* özgür */
 
+#if C_CTYPE_ASCII
   /* This test shows how strings of different size cannot compare equal.  */
   ASSERT (c_strcasecmp ("turkish", "TURK\304\260SH") < 0);
   ASSERT (c_strcasecmp ("TURK\304\260SH", "turkish") > 0);
+#endif
 
   return 0;
 }
diff --git a/tests/test-wcwidth.c b/tests/test-wcwidth.c
index 9fad785..fdbecc3 100644
--- a/tests/test-wcwidth.c
+++ b/tests/test-wcwidth.c
@@ -26,6 +26,7 @@ SIGNATURE_CHECK (wcwidth, int, (wchar_t));
 #include <locale.h>
 #include <string.h>
 
+#include "c-ctype.h"
 #include "localcharset.h"
 #include "macros.h"
 
@@ -34,9 +35,11 @@ main ()
 {
   wchar_t wc;
 
+#ifdef C_CTYPE_ASCII
   /* Test width of ASCII characters.  */
   for (wc = 0x20; wc < 0x7F; wc++)
     ASSERT (wcwidth (wc) == 1);
+#endif
 
   /* Switch to an UTF-8 locale.  */
   if (setlocale (LC_ALL, "fr_FR.UTF-8") != NULL


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: gnulib-zos-charset.patch --]
[-- Type: text/x-patch; name="gnulib-zos-charset.patch", Size: 9066 bytes --]

diff --git a/tests/test-iconv-utf.c b/tests/test-iconv-utf.c
index c1589f6..a769bee 100644
--- a/tests/test-iconv-utf.c
+++ b/tests/test-iconv-utf.c
@@ -27,20 +27,38 @@
 
 #include "macros.h"
 
+/* If compiling on an EBCDIC system, keep the test strings in ASCII.  */
+#if defined __IBMC__ && 'A' != 0x41
+# pragma convert("ISO8859-1")
+# define CONVERT_ENABLED
+#endif
+
+/* The text is "Japanese (日本語) [\U0001D50D\U0001D51E\U0001D52D]".  */
+
+const char test_utf8_string[] = "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+
+const char test_utf16be_string[] = "\000J\000a\000p\000a\000n\000e\000s\000e\000 \000(\145\345\147\054\212\236\000)\000 \000[\330\065\335\015\330\065\335\036\330\065\335\055\000]";
+
+const char test_utf16le_string[] = "J\000a\000p\000a\000n\000e\000s\000e\000 \000(\000\345\145\054\147\236\212)\000 \000[\000\065\330\015\335\065\330\036\335\065\330\055\335]\000";
+
+const char test_utf32be_string[] = "\000\000\000J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\145\345\000\000\147\054\000\000\212\236\000\000\000)\000\000\000 \000\000\000[\000\001\325\015\000\001\325\036\000\001\325\055\000\000\000]";
+
+const char test_utf32le_string[] = "J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\000\345\145\000\000\054\147\000\000\236\212\000\000)\000\000\000 \000\000\000[\000\000\000\015\325\001\000\036\325\001\000\055\325\001\000]\000\000\000";
+
+#ifdef CONVERT_ENABLED
+# pragma convert(pop)
+#endif
+
 int
 main ()
 {
 #if HAVE_ICONV
   /* Assume that iconv() supports at least the encoding UTF-8.  */
 
-  /* The text is "Japanese (日本語) [\U0001D50D\U0001D51E\U0001D52D]".  */
-
   /* Test conversion from UTF-8 to UTF-16BE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "\000J\000a\000p\000a\000n\000e\000s\000e\000 \000(\145\345\147\054\212\236\000)\000 \000[\330\065\335\015\330\065\335\036\330\065\335\055\000]";
+#define input    test_utf8_string
+#define expected test_utf16be_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -64,14 +82,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-8 to UTF-16LE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "J\000a\000p\000a\000n\000e\000s\000e\000 \000(\000\345\145\054\147\236\212)\000 \000[\000\065\330\015\335\065\330\036\335\065\330\055\335]\000";
+#define input    test_utf8_string
+#define expected test_utf16le_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -95,14 +114,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-8 to UTF-32BE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "\000\000\000J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\145\345\000\000\147\054\000\000\212\236\000\000\000)\000\000\000 \000\000\000[\000\001\325\015\000\001\325\036\000\001\325\055\000\000\000]";
+#define input    test_utf8_string
+#define expected test_utf32be_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -126,14 +146,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-8 to UTF-32LE with no errors.  */
   {
-    static const char input[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
-    static const char expected[] =
-      "J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\000\345\145\000\000\054\147\000\000\236\212\000\000)\000\000\000 \000\000\000[\000\000\000\015\325\001\000\036\325\001\000\055\325\001\000]\000\000\000";
+#define input    test_utf8_string
+#define expected test_utf32le_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -157,14 +178,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-16BE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "\000J\000a\000p\000a\000n\000e\000s\000e\000 \000(\145\345\147\054\212\236\000)\000 \000[\330\065\335\015\330\065\335\036\330\065\335\055\000]";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf16be_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -188,14 +210,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-16LE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "J\000a\000p\000a\000n\000e\000s\000e\000 \000(\000\345\145\054\147\236\212)\000 \000[\000\065\330\015\335\065\330\036\335\065\330\055\335]\000";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf16le_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -219,14 +242,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-32BE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "\000\000\000J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\145\345\000\000\147\054\000\000\212\236\000\000\000)\000\000\000 \000\000\000[\000\001\325\015\000\001\325\036\000\001\325\055\000\000\000]";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf32be_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -250,14 +274,15 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 
   /* Test conversion from UTF-32LE to UTF-8 with no errors.  */
   {
-    static const char input[] =
-      "J\000\000\000a\000\000\000p\000\000\000a\000\000\000n\000\000\000e\000\000\000s\000\000\000e\000\000\000 \000\000\000(\000\000\000\345\145\000\000\054\147\000\000\236\212\000\000)\000\000\000 \000\000\000[\000\000\000\015\325\001\000\036\325\001\000\055\325\001\000]\000\000\000";
-    static const char expected[] =
-      "Japanese (\346\227\245\346\234\254\350\252\236) [\360\235\224\215\360\235\224\236\360\235\224\255]";
+#define input    test_utf32le_string
+#define expected test_utf8_string
     iconv_t cd;
     char buf[100];
     const char *inptr;
@@ -281,6 +306,9 @@ main ()
     ASSERT (memcmp (buf, expected, sizeof (expected) - 1) == 0);
 
     ASSERT (iconv_close (cd) == 0);
+
+#undef input
+#undef expected
   }
 #endif
 
diff --git a/tests/test-iconv.c b/tests/test-iconv.c
index ed715bd..a64c6dd 100644
--- a/tests/test-iconv.c
+++ b/tests/test-iconv.c
@@ -44,8 +44,14 @@ main ()
 #if HAVE_ICONV
   /* Assume that iconv() supports at least the encodings ASCII, ISO-8859-1,
      and UTF-8.  */
-  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO-8859-1");
-  iconv_t cd_utf8_to_88591 = iconv_open ("ISO-8859-1", "UTF-8");
+  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO8859-1");
+  iconv_t cd_utf8_to_88591 = iconv_open ("ISO8859-1", "UTF-8");
+
+#if defined __MVS__ && defined __IBMC__
+  /* String literals below are in ASCII, not EBCDIC.  */
+# pragma convert("ISO8859-1")
+# define CONVERT_ENABLED
+#endif
 
   ASSERT (cd_88591_to_utf8 != (iconv_t)(-1));
   ASSERT (cd_utf8_to_88591 != (iconv_t)(-1));
@@ -142,7 +148,12 @@ main ()
 
   iconv_close (cd_88591_to_utf8);
   iconv_close (cd_utf8_to_88591);
+
+#ifdef CONVERT_ENABLED
+# pragma convert(pop)
 #endif
 
+#endif /* HAVE_ICONV */
+
   return 0;
 }


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: gnulib-zos-configure.patch --]
[-- Type: text/x-patch; name="gnulib-zos-configure.patch", Size: 3598 bytes --]

diff --git a/m4/fclose.m4 b/m4/fclose.m4
index 6bd1ad8..e939d30 100644
--- a/m4/fclose.m4
+++ b/m4/fclose.m4
@@ -1,4 +1,4 @@
-# fclose.m4 serial 6
+# fclose.m4 serial 7
 dnl Copyright (C) 2008-2015 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -7,6 +7,7 @@ dnl with or without modifications, as long as this notice is preserved.
 AC_DEFUN([gl_FUNC_FCLOSE],
 [
   AC_REQUIRE([gl_STDIO_H_DEFAULTS])
+  AC_REQUIRE([AC_CANONICAL_HOST])
 
   gl_FUNC_FFLUSH_STDIN
   if test $gl_cv_func_fflush_stdin != yes; then
@@ -17,4 +18,8 @@ AC_DEFUN([gl_FUNC_FCLOSE],
   if test $REPLACE_CLOSE = 1; then
     REPLACE_FCLOSE=1
   fi
+
+  case "$host_os" in
+    openedition) REPLACE_FCLOSE=1 ;;
+  esac
 ])
diff --git a/m4/strstr.m4 b/m4/strstr.m4
index 040c0b9..e3e528d 100644
--- a/m4/strstr.m4
+++ b/m4/strstr.m4
@@ -1,4 +1,4 @@
-# strstr.m4 serial 16
+# strstr.m4 serial 17
 dnl Copyright (C) 2008-2015 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -67,6 +67,12 @@ AC_DEFUN([gl_FUNC_STRSTR],
     AC_CACHE_CHECK([whether strstr works in linear time],
       [gl_cv_func_strstr_linear],
       [AC_RUN_IFELSE([AC_LANG_PROGRAM([[
+#ifdef __MVS__
+/* z/OS does not deliver signals while strstr() is running (thanks to
+   restrictions on its LE runtime), which prevents us from limiting the
+   running time of this test.  */
+# error "This test does not work properly on z/OS"
+#endif
 #include <signal.h> /* for signal */
 #include <string.h> /* for strstr */
 #include <stdlib.h> /* for malloc */
diff --git a/m4/wchar_h.m4 b/m4/wchar_h.m4
index 9d1b0f8..c926c4b 100644
--- a/m4/wchar_h.m4
+++ b/m4/wchar_h.m4
@@ -7,7 +7,7 @@ dnl with or without modifications, as long as this notice is preserved.
 
 dnl Written by Eric Blake.
 
-# wchar_h.m4 serial 39
+# wchar_h.m4 serial 40
 
 AC_DEFUN([gl_WCHAR_H],
 [
@@ -81,8 +81,14 @@ AC_DEFUN([gl_WCHAR_H_INLINE_OK],
 extern int zero (void);
 int main () { return zero(); }
 ]])])
+     dnl Do not rename the object file from conftest.$ac_objext to
+     dnl conftest1.$ac_objext, as this will cause the link to fail on
+     dnl z/OS when using the XPLINK object format (due to duplicate
+     dnl CSECT names). Instead, temporarily redefine $ac_compile so
+     dnl that the object file has the latter name from the start.
+     save_ac_compile="$ac_compile"
+     ac_compile=`echo "$save_ac_compile" | sed s/conftest/conftest1/`
      if AC_TRY_EVAL([ac_compile]); then
-       mv conftest.$ac_objext conftest1.$ac_objext
        AC_LANG_CONFTEST([
          AC_LANG_SOURCE([[#define wcstod renamed_wcstod
 /* Tru64 with Desktop Toolkit C has a bug: <stdio.h> must be included before
@@ -95,8 +101,9 @@ int main () { return zero(); }
 #include <wchar.h>
 int zero (void) { return 0; }
 ]])])
+       dnl See note above about renaming object files.
+       ac_compile=`echo "$save_ac_compile" | sed s/conftest/conftest2/`
        if AC_TRY_EVAL([ac_compile]); then
-         mv conftest.$ac_objext conftest2.$ac_objext
          if $CC -o conftest$ac_exeext $CFLAGS $LDFLAGS conftest1.$ac_objext conftest2.$ac_objext $LIBS >&AS_MESSAGE_LOG_FD 2>&1; then
            :
          else
@@ -104,6 +111,7 @@ int zero (void) { return 0; }
          fi
        fi
      fi
+     ac_compile="$save_ac_compile"
      rm -f conftest1.$ac_objext conftest2.$ac_objext conftest$ac_exeext
     ])
   if test $gl_cv_header_wchar_h_correct_inline = no; then


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: gnulib-zos-cpp.patch --]
[-- Type: text/x-patch; name="gnulib-zos-cpp.patch", Size: 8471 bytes --]

diff --git a/lib/alloca.in.h b/lib/alloca.in.h
index d5664b6..6606984 100644
--- a/lib/alloca.in.h
+++ b/lib/alloca.in.h
@@ -51,6 +51,8 @@ extern "C"
 void *_alloca (unsigned short);
 #  pragma intrinsic (_alloca)
 #  define alloca _alloca
+# elif defined __MVS__
+#  include <stdlib.h>
 # else
 #  include <stddef.h>
 #  ifdef  __cplusplus
diff --git a/lib/fnmatch.c b/lib/fnmatch.c
index a607672..58754fa 100644
--- a/lib/fnmatch.c
+++ b/lib/fnmatch.c
@@ -22,7 +22,7 @@
 # define _GNU_SOURCE    1
 #endif
 
-#if ! defined __builtin_expect && __GNUC__ < 3
+#if ! defined __builtin_expect && defined __GNUC__ && __GNUC__ < 3
 # define __builtin_expect(expr, expected) (expr)
 #endif
 
diff --git a/lib/get-rusage-as.c b/lib/get-rusage-as.c
index 2bad20a..4db1596 100644
--- a/lib/get-rusage-as.c
+++ b/lib/get-rusage-as.c
@@ -355,7 +355,7 @@ get_rusage_as_via_iterator (void)
 uintptr_t
 get_rusage_as (void)
 {
-#if (defined __APPLE__ && defined __MACH__) || defined _AIX || defined __CYGWIN__ /* Mac OS X, AIX, Cygwin */
+#if (defined __APPLE__ && defined __MACH__) || defined _AIX || defined __CYGWIN__ || defined __MVS__ /* Mac OS X, AIX, Cygwin, z/OS */
   /* get_rusage_as_via_setrlimit() does not work.
      Prefer get_rusage_as_via_iterator().  */
   return get_rusage_as_via_iterator ();
diff --git a/lib/glob.c b/lib/glob.c
index ed49a9d..9fd6482 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -144,7 +144,9 @@
 # define __stat64(fname, buf)   stat (fname, buf)
 # define __fxstatat64(_, d, f, st, flag) fstatat (d, f, st, flag)
 # define struct_stat64          struct stat
-# define __alloca               alloca
+# ifndef __MVS__
+#  define __alloca              alloca
+# endif
 # define __readdir              readdir
 # define __glob_pattern_p       glob_pattern_p
 #endif /* _LIBC */
diff --git a/lib/math.in.h b/lib/math.in.h
index 62a089a..59293fd 100644
--- a/lib/math.in.h
+++ b/lib/math.in.h
@@ -406,6 +406,7 @@ _GL_WARN_ON_USE (ceilf, "ceilf is unportable - "
 #if @GNULIB_CEIL@
 # if @REPLACE_CEIL@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef ceil
 #   define ceil rpl_ceil
 #  endif
 _GL_FUNCDECL_RPL (ceil, double, (double x));
@@ -753,6 +754,7 @@ _GL_WARN_ON_USE (floorf, "floorf is unportable - "
 #if @GNULIB_FLOOR@
 # if @REPLACE_FLOOR@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef floor
 #   define floor rpl_floor
 #  endif
 _GL_FUNCDECL_RPL (floor, double, (double x));
@@ -973,6 +975,7 @@ _GL_WARN_ON_USE (frexpf, "frexpf is unportable - "
 #if @GNULIB_FREXP@
 # if @REPLACE_FREXP@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef frexp
 #   define frexp rpl_frexp
 #  endif
 _GL_FUNCDECL_RPL (frexp, double, (double x, int *expptr) _GL_ARG_NONNULL ((2)));
@@ -1958,6 +1961,7 @@ _GL_WARN_ON_USE (tanhf, "tanhf is unportable - "
 #if @GNULIB_TRUNCF@
 # if @REPLACE_TRUNCF@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef truncf
 #   define truncf rpl_truncf
 #  endif
 _GL_FUNCDECL_RPL (truncf, float, (float x));
@@ -1980,6 +1984,7 @@ _GL_WARN_ON_USE (truncf, "truncf is unportable - "
 #if @GNULIB_TRUNC@
 # if @REPLACE_TRUNC@
 #  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef trunc
 #   define trunc rpl_trunc
 #  endif
 _GL_FUNCDECL_RPL (trunc, double, (double x));
diff --git a/lib/ptsname_r.c b/lib/ptsname_r.c
index faa33fb..809388a 100644
--- a/lib/ptsname_r.c
+++ b/lib/ptsname_r.c
@@ -34,6 +34,11 @@
 #  define _PATH_DEV "/dev/"
 # endif
 
+# undef __set_errno
+# undef __stat
+# undef __ttyname_r
+# undef __ptsname_r
+
 # define __set_errno(e) errno = (e)
 # define __isatty isatty
 # define __stat stat
diff --git a/tests/infinity.h b/tests/infinity.h
index 45c30bd..4e8a755 100644
--- a/tests/infinity.h
+++ b/tests/infinity.h
@@ -17,8 +17,9 @@
 
 /* Infinityf () returns a 'float' +Infinity.  */
 
-/* The Microsoft MSVC 9 compiler chokes on the expression 1.0f / 0.0f.  */
-#if defined _MSC_VER
+/* The Microsoft MSVC 9 compiler chokes on the expression 1.0f / 0.0f.
+   The IBM XL C compiler on z/OS complains.  */
+#if defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static float
 Infinityf ()
 {
@@ -32,8 +33,9 @@ Infinityf ()
 
 /* Infinityd () returns a 'double' +Infinity.  */
 
-/* The Microsoft MSVC 9 compiler chokes on the expression 1.0 / 0.0.  */
-#if defined _MSC_VER
+/* The Microsoft MSVC 9 compiler chokes on the expression 1.0 / 0.0.
+   The IBM XL C compiler on z/OS complains.  */
+#if defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static double
 Infinityd ()
 {
@@ -47,9 +49,10 @@ Infinityd ()
 
 /* Infinityl () returns a 'long double' +Infinity.  */
 
-/* The Microsoft MSVC 9 compiler chokes on the expression 1.0L / 0.0L.  */
-#if defined _MSC_VER
-static double
+/* The Microsoft MSVC 9 compiler chokes on the expression 1.0L / 0.0L.
+   The IBM XL C compiler on z/OS complains.  */
+#if defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
+static long double
 Infinityl ()
 {
   static long double zero = 0.0L;
diff --git a/tests/nan.h b/tests/nan.h
index 9f6819c..10b393e 100644
--- a/tests/nan.h
+++ b/tests/nan.h
@@ -15,11 +15,18 @@
    along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
 
+/* IBM z/OS supports both hexadecimal and IEEE floating-point formats. The
+   former does not support NaN and its isnan() implementation returns zero
+   for all values.  */
+#if defined __MVS__ && defined __IBMC__ && !defined __BFP__
+# error "NaN is not supported with IBM's hexadecimal floating-point format; please re-compile with -qfloat=ieee"
+#endif
+
 /* NaNf () returns a 'float' not-a-number.  */
 
 /* The Compaq (ex-DEC) C 6.4 compiler and the Microsoft MSVC 9 compiler choke
-   on the expression 0.0 / 0.0.  */
-#if defined __DECC || defined _MSC_VER
+   on the expression 0.0 / 0.0.  The IBM XL C compiler on z/OS complains.  */
+#if defined __DECC || defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static float
 NaNf ()
 {
@@ -34,8 +41,8 @@ NaNf ()
 /* NaNd () returns a 'double' not-a-number.  */
 
 /* The Compaq (ex-DEC) C 6.4 compiler and the Microsoft MSVC 9 compiler choke
-   on the expression 0.0 / 0.0.  */
-#if defined __DECC || defined _MSC_VER
+   on the expression 0.0 / 0.0.  The IBM XL C compiler on z/OS complains.  */
+#if defined __DECC || defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static double
 NaNd ()
 {
@@ -51,14 +58,15 @@ NaNd ()
 
 /* On Irix 6.5, gcc 3.4.3 can't compute compile-time NaN, and needs the
    runtime type conversion.
-   The Microsoft MSVC 9 compiler chokes on the expression 0.0L / 0.0L.  */
+   The Microsoft MSVC 9 compiler chokes on the expression 0.0L / 0.0L.
+   The IBM XL C compiler on z/OS complains.  */
 #ifdef __sgi
 static long double NaNl ()
 {
   double zero = 0.0;
   return zero / zero;
 }
-#elif defined _MSC_VER
+#elif defined _MSC_VER || (defined __MVS__ && defined __IBMC__)
 static long double
 NaNl ()
 {
diff --git a/tests/test-canonicalize-lgpl.c b/tests/test-canonicalize-lgpl.c
index 12d2bb0..49c0221 100644
--- a/tests/test-canonicalize-lgpl.c
+++ b/tests/test-canonicalize-lgpl.c
@@ -191,12 +191,16 @@ main (void)
     ASSERT (result2);
     ASSERT (stat ("/", &st1) == 0);
     ASSERT (stat ("//", &st2) == 0);
+    /* On IBM z/OS, "/" and "//" are distinct, yet they both have
+       st_dev == st_ino == 1.  */
+#ifndef __MVS__
     if (SAME_INODE (st1, st2))
       {
         ASSERT (strcmp (result1, "/") == 0);
         ASSERT (strcmp (result2, "/") == 0);
       }
     else
+#endif
       {
         ASSERT (strcmp (result1, "//") == 0);
         ASSERT (strcmp (result2, "//") == 0);
diff --git a/tests/test-nonblocking-pipe.h b/tests/test-nonblocking-pipe.h
index 5b3646e..01c992c 100644
--- a/tests/test-nonblocking-pipe.h
+++ b/tests/test-nonblocking-pipe.h
@@ -31,10 +31,11 @@
      OSF/1                           >= 262145
      Solaris <= 7                    >= 10241
      Solaris >= 8                    >= 20481
+     z/OS                            >= 131073
      Cygwin                          >= 65537
      native Windows                  >= 4097 (depends on the _pipe argument)
  */
-#if defined __osf__ || (defined __linux__ && (defined __ia64__ || defined __mips__))
+#if defined __MVS__ || defined __osf__ || (defined __linux__ && (defined __ia64__ || defined __mips__))
 # define PIPE_DATA_BLOCK_SIZE 270000
 #elif defined __linux__ && defined __sparc__
 # define PIPE_DATA_BLOCK_SIZE 140000


[-- Attachment #6: gnulib-zos-errno.patch --]
[-- Type: text/x-patch; name="gnulib-zos-errno.patch", Size: 1871 bytes --]

diff --git a/tests/test-nonblocking-reader.h b/tests/test-nonblocking-reader.h
index 8cba131..d8eaa32 100644
--- a/tests/test-nonblocking-reader.h
+++ b/tests/test-nonblocking-reader.h
@@ -110,7 +110,7 @@ full_read_from_nonblocking_fd (size_t fd, void *buf, size_t count)
       ASSERT (spent_time < 0.5);
       if (ret < 0)
         {
-          ASSERT (saved_errno == EAGAIN);
+          ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
           usleep (SMALL_DELAY);
         }
       else
diff --git a/tests/test-nonblocking-writer.h b/tests/test-nonblocking-writer.h
index 0ecf996..ff148dc 100644
--- a/tests/test-nonblocking-writer.h
+++ b/tests/test-nonblocking-writer.h
@@ -124,7 +124,7 @@ main_writer_loop (int test, size_t data_block_size, int fd,
                         (long) ret, dbgstrerror (ret < 0, saved_errno));
             if (ret < 0 && bytes_written >= data_block_size)
               {
-                ASSERT (saved_errno == EAGAIN);
+                ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
                 ASSERT (spent_time < 0.5);
                 break;
               }
@@ -133,7 +133,7 @@ main_writer_loop (int test, size_t data_block_size, int fd,
             ASSERT (spent_time < 0.5);
             if (ret < 0)
               {
-                ASSERT (saved_errno == EAGAIN);
+                ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
                 usleep (SMALL_DELAY);
               }
             else
@@ -165,7 +165,7 @@ main_writer_loop (int test, size_t data_block_size, int fd,
             ASSERT (spent_time < 0.5);
             if (ret < 0)
               {
-                ASSERT (saved_errno == EAGAIN);
+                ASSERT (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
                 usleep (SMALL_DELAY);
               }
             else


[-- Attachment #7: gnulib-zos-pthread.patch --]
[-- Type: text/x-patch; name="gnulib-zos-pthread.patch", Size: 1223 bytes --]

diff --git a/lib/glthread/thread.c b/lib/glthread/thread.c
index d4e2921..5923ea2 100644
--- a/lib/glthread/thread.c
+++ b/lib/glthread/thread.c
@@ -33,7 +33,7 @@
 
 #include <pthread.h>
 
-#ifdef PTW32_VERSION
+#if defined PTW32_VERSION || defined __MVS__
 
 const gl_thread_t gl_null_thread /* = { .p = NULL } */;
 
diff --git a/lib/glthread/thread.h b/lib/glthread/thread.h
index 2febe34..36a9521 100644
--- a/lib/glthread/thread.h
+++ b/lib/glthread/thread.h
@@ -172,6 +172,15 @@ typedef pthread_t gl_thread_t;
 #  define gl_thread_self_pointer() \
      (pthread_in_use () ? pthread_self ().p : NULL)
 extern const gl_thread_t gl_null_thread;
+# elif defined __MVS__
+   /* On IBM z/OS, pthread_t is a struct with an 8-byte '__' field.
+      The first three bytes of this field appear to uniquely identify a
+      pthread_t, though not necessarily representing a pointer.  */
+#  define gl_thread_self() \
+     (pthread_in_use () ? pthread_self () : gl_null_thread)
+#  define gl_thread_self_pointer() \
+     (pthread_in_use () ? *((void **) pthread_self ().__) : NULL)
+extern const gl_thread_t gl_null_thread;
 # else
 #  define gl_thread_self() \
      (pthread_in_use () ? pthread_self () : (pthread_t) NULL)


[-- Attachment #8: gnulib-zos-regex-argname.patch --]
[-- Type: text/x-patch; name="gnulib-zos-regex-argname.patch", Size: 804 bytes --]

diff --git a/lib/regex.h b/lib/regex.h
index 6f3bae3..21fc00c 100644
--- a/lib/regex.h
+++ b/lib/regex.h
@@ -23,6 +23,12 @@
 
 #include <sys/types.h>
 
+/* IBM z/OS uses -D__string=1 as an inclusion guard.  */
+#if defined __MVS__ && defined(__string)
+# undef __string
+# define __string __string
+#endif
+
 /* Allow the use in C++ code.  */
 #ifdef __cplusplus
 extern "C" {
diff --git a/lib/string.in.h b/lib/string.in.h
index b3356bb..fa438a4 100644
--- a/lib/string.in.h
+++ b/lib/string.in.h
@@ -44,6 +44,12 @@
 #ifndef _@GUARD_PREFIX@_STRING_H
 #define _@GUARD_PREFIX@_STRING_H
 
+/* IBM z/OS uses -D__string=1 as an inclusion guard.  */
+#if defined __MVS__ && defined(__string)
+# undef __string
+# define __string __string
+#endif
+
 /* NetBSD 5.0 mis-defines NULL.  */
 #include <stddef.h>
 


[-- Attachment #9: gnulib-zos-strtod.patch --]
[-- Type: text/x-patch; name="gnulib-zos-strtod.patch", Size: 961 bytes --]

diff --git a/lib/strtod.c b/lib/strtod.c
index 9fd0170..9dc6eeb 100644
--- a/lib/strtod.c
+++ b/lib/strtod.c
@@ -239,7 +239,12 @@ strtod (const char *nptr, char **endptr)
       if (*s == '0' && c_tolower (s[1]) == 'x')
         {
           if (! c_isxdigit (s[2 + (s[2] == '.')]))
-            end = s + 1;
+            {
+              end = s + 1;
+
+              /* strtod() on z/OS returns ERANGE for "0x".  */
+              errno = 0;
+            }
           else if (end <= s + 2)
             {
               num = parse_number (s + 2, 16, 2, 4, 'p', &endbuf);
@@ -321,7 +326,7 @@ strtod (const char *nptr, char **endptr)
          better to use the underlying implementation's result, since a
          nice implementation populates the bits of the NaN according
          to interpreting n-char-sequence as a hexadecimal number.  */
-      if (s != end)
+      if (s != end || num == num)
         num = NAN;
       errno = saved_errno;
     }



* Re: [PATCH] IBM z/OS + EBCDIC support
  2015-10-15  4:49                               ` Daniel Richard G.
@ 2016-08-18  0:47                                 ` Paul Eggert
  2016-08-18  8:24                                   ` Daniel Richard G.
  2019-12-19  4:57                                 ` z/OS configure triple Bruno Haible
  2019-12-19  5:16                                 ` z/OS, iconv, and charset aliases Bruno Haible
  2 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2016-08-18  0:47 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Daniel Richard G. wrote:
> Okay, I've split my changes into a set of patches, attached. These
> patches are orthogonal and may be applied in any order:

Thanks, I finally installed those into the main repository on Savannah. I had to 
write ChangeLog entries, which I took from your email. I also fixed a couple of 
minor glitches that I noticed.

For future patches, could you please email the output of "git format-patch"? Or 
just use "git send-email". Please use the typical format for ChangeLog entries; 
I did that for your first patch but ran out of time to do that for later ones.

Thanks again.


* Re: [PATCH] IBM z/OS + EBCDIC support
  2016-08-18  0:47                                 ` Paul Eggert
@ 2016-08-18  8:24                                   ` Daniel Richard G.
  2016-08-18  8:53                                     ` Paul Eggert
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2016-08-18  8:24 UTC (permalink / raw)
  To: Paul Eggert; +Cc: bug-gnulib

On Wed, 2016 Aug 17 17:47-0700, Paul Eggert wrote:
>
> Thanks, I finally installed those into the main repository on
> Savannah. I had to write ChangeLog entries, which I took from your
> email. I also fixed a couple of minor glitches that I noticed.

Much appreciated, Paul.

(Did you get the minor changes to test-c-strncasecmp.c and
test-sigpipe.sh? Those are the only salient ones still outstanding)

I have a few more fixes to send in, but those are not yet ready as I am
still working with IBM on various z/OS issues exposed by the gnulib test
suite. The process has been, to say the least, frustratingly slow.

> For future patches, could you please email the output of "git format-
> patch"? Or just use "git send-email". Please use the typical format
> for ChangeLog entries; I did that for your first patch but ran out of
> time to do that for later ones.

Understood. I will keep this in mind for the next patch.


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.



* Re: [PATCH] IBM z/OS + EBCDIC support
  2016-08-18  8:24                                   ` Daniel Richard G.
@ 2016-08-18  8:53                                     ` Paul Eggert
  2016-08-19  8:20                                       ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2016-08-18  8:53 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Daniel Richard G. wrote:
> (Did you get the minor changes to test-c-strncasecmp.c and
> test-sigpipe.sh? Those are the only salient ones still outstanding)

Hmm, sorry, I don't seem to have them. Could you please resend them, in 'git 
format-patch' format? Thanks.



* Re: [PATCH] IBM z/OS + EBCDIC support
  2016-08-18  8:53                                     ` Paul Eggert
@ 2016-08-19  8:20                                       ` Daniel Richard G.
  2016-08-19 11:03                                         ` Bruno Haible
  2016-08-19 19:28                                         ` Paul Eggert
  0 siblings, 2 replies; 49+ messages in thread
From: Daniel Richard G. @ 2016-08-19  8:20 UTC (permalink / raw)
  To: Paul Eggert; +Cc: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 726 bytes --]

On Thu, 2016 Aug 18 01:53-0700, Paul Eggert wrote:
>
> Hmm, sorry, I don't seem to have them. Could you please resend them,
> in 'git format-patch' format? Thanks.

They are attached. I needed to make some minor edits on the format-patch
output, hopefully harmless. The ChangeLog-style entries are at the top,
though you'll need to add the "section" lines.

The first change, to test-c-strncasecmp.c, disables two string-compares
that fail in EBCDIC. (\304 == 'D' in the 1047 encoding.)

The second change, to test-sigpipe.sh, fixes what looked like a typo.
(A goes with A, B goes with B, so what should go with C...)


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Changes-for-z-OS.patch --]
[-- Type: text/x-patch; name="0001-Changes-for-z-OS.patch", Size: 1616 bytes --]

2016-08-19  Daniel Richard G.  <skunk@iSKUNK.ORG>

	* tests/test-c-strncasecmp.c: Allow two c_strncasecmp() calls
	which assume ASCII encoding semantics to run only in ASCII
	mode, as they fail in EBCDIC

	* tests/test-sigpipe.sh: Fixed typo

---
 tests/test-c-strncasecmp.c | 3 +++
 tests/test-sigpipe.sh      | 2 +-

diff --git a/tests/test-c-strncasecmp.c b/tests/test-c-strncasecmp.c
index 1ca42d8..349f6b3 100644
--- a/tests/test-c-strncasecmp.c
+++ b/tests/test-c-strncasecmp.c
@@ -19,6 +19,7 @@
 #include <config.h>
 
 #include "c-strcase.h"
+#include "c-ctype.h"
 
 #include <locale.h>
 #include <string.h>
@@ -71,9 +72,11 @@ main (int argc, char *argv[])
   ASSERT (c_strncasecmp ("\303\266zg\303\274r", "\303\226ZG\303\234R", 99) > 0); /* özgür */
   ASSERT (c_strncasecmp ("\303\226ZG\303\234R", "\303\266zg\303\274r", 99) < 0); /* özgür */
 
+#if C_CTYPE_ASCII
   /* This test shows how strings of different size cannot compare equal.  */
   ASSERT (c_strncasecmp ("turkish", "TURK\304\260SH", 7) < 0);
   ASSERT (c_strncasecmp ("TURK\304\260SH", "turkish", 7) > 0);
+#endif
 
   return 0;
 }
diff --git a/tests/test-sigpipe.sh b/tests/test-sigpipe.sh
index bc2baf2..6cf3242 100755
--- a/tests/test-sigpipe.sh
+++ b/tests/test-sigpipe.sh
@@ -21,7 +21,7 @@ fi
 
 # Test signal's behaviour when a handler is installed.
 tmpfiles="$tmpfiles t-sigpipeC.tmp"
-./test-sigpipe${EXEEXT} B 2> t-sigpipeC.tmp | head -1 > /dev/null
+./test-sigpipe${EXEEXT} C 2> t-sigpipeC.tmp | head -1 > /dev/null
 if test -s t-sigpipeC.tmp; then
   LC_ALL=C tr -d '\r' < t-sigpipeC.tmp
   rm -fr $tmpfiles; exit 1
-- 
2.9.0



* Re: [PATCH] IBM z/OS + EBCDIC support
  2016-08-19  8:20                                       ` Daniel Richard G.
@ 2016-08-19 11:03                                         ` Bruno Haible
  2016-08-19 19:28                                         ` Paul Eggert
  1 sibling, 0 replies; 49+ messages in thread
From: Bruno Haible @ 2016-08-19 11:03 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Daniel Richard G.

Daniel Richard G. wrote:
> The second change, to test-sigpipe.sh, fixes what looked like a typo.
> (A goes with A, B goes with B, so what should go with C...)

Thanks. This change is definitely good.

Bruno




* Re: [PATCH] IBM z/OS + EBCDIC support
  2016-08-19  8:20                                       ` Daniel Richard G.
  2016-08-19 11:03                                         ` Bruno Haible
@ 2016-08-19 19:28                                         ` Paul Eggert
  2016-08-19 20:38                                           ` Daniel Richard G.
  1 sibling, 1 reply; 49+ messages in thread
From: Paul Eggert @ 2016-08-19 19:28 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Thanks, I installed those changes in your name, after reformatting the ChangeLog 
file and commit message to fit Gnulib style. You can see what that format is 
like by doing "git pull; git format-patch --stdout -1": the ChangeLog entry's 
contents duplicate the commit message, except that they're indented a tab and 
the 2nd (empty) line is omitted.


* Re: [PATCH] IBM z/OS + EBCDIC support
  2016-08-19 19:28                                         ` Paul Eggert
@ 2016-08-19 20:38                                           ` Daniel Richard G.
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Richard G. @ 2016-08-19 20:38 UTC (permalink / raw)
  To: Paul Eggert; +Cc: bug-gnulib

On Fri, 2016 Aug 19 12:28-0700, Paul Eggert wrote:
> Thanks, I installed those changes in your name, after reformatting the
> ChangeLog file and commit message to fit Gnulib style. You can see
> what that format is like by doing "git pull; git format-patch --stdout
> -1": the ChangeLog entry's contents duplicate the commit message,
> except that they're indented a tab and the 2nd (empty) line is
> omitted.

I see the formatting, but admit to uncertainty on the precise
wording desired.

Nevertheless, thanks for curating that and getting these changes in.
That wraps it up for my initial submission. Here's hoping the next batch
won't be terribly far off.


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.



* Re: z/OS configure triple
  2015-10-15  4:49                               ` Daniel Richard G.
  2016-08-18  0:47                                 ` Paul Eggert
@ 2019-12-19  4:57                                 ` Bruno Haible
  2019-12-20  0:22                                   ` Daniel Richard G.
  2019-12-19  5:16                                 ` z/OS, iconv, and charset aliases Bruno Haible
  2 siblings, 1 reply; 49+ messages in thread
From: Bruno Haible @ 2019-12-19  4:57 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Hi Daniel,

In <https://lists.gnu.org/archive/html/bug-gnulib/2015-10/msg00020.html>
you wrote:
> gnulib-zos-configure.patch: Changes to the Autoconf M4 code to support
> z/OS.

What is the host_os of the canonical triple in that environment?
Is it 'mvs' or 'openedition'?

I know that at the C preprocessor level, the test is 'defined __MVS__'.

Bruno




* Re: z/OS, iconv, and charset aliases
  2015-10-15  4:49                               ` Daniel Richard G.
  2016-08-18  0:47                                 ` Paul Eggert
  2019-12-19  4:57                                 ` z/OS configure triple Bruno Haible
@ 2019-12-19  5:16                                 ` Bruno Haible
  2019-12-19  5:21                                   ` Bruno Haible
  2019-12-20  4:38                                   ` Daniel Richard G.
  2 siblings, 2 replies; 49+ messages in thread
From: Bruno Haible @ 2019-12-19  5:16 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Hi Daniel,

In <https://lists.gnu.org/archive/html/bug-gnulib/2015-10/msg00020.html>
you submitted this patch, which Paul committed on 2016-08-18:

> Also, iconv_open() on
> z/OS does not recognize "ISO-8859-1", but "ISO8859-1" works.

diff --git a/tests/test-iconv.c b/tests/test-iconv.c
index ed715bd..a64c6dd 100644
--- a/tests/test-iconv.c
+++ b/tests/test-iconv.c
@@ -44,8 +44,14 @@ main ()
 #if HAVE_ICONV
   /* Assume that iconv() supports at least the encodings ASCII, ISO-8859-1,
      and UTF-8.  */
-  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO-8859-1");
-  iconv_t cd_utf8_to_88591 = iconv_open ("ISO-8859-1", "UTF-8");
+  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO8859-1");
+  iconv_t cd_utf8_to_88591 = iconv_open ("ISO8859-1", "UTF-8");
+

This part is not right. The approach we take regarding charset/encoding
aliases is that
  - locale_charset() produces canonicalized charset names
    (see localcharset.h for the precise list, e.g. "UTF-8" not "UTF8",
    "ISO-8859-1" not "ISO8859-1", "CP1252" not "WINDOWS-1252", etc.),
  - glibc is known to support these canonicalized charset names,
  - All functions are supposed to receive canonical, not system-dependent
    charset names.
In particular, the gnulib iconv_open module is supposed to receive an
encoding name such as "ISO-8859-1" as argument and, on platforms which
don't understand it, pass "ISO8859-1" (or whatever the platform likes)
to the platform's iconv_open() function. The way this is done is by
adding a gperf-syntax data file to the 'iconv_open' module. To create
such a file, I would need from you the list of encoding names, as z/OS
lists them. You can also take lib/iconv_open-aix.gperf as a template.

Packages such as gettext are passing an encoding name "ISO-8859-1" to
iconv_open, and the unit test is supposed to verify that this works.


2019-12-19  Bruno Haible  <bruno@clisp.org>

	iconv tests: Test canonicalized, not system-dependent, encoding names.
	* tests/test-iconv.c (main): Revert part of the 2016-08-17 patch.

diff --git a/tests/test-iconv.c b/tests/test-iconv.c
index 3fe9e30..ef1e681 100644
--- a/tests/test-iconv.c
+++ b/tests/test-iconv.c
@@ -44,8 +44,8 @@ main ()
 #if HAVE_ICONV
   /* Assume that iconv() supports at least the encodings ASCII, ISO-8859-1,
      and UTF-8.  */
-  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO8859-1");
-  iconv_t cd_utf8_to_88591 = iconv_open ("ISO8859-1", "UTF-8");
+  iconv_t cd_88591_to_utf8 = iconv_open ("UTF-8", "ISO-8859-1");
+  iconv_t cd_utf8_to_88591 = iconv_open ("ISO-8859-1", "UTF-8");
 
 #if defined __MVS__ && defined __IBMC__
   /* String literals below are in ASCII, not EBCDIC.  */




* Re: z/OS, iconv, and charset aliases
  2019-12-19  5:16                                 ` z/OS, iconv, and charset aliases Bruno Haible
@ 2019-12-19  5:21                                   ` Bruno Haible
  2019-12-20  4:38                                   ` Daniel Richard G.
  1 sibling, 0 replies; 49+ messages in thread
From: Bruno Haible @ 2019-12-19  5:21 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

> 2019-12-19  Bruno Haible  <bruno@clisp.org>
> 
> 	iconv tests: Test canonicalized, not system-dependent, encoding names.
> 	* tests/test-iconv.c (main): Revert part of the 2016-08-17 patch.

Addendum:

	* modules/iconv-tests (Depends-on): Add iconv_open.

diff --git a/modules/iconv-tests b/modules/iconv-tests
index 91e17f0..c1709f9 100644
--- a/modules/iconv-tests
+++ b/modules/iconv-tests
@@ -4,6 +4,7 @@ tests/signature.h
 tests/macros.h
 
 Depends-on:
+iconv_open
 
 configure.ac:
 




* Re: z/OS configure triple
  2019-12-19  4:57                                 ` z/OS configure triple Bruno Haible
@ 2019-12-20  0:22                                   ` Daniel Richard G.
  2019-12-20  6:29                                     ` Bruno Haible
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2019-12-20  0:22 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

On Wed, 2019 Dec 18 23:57-05:00, Bruno Haible wrote:
> Hi Daniel,
> 
> > gnulib-zos-configure.patch: Changes to the Autoconf M4 code to support
> > z/OS.
> 
> What is the host_os of the canonical triple in that environment?
> Is it 'mvs' or 'openedition'?
> 
> I know that at the C preprocessor level, the test is 'defined __MVS__'.

It's the latter; I haven't seen "mvs" used much by third parties, aside
from older folks at my org :)

$ /tmp/testdir/build-aux/config.guess
trap: /tmp/testdir/build-aux/config.guess 99: FSUM7327 signal number 13 not conventional
i370-ibm-openedition


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.



* Re: z/OS, iconv, and charset aliases
  2019-12-19  5:16                                 ` z/OS, iconv, and charset aliases Bruno Haible
  2019-12-19  5:21                                   ` Bruno Haible
@ 2019-12-20  4:38                                   ` Daniel Richard G.
  2019-12-20  8:19                                     ` Bruno Haible
  1 sibling, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2019-12-20  4:38 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1915 bytes --]

On Thu, 2019 Dec 19 00:16-05:00, Bruno Haible wrote:
> Hi Daniel,
> 
> In <https://lists.gnu.org/archive/html/bug-gnulib/2015-10/msg00020.html>
> you submitted this patch, which Paul committed on 2016-08-18:
> 
> > Also, iconv_open() on
> > z/OS does not recognize "ISO-8859-1", but "ISO8859-1" works.
> 
> [...]
> 
> This part is not right. The approach we take regarding charset/encoding
> aliases is that
>   - locale_charset() produces canonicalized charset names
>     (see localcharset.h for the precise list, e.g. "UTF-8" not "UTF8",
>     "ISO-8859-1" not "ISO8859-1", "CP1252" not "WINDOWS-1252", etc.),
>   - glibc is known to support these canonicalized charset names,
>   - All functions are supposed to receive canonical, not system-dependent
>     charset names.

Understood.

> In particular, the gnulib iconv_open module is supposed to receive an
> encoding name such as "ISO-8859-1" as argument and, on platforms which
> don't understand it, pass "ISO8859-1" (or whatever the platform likes)
> to the platform's iconv_open() function. The way this is done is by
> adding a gperf-syntax data file to the 'iconv_open' module. To create
> such a file, I would need from you the list of encoding names, as z/OS
> lists them. You can also take lib/iconv_open-aix.gperf as a template.

I've attached a file with the output of "iconv -l". The names appear 
consistent with what's in iconv_open-aix.gperf.

> Packages such as gettext are passing an encoding name "ISO-8859-1" to
> iconv_open, and the unit test is supposed to verify that this works.
> 
> 2019-12-19  Bruno Haible  <bruno@clisp.org>
> 
> 	iconv tests: Test canonicalized, not system-dependent, encoding names.
> 	* tests/test-iconv.c (main): Revert part of the 2016-08-17 patch.

So *that's* why test-iconv was breaking for me again :]


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.

[-- Attachment #2: zos-iconv-output.txt --]
[-- Type: text/plain, Size: 7708 bytes --]

$ iconv -l
Character sets:
	37	IBM-037
	256	00256
	259	00259
	273	IBM-273
	274	IBM-274
	275	IBM-275
	277	IBM-277
	278	IBM-278
	280	IBM-280
	281	IBM-281
	282	IBM-282
	284	IBM-284
	285	IBM-285
	286	00286
	290	IBM-290
	293	00293
	297	IBM-297
	300	IBM-300
	301	IBM-301
	367	00367
	420	IBM-420
	421	00421
	423	00423
	424	IBM-424
	425	IBM-425
	437	IBM-437
	500	IBM-500
	720	00720
	737	00737
	775	00775
	803	00803
	806	00806
	808	IBM-808
	813	ISO8859-7
	819	ISO8859-1
	833	IBM-833
	834	IBM-834
	835	IBM-835
	836	IBM-836
	837	IBM-837
	838	IBM-838
	848	IBM-848
	849	00849
	850	IBM-850
	851	00851
	852	IBM-852
	853	00853
	855	IBM-855
	856	IBM-856
	857	00857
	858	IBM-858
	859	IBM-859
	860	00860
	861	IBM-861
	862	IBM-862
	863	00863
	864	IBM-864
	865	00865
	866	IBM-866
	867	IBM-867
	868	00868
	869	IBM-869
	870	IBM-870
	871	IBM-871
	872	IBM-872
	874	TIS-620
	875	IBM-875
	876	00876
	878	00878
	880	IBM-880
	891	00891
	895	00895
	896	00896
	897	00897
	899	00899
	901	IBM-901
	902	IBM-902
	903	00903
	904	IBM-904
	905	00905
	912	ISO8859-2
	913	00913
	914	ISO8859-4
	915	ISO8859-5
	916	ISO8859-8
	918	00918
	920	ISO8859-9
	921	ISO8859-13
	922	IBM-922
	923	ISO8859-15
	924	IBM-924
	926	00926
	927	IBM-927
	928	IBM-928
	930	IBM-930
	931	00931
	932	IBM-eucJC
	933	IBM-933
	934	00934
	935	IBM-935
	936	IBM-936
	937	IBM-937
	938	IBM-938
	939	IBM-939
	941	00941
	942	IBM-942
	943	IBM-943
	944	00944
	946	IBM-946
	947	IBM-947
	948	IBM-948
	949	IBM-949
	950	BIG5
	951	IBM-951
	952	00952
	953	00953
	954	00954
	955	00955
	956	IBM-956
	957	IBM-957
	958	IBM-958
	959	IBM-959
	960	00960
	961	00961
	963	00963
	964	IBM-eucTW
	965	00965
	966	00966
	970	IBM-eucKR
	971	00971
	1002	01002
	1004	01004
	1006	01006
	1008	01008
	1009	01009
	1010	01010
	1011	01011
	1012	01012
	1013	01013
	1014	01014
	1015	01015
	1016	01016
	1017	01017
	1018	01018
	1019	01019
	1020	01020
	1021	01021
	1023	01023
	1025	IBM-1025
	1026	IBM-1026
	1027	IBM-1027
	1040	01040
	1041	01041
	1042	01042
	1043	01043
	1046	IBM-1046
	1047	IBM-1047
	1051	01051
	1088	IBM-1088
	1089	ISO8859-6
	1097	01097
	1098	01098
	1100	01100
	1101	01101
	1102	01102
	1103	01103
	1104	01104
	1105	01105
	1106	01106
	1107	01107
	1112	IBM-1112
	1114	01114
	1115	IBM-1115
	1122	IBM-1122
	1123	IBM-1123
	1124	IBM-1124
	1125	IBM-1125
	1126	IBM-1126
	1129	01129
	1130	01130
	1131	01131
	1132	01132
	1133	01133
	1137	01137
	1140	IBM-1140
	1141	IBM-1141
	1142	IBM-1142
	1143	IBM-1143
	1144	IBM-1144
	1145	IBM-1145
	1146	IBM-1146
	1147	IBM-1147
	1148	IBM-1148
	1149	IBM-1149
	1153	IBM-1153
	1154	IBM-1154
	1155	IBM-1155
	1156	IBM-1156
	1157	IBM-1157
	1158	IBM-1158
	1159	IBM-1159
	1160	IBM-1160
	1161	IBM-1161
	1162	01162
	1163	01163
	1164	01164
	1165	IBM-1165
	1166	01166
	1167	01167
	1168	01168
	1200	01200
	1202	01202
	1208	UTF-8
	1210	01210
	1232	01232
	1250	IBM-1250
	1251	IBM-1251
	1252	IBM-1252
	1253	IBM-1253
	1254	IBM-1254
	1255	IBM-1255
	1256	IBM-1256
	1257	01257
	1258	01258
	1275	01275
	1276	01276
	1277	01277
	1280	01280
	1281	01281
	1282	01282
	1283	01283
	1284	01284
	1285	01285
	1287	01287
	1288	01288
	1350	01350
	1351	01351
	1362	IBM-1362
	1363	IBM-1363
	1364	IBM-1364
	1370	IBM-1370
	1371	IBM-1371
	1374	01374
	1375	01375
	1380	IBM-1380
	1381	IBM-1381
	1382	01382
	1383	IBM-eucCN
	1385	01385
	1386	IBM-1386
	1388	IBM-1388
	1390	IBM-1390
	1391	01391
	1392	01392
	1399	IBM-1399
	4133	04133
	4369	04369
	4370	04370
	4371	04371
	4373	04373
	4374	04374
	4376	04376
	4378	04378
	4380	04380
	4381	04381
	4386	04386
	4393	04393
	4396	IBM-4396
	4397	04397
	4516	04516
	4517	04517
	4519	04519
	4520	04520
	4533	04533
	4596	04596
	4899	04899
	4904	04904
	4909	IBM-4909
	4929	04929
	4930	IBM-4930
	4931	04931
	4932	04932
	4933	IBM-4933
	4934	04934
	4944	04944
	4945	04945
	4946	IBM-4946
	4947	04947
	4948	04948
	4949	04949
	4951	04951
	4952	04952
	4953	04953
	4954	04954
	4955	04955
	4956	04956
	4957	04957
	4958	04958
	4959	04959
	4960	04960
	4961	04961
	4962	04962
	4963	04963
	4964	04964
	4965	04965
	4966	04966
	4967	04967
	4970	04970
	4971	IBM-4971
	4976	04976
	4992	04992
	4993	04993
	5012	05012
	5014	05014
	5023	05023
	5026	IBM-5026
	5028	05028
	5029	05029
	5031	IBM-5031
	5033	05033
	5035	IBM-5035
	5038	05038
	5039	05039
	5043	05043
	5045	05045
	5046	05046
	5047	05047
	5048	05048
	5049	05049
	5050	05050
	5052	ISO-2022-JP
	5053	IBM-5053
	5054	IBM-5054
	5055	IBM-5055
	5056	05056
	5067	05067
	5100	05100
	5104	05104
	5123	IBM-5123
	5137	05137
	5142	05142
	5143	05143
	5210	05210
	5211	05211
	5346	IBM-5346
	5347	IBM-5347
	5348	IBM-5348
	5349	IBM-5349
	5350	IBM-5350
	5351	IBM-5351
	5352	IBM-5352
	5353	05353
	5354	05354
	5470	05470
	5471	05471
	5472	05472
	5473	05473
	5476	05476
	5477	05477
	5478	05478
	5479	05479
	5486	05486
	5487	05487
	5488	IBM-5488
	5495	05495
	8229	08229
	8448	08448
	8482	IBM-8482
	8492	08492
	8493	08493
	8612	08612
	8629	08629
	8692	08692
	9025	09025
	9026	09026
	9027	IBM-9027
	9028	09028
	9030	09030
	9042	09042
	9044	IBM-9044
	9047	09047
	9048	09048
	9049	09049
	9056	09056
	9060	09060
	9061	IBM-9061
	9064	09064
	9066	09066
	9088	09088
	9089	09089
	9122	09122
	9124	09124
	9125	09125
	9127	09127
	9131	09131
	9139	09139
	9142	09142
	9144	09144
	9145	09145
	9146	09146
	9163	09163
	9238	IBM-9238
	9306	09306
	9444	09444
	9447	09447
	9448	09448
	9449	09449
	9572	09572
	9574	09574
	9575	09575
	9577	09577
	9580	09580
	12544	12544
	12588	12588
	12712	IBM-12712
	12725	12725
	12788	12788
	13121	IBM-13121
	13124	IBM-13124
	13125	13125
	13140	13140
	13143	13143
	13145	13145
	13152	13152
	13156	13156
	13157	13157
	13162	13162
	13184	13184
	13185	13185
	13218	13218
	13219	13219
	13221	13221
	13223	13223
	13235	13235
	13238	13238
	13240	13240
	13241	13241
	13242	13242
	13488	UCS-2
	13671	13671
	13676	13676
	16421	16421
	16684	IBM-16684
	16804	IBM-16804
	16821	16821
	16884	16884
	17221	17221
	17240	17240
	17248	IBM-17248
	17314	17314
	17331	17331
	17337	17337
	17354	17354
	17584	17584
	20517	20517
	20780	20780
	20917	20917
	20980	20980
	21314	21314
	21317	21317
	21344	21344
	21427	21427
	21433	21433
	21450	21450
	21680	21680
	24613	24613
	24876	24876
	24877	24877
	25013	25013
	25076	25076
	25426	25426
	25427	25427
	25428	25428
	25429	25429
	25431	25431
	25432	25432
	25433	25433
	25436	25436
	25437	25437
	25438	25438
	25439	25439
	25440	25440
	25441	25441
	25442	25442
	25444	25444
	25445	25445
	25450	25450
	25467	25467
	25473	25473
	25479	25479
	25480	25480
	25502	25502
	25503	25503
	25504	25504
	25508	25508
	25510	25510
	25512	25512
	25514	25514
	25518	25518
	25520	25520
	25522	25522
	25524	25524
	25525	25525
	25527	25527
	25546	25546
	25580	25580
	25616	25616
	25617	25617
	25618	25618
	25619	25619
	25664	25664
	25690	25690
	25691	25691
	28709	IBM-28709
	29109	29109
	29172	29172
	29522	29522
	29523	29523
	29524	29524
	29525	29525
	29527	29527
	29528	29528
	29529	29529
	29532	29532
	29533	29533
	29534	29534
	29535	29535
	29536	29536
	29537	29537
	29540	29540
	29541	29541
	29546	29546
	29614	29614
	29616	29616
	29618	29618
	29620	29620
	29621	29621
	29623	29623
	29712	29712
	29713	29713
	29714	29714
	29715	29715
	29760	29760
	32805	32805
	33058	33058
	33205	33205
	33268	33268
	33618	33618
	33619	33619
	33620	33620
	33621	33621
	33623	33623
	33624	33624
	33632	33632
	33636	33636
	33637	33637
	33665	33665
	33698	33698
	33699	33699
	33700	33700
	33717	33717
	33722	EUCJP
	37301	37301
	37719	37719
	37728	37728
	37732	37732
	37761	37761
	37813	37813
	41397	41397
	41460	41460
	41824	41824
	41828	41828
	45493	45493
	45556	45556
	45920	45920
	49589	49589
	49652	49652
	53668	IBM-53668
	53685	53685
	53748	53748
	54189	54189
	54191	IBM-54191
	54289	54289
	61696	61696
	61697	61697
	61698	61698
	61699	61699
	61700	61700
	61710	61710
	61711	61711
	61712	61712
	61953	61953
	61956	61956
	62337	62337
	62381	62381
	62383	IBM-62383


* Re: z/OS configure triple
  2019-12-20  0:22                                   ` Daniel Richard G.
@ 2019-12-20  6:29                                     ` Bruno Haible
  0 siblings, 0 replies; 49+ messages in thread
From: Bruno Haible @ 2019-12-20  6:29 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Hi Daniel,

> $ /tmp/testdir/build-aux/config.guess
> trap: /tmp/testdir/build-aux/config.guess 99: FSUM7327 signal number 13 not conventional
> i370-ibm-openedition

Thanks. So the wiki is right, and the m4/intl-thread-locale.m4 file is wrong.


2019-12-20  Bruno Haible  <bruno@clisp.org>

	localename, gettext: Fix host_os value for z/OS.
	* m4/intl-thread-locale.m4 (gt_FUNC_USELOCALE): Fix host_os value in
	cross-configuration code.

diff --git a/m4/intl-thread-locale.m4 b/m4/intl-thread-locale.m4
index f74f116..7f8817f 100644
--- a/m4/intl-thread-locale.m4
+++ b/m4/intl-thread-locale.m4
@@ -1,4 +1,4 @@
-# intl-thread-locale.m4 serial 6
+# intl-thread-locale.m4 serial 7
 dnl Copyright (C) 2015-2019 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -171,8 +171,8 @@ int main ()
          [gt_cv_func_uselocale_works=no],
          [# Guess no on AIX and z/OS, yes otherwise.
           case "$host_os" in
-            aix* | mvs*) gt_cv_func_uselocale_works="guessing no" ;;
-            *)           gt_cv_func_uselocale_works="guessing yes" ;;
+            aix* | openedition*) gt_cv_func_uselocale_works="guessing no" ;;
+            *)                   gt_cv_func_uselocale_works="guessing yes" ;;
           esac
          ])
       ])




* Re: z/OS, iconv, and charset aliases
  2019-12-20  4:38                                   ` Daniel Richard G.
@ 2019-12-20  8:19                                     ` Bruno Haible
  2019-12-20 18:23                                       ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Bruno Haible @ 2019-12-20  8:19 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 2498 bytes --]

Hi Daniel,

> I've attached a file with the output of "iconv -l". The names appear 
> consistent with what's in iconv_open-aix.gperf.

Thanks. From this, I think we can equate the following vendor names with
GNU canonical names:


Vendor name  Canonical name          References

00367        ASCII, ANSI_X3.4-1968   https://en.wikipedia.org/wiki/Code_page_367
                                     https://haible.de/bruno/charsets/conversion-tables/ASCII.html
ISO8859-1    ISO-8859-1
ISO8859-2    ISO-8859-2
ISO8859-4    ISO-8859-4
ISO8859-5    ISO-8859-5
ISO8859-6    ISO-8859-6
ISO8859-7    ISO-8859-7
ISO8859-8    ISO-8859-8
ISO8859-9    ISO-8859-9
ISO8859-13   ISO-8859-13
ISO8859-15   ISO-8859-15
IBM-437      CP437
IBM-850      CP850
IBM-852      CP852
IBM-855      CP855
IBM-856      CP856
IBM-861      CP861
IBM-862      CP862
IBM-864      CP864
IBM-866      CP866
IBM-869      CP869
TIS-620      CP874                   https://haible.de/bruno/charsets/conversion-tables/Thai.html
IBM-922      CP922
IBM-eucJC    CP932
IBM-943      CP943
IBM-949      CP949
IBM-1046     CP1046
IBM-1124     CP1124
IBM-1125     CP1125
IBM-1250     CP1250
IBM-1251     CP1251
IBM-1252     CP1252
IBM-1253     CP1253
IBM-1254     CP1254
IBM-1255     CP1255
IBM-1256     CP1256
IBM-eucCN    GB2312
EUCJP        EUC-JP
IBM-eucKR    EUC-KR
IBM-eucTW    EUC-TW
BIG5         BIG5
IBM-936      GBK
TIS-620      TIS-620
UTF-8        UTF-8


Fortunately, all encodings listed as locale encodings in
"Table 3. Supported language-territory names and LT codes for ASCII locales"
of https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.cbcpx01/locnamc.htm
are in this list.

Omitting identical names on both sides (e.g. BIG5 BIG5), I arrive at the two
attached patches.
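
For illustration only, the table above amounts to a lookup from a GNU
canonical name to the z/OS vendor name. Below is a minimal sketch that uses a
linear search instead of the gperf-generated hash in the patches. The names
struct charset_alias, zos_aliases and zos_vendor_name are hypothetical and
do not appear in Gnulib, and only a few rows of the table are copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the table: GNU canonical name and z/OS vendor name.  */
struct charset_alias
{
  const char *canonical_name;
  const char *vendor_name;
};

/* A few rows copied from the table above; the full list is longer.  */
static const struct charset_alias zos_aliases[] =
{
  { "ASCII",      "00367"     },
  { "ISO-8859-1", "ISO8859-1" },
  { "CP437",      "IBM-437"   },
  { "GB2312",     "IBM-eucCN" },
  { "EUC-KR",     "IBM-eucKR" },
};

/* Hypothetical helper: return the vendor name for CANONICAL_NAME, or
   CANONICAL_NAME itself when no row matches (names such as BIG5 or UTF-8
   are identical on both sides and need no row).  */
static const char *
zos_vendor_name (const char *canonical_name)
{
  size_t i;

  assert (canonical_name != NULL);
  for (i = 0; i < sizeof zos_aliases / sizeof zos_aliases[0]; i++)
    if (strcmp (canonical_name, zos_aliases[i].canonical_name) == 0)
      return zos_aliases[i].vendor_name;
  return canonical_name;
}
```

A caller would pass the result of zos_vendor_name ("GB2312") to the system's
iconv_open() in place of the canonical name.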


2019-12-20  Bruno Haible  <bruno@clisp.org>

	iconv_open: Add support for z/OS encoding names.
	Reported by Daniel Richard G. in
	<https://lists.gnu.org/archive/html/bug-gnulib/2019-12/msg00172.html>.
	* lib/iconv_open-zos.gperf: New file.
	* modules/iconv_open (Files): Add iconv_open-zos.gperf.
	(Makefile.am): Add rules for generating iconv_open-zos.h from it.
	* lib/iconv_open.c (ICONV_FLAVOR_ZOS): New macro.
	* m4/iconv_open.m4 (gl_FUNC_ICONV_OPEN): On z/OS, use ICONV_FLAVOR_ZOS.
	* doc/posix-functions/iconv_open.texi: Mention z/OS.

2019-12-20  Bruno Haible  <bruno@clisp.org>

	localcharset: Add support for z/OS encoding names.
	* lib/localcharset.h: Mention which encodings are used as locale
	encodings on z/OS.



[-- Attachment #2: 0001-iconv_open-Add-support-for-z-OS-encoding-names.patch --]
[-- Type: text/x-patch, Size: 8174 bytes --]

From 49e78fcade5457b00b877fa7f7309056076a9b53 Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@clisp.org>
Date: Fri, 20 Dec 2019 09:12:37 +0100
Subject: [PATCH 1/2] iconv_open: Add support for z/OS encoding names.

Reported by Daniel Richard G. in
<https://lists.gnu.org/archive/html/bug-gnulib/2019-12/msg00172.html>.

* lib/iconv_open-zos.gperf: New file.
* modules/iconv_open (Files): Add iconv_open-zos.gperf.
(Makefile.am): Add rules for generating iconv_open-zos.h from it.
* lib/iconv_open.c (ICONV_FLAVOR_ZOS): New macro.
* m4/iconv_open.m4 (gl_FUNC_ICONV_OPEN): On z/OS, use ICONV_FLAVOR_ZOS.
* doc/posix-functions/iconv_open.texi: Mention z/OS.
---
 ChangeLog                           | 12 +++++++
 doc/posix-functions/iconv_open.texi |  2 +-
 lib/iconv_open-zos.gperf            | 68 +++++++++++++++++++++++++++++++++++++
 lib/iconv_open.c                    |  1 +
 m4/iconv_open.m4                    | 13 +++----
 modules/iconv_open                  | 12 ++++---
 6 files changed, 97 insertions(+), 11 deletions(-)
 create mode 100644 lib/iconv_open-zos.gperf

diff --git a/ChangeLog b/ChangeLog
index dece371..9d6d06b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,17 @@
 2019-12-20  Bruno Haible  <bruno@clisp.org>
 
+	iconv_open: Add support for z/OS encoding names.
+	Reported by Daniel Richard G. in
+	<https://lists.gnu.org/archive/html/bug-gnulib/2019-12/msg00172.html>.
+	* lib/iconv_open-zos.gperf: New file.
+	* modules/iconv_open (Files): Add iconv_open-zos.gperf.
+	(Makefile.am): Add rules for generating iconv_open-zos.h from it.
+	* lib/iconv_open.c (ICONV_FLAVOR_ZOS): New macro.
+	* m4/iconv_open.m4 (gl_FUNC_ICONV_OPEN): On z/OS, use ICONV_FLAVOR_ZOS.
+	* doc/posix-functions/iconv_open.texi: Mention z/OS.
+
+2019-12-20  Bruno Haible  <bruno@clisp.org>
+
 	doc: Document the problem of the per-thread locale functions on z/OS.
 	* doc/posix-functions/uselocale.texi: Document the z/OS problem.
 	* doc/posix-functions/newlocale.texi: Likewise.
diff --git a/doc/posix-functions/iconv_open.texi b/doc/posix-functions/iconv_open.texi
index a70e1f5..d5f05ee 100644
--- a/doc/posix-functions/iconv_open.texi
+++ b/doc/posix-functions/iconv_open.texi
@@ -20,7 +20,7 @@ Portability problems fixed by Gnulib module @code{iconv_open}:
 @item
 This function recognizes only non-standard aliases for many encodings (not
 the IANA registered encoding names) on many platforms:
-AIX 5.1, HP-UX 11, IRIX 6.5, Solaris 11 2010-11.
+AIX 5.1, HP-UX 11, IRIX 6.5, Solaris 11 2010-11, z/OS.
 @end itemize
 
 Portability problems fixed by Gnulib module @code{iconv_open-utf}:
diff --git a/lib/iconv_open-zos.gperf b/lib/iconv_open-zos.gperf
new file mode 100644
index 0000000..d44b5d7
--- /dev/null
+++ b/lib/iconv_open-zos.gperf
@@ -0,0 +1,68 @@
+/* Character set conversion.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License along
+   with this program; if not, see <https://www.gnu.org/licenses/>.  */
+
+struct mapping { int standard_name; const char vendor_name[10 + 1]; };
+%struct-type
+%language=ANSI-C
+%define slot-name standard_name
+%define hash-function-name mapping_hash
+%define lookup-function-name mapping_lookup
+%readonly-tables
+%global-table
+%define word-array-name mappings
+%pic
+%%
+ASCII, "00367"
+ISO-8859-1, "ISO8859-1"
+ISO-8859-2, "ISO8859-2"
+ISO-8859-4, "ISO8859-4"
+ISO-8859-5, "ISO8859-5"
+ISO-8859-6, "ISO8859-6"
+ISO-8859-7, "ISO8859-7"
+ISO-8859-8, "ISO8859-8"
+ISO-8859-9, "ISO8859-9"
+ISO-8859-13, "ISO8859-13"
+ISO-8859-15, "ISO8859-15"
+CP437, "IBM-437"
+CP850, "IBM-850"
+CP852, "IBM-852"
+CP855, "IBM-855"
+CP856, "IBM-856"
+CP861, "IBM-861"
+CP862, "IBM-862"
+CP864, "IBM-864"
+CP866, "IBM-866"
+CP869, "IBM-869"
+CP874, "TIS-620"
+CP922, "IBM-922"
+CP932, "IBM-eucJC"
+CP943, "IBM-943"
+CP949, "IBM-949"
+CP1046, "IBM-1046"
+CP1124, "IBM-1124"
+CP1125, "IBM-1125"
+CP1250, "IBM-1250"
+CP1251, "IBM-1251"
+CP1252, "IBM-1252"
+CP1253, "IBM-1253"
+CP1254, "IBM-1254"
+CP1255, "IBM-1255"
+CP1256, "IBM-1256"
+GB2312, "IBM-eucCN"
+EUC-JP, "EUCJP"
+EUC-KR, "IBM-eucKR"
+EUC-TW, "IBM-eucTW"
+GBK, "IBM-936"
diff --git a/lib/iconv_open.c b/lib/iconv_open.c
index 928ccf2..918b89c 100644
--- a/lib/iconv_open.c
+++ b/lib/iconv_open.c
@@ -36,6 +36,7 @@
 #define ICONV_FLAVOR_IRIX "iconv_open-irix.h"
 #define ICONV_FLAVOR_OSF "iconv_open-osf.h"
 #define ICONV_FLAVOR_SOLARIS "iconv_open-solaris.h"
+#define ICONV_FLAVOR_ZOS "iconv_open-zos.h"
 
 #ifdef ICONV_FLAVOR
 # include ICONV_FLAVOR
diff --git a/m4/iconv_open.m4 b/m4/iconv_open.m4
index bfcd354..b4730a9 100644
--- a/m4/iconv_open.m4
+++ b/m4/iconv_open.m4
@@ -1,4 +1,4 @@
-# iconv_open.m4 serial 15
+# iconv_open.m4 serial 16
 dnl Copyright (C) 2007-2019 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -23,11 +23,12 @@ AC_DEFUN([gl_FUNC_ICONV_OPEN],
     if test $gl_func_iconv_gnu = no; then
       iconv_flavor=
       case "$host_os" in
-        aix*)     iconv_flavor=ICONV_FLAVOR_AIX ;;
-        irix*)    iconv_flavor=ICONV_FLAVOR_IRIX ;;
-        hpux*)    iconv_flavor=ICONV_FLAVOR_HPUX ;;
-        osf*)     iconv_flavor=ICONV_FLAVOR_OSF ;;
-        solaris*) iconv_flavor=ICONV_FLAVOR_SOLARIS ;;
+        aix*)         iconv_flavor=ICONV_FLAVOR_AIX ;;
+        irix*)        iconv_flavor=ICONV_FLAVOR_IRIX ;;
+        hpux*)        iconv_flavor=ICONV_FLAVOR_HPUX ;;
+        osf*)         iconv_flavor=ICONV_FLAVOR_OSF ;;
+        solaris*)     iconv_flavor=ICONV_FLAVOR_SOLARIS ;;
+        openedition*) iconv_flavor=ICONV_FLAVOR_ZOS ;;
       esac
       if test -n "$iconv_flavor"; then
         AC_DEFINE_UNQUOTED([ICONV_FLAVOR], [$iconv_flavor],
diff --git a/modules/iconv_open b/modules/iconv_open
index 7032dca..3486901 100644
--- a/modules/iconv_open
+++ b/modules/iconv_open
@@ -8,6 +8,7 @@ lib/iconv_open-hpux.gperf
 lib/iconv_open-irix.gperf
 lib/iconv_open-osf.gperf
 lib/iconv_open-solaris.gperf
+lib/iconv_open-zos.gperf
 lib/iconv.c
 lib/iconv_close.c
 m4/iconv_open.m4
@@ -48,10 +49,13 @@ $(srcdir)/iconv_open-osf.h: $(srcdir)/iconv_open-osf.gperf
 $(srcdir)/iconv_open-solaris.h: $(srcdir)/iconv_open-solaris.gperf
 	$(V_GPERF)$(GPERF) -m 10 $(srcdir)/iconv_open-solaris.gperf > $(srcdir)/iconv_open-solaris.h-t && \
 	mv $(srcdir)/iconv_open-solaris.h-t $(srcdir)/iconv_open-solaris.h
-BUILT_SOURCES        += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h
-MOSTLYCLEANFILES     += iconv_open-aix.h-t iconv_open-hpux.h-t iconv_open-irix.h-t iconv_open-osf.h-t iconv_open-solaris.h-t
-MAINTAINERCLEANFILES += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h
-EXTRA_DIST           += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h
+$(srcdir)/iconv_open-zos.h: $(srcdir)/iconv_open-zos.gperf
+	$(V_GPERF)$(GPERF) -m 10 $(srcdir)/iconv_open-zos.gperf > $(srcdir)/iconv_open-zos.h-t && \
+	mv $(srcdir)/iconv_open-zos.h-t $(srcdir)/iconv_open-zos.h
+BUILT_SOURCES        += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h iconv_open-zos.h
+MOSTLYCLEANFILES     += iconv_open-aix.h-t iconv_open-hpux.h-t iconv_open-irix.h-t iconv_open-osf.h-t iconv_open-solaris.h-t iconv_open-zos.h-t
+MAINTAINERCLEANFILES += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h iconv_open-zos.h
+EXTRA_DIST           += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h iconv_open-zos.h
 
 Include:
 <iconv.h>
-- 
2.7.4


[-- Attachment #3: 0002-localcharset-Add-support-for-z-OS-encoding-names.patch --]
[-- Type: text/x-patch, Size: 5308 bytes --]

From 3f7d8da2ee9e513a9db318dc9c4aa91ca6ed8b3b Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@clisp.org>
Date: Fri, 20 Dec 2019 09:17:20 +0100
Subject: [PATCH 2/2] localcharset: Add support for z/OS encoding names.

* lib/localcharset.h: Mention which encodings are used as locale
encodings on z/OS.
---
 ChangeLog          |  6 ++++++
 lib/localcharset.h | 24 ++++++++++++------------
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 9d6d06b..9a47dbc 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2019-12-20  Bruno Haible  <bruno@clisp.org>
 
+	localcharset: Add support for z/OS encoding names.
+	* lib/localcharset.h: Mention which encodings are used as locale
+	encodings on z/OS.
+
+2019-12-20  Bruno Haible  <bruno@clisp.org>
+
 	iconv_open: Add support for z/OS encoding names.
 	Reported by Daniel Richard G. in
 	<https://lists.gnu.org/archive/html/bug-gnulib/2019-12/msg00172.html>.
diff --git a/lib/localcharset.h b/lib/localcharset.h
index 5897140..81ebfae 100644
--- a/lib/localcharset.h
+++ b/lib/localcharset.h
@@ -48,15 +48,15 @@ extern const char * locale_charset (void);
                                     (darwin = Mac OS X, windows = native Windows)
 
    ASCII, ANSI_X3.4-1968       glibc solaris freebsd netbsd darwin minix cygwin
-   ISO-8859-1              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin
-   ISO-8859-2              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin
+   ISO-8859-1              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin zos
+   ISO-8859-2              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin zos
    ISO-8859-3              Y   glibc solaris cygwin
    ISO-8859-4              Y   hpux osf solaris freebsd netbsd openbsd darwin
-   ISO-8859-5              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin
+   ISO-8859-5              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin zos
    ISO-8859-6              Y   glibc aix hpux solaris cygwin
-   ISO-8859-7              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin
-   ISO-8859-8              Y   glibc aix hpux osf solaris cygwin
-   ISO-8859-9              Y   glibc aix hpux irix osf solaris freebsd darwin cygwin
+   ISO-8859-7              Y   glibc aix hpux irix osf solaris freebsd netbsd openbsd darwin cygwin zos
+   ISO-8859-8              Y   glibc aix hpux osf solaris cygwin zos
+   ISO-8859-9              Y   glibc aix hpux irix osf solaris freebsd darwin cygwin zos
    ISO-8859-13                 glibc hpux solaris freebsd netbsd openbsd darwin cygwin
    ISO-8859-14                 glibc cygwin
    ISO-8859-15                 glibc aix irix osf solaris freebsd netbsd openbsd darwin cygwin
@@ -79,7 +79,7 @@ extern const char * locale_charset (void);
    CP874                       windows dos
    CP922                       aix
    CP932                       aix cygwin windows dos
-   CP943                       aix
+   CP943                       aix zos
    CP949                       osf darwin windows dos
    CP950                       windows dos
    CP1046                      aix
@@ -95,17 +95,17 @@ extern const char * locale_charset (void);
    CP1255                      glibc windows
    CP1256                      windows
    CP1257                      windows
-   GB2312                  Y   glibc aix hpux irix solaris freebsd netbsd darwin cygwin
+   GB2312                  Y   glibc aix hpux irix solaris freebsd netbsd darwin cygwin zos
    EUC-JP                  Y   glibc aix hpux irix osf solaris freebsd netbsd darwin cygwin
-   EUC-KR                  Y   glibc aix hpux irix osf solaris freebsd netbsd darwin cygwin
+   EUC-KR                  Y   glibc aix hpux irix osf solaris freebsd netbsd darwin cygwin zos
    EUC-TW                      glibc aix hpux irix osf solaris netbsd
-   BIG5                    Y   glibc aix hpux osf solaris freebsd netbsd darwin cygwin
+   BIG5                    Y   glibc aix hpux osf solaris freebsd netbsd darwin cygwin zos
    BIG5-HKSCS                  glibc hpux solaris netbsd darwin
    GBK                         glibc aix osf solaris freebsd darwin cygwin windows dos
    GB18030                     glibc hpux solaris freebsd netbsd darwin
    SHIFT_JIS               Y   hpux osf solaris freebsd netbsd darwin
    JOHAB                       glibc solaris windows
-   TIS-620                     glibc aix hpux osf solaris cygwin
+   TIS-620                     glibc aix hpux osf solaris cygwin zos
    VISCII                  Y   glibc
    TCVN5712-1                  glibc
    ARMSCII-8                   glibc freebsd netbsd darwin
@@ -119,7 +119,7 @@ extern const char * locale_charset (void);
    HP-KANA8                    hpux
    DEC-KANJI                   osf
    DEC-HANYU                   osf
-   UTF-8                   Y   glibc aix hpux osf solaris netbsd darwin cygwin
+   UTF-8                   Y   glibc aix hpux osf solaris netbsd darwin cygwin zos
 
    Note: Names which are not marked as being a MIME name should not be used in
    Internet protocols for information interchange (mail, news, etc.).
-- 
2.7.4



* Re: z/OS, iconv, and charset aliases
  2019-12-20  8:19                                     ` Bruno Haible
@ 2019-12-20 18:23                                       ` Daniel Richard G.
  2019-12-21  5:49                                         ` z/OS, iconv, and gperf Bruno Haible
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2019-12-20 18:23 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

On Fri, 2019 Dec 20 03:19-05:00, Bruno Haible wrote:
> 
> Thanks. From this, I think we can equate the following vendor names
> with GNU canonical names:

Is there a good test to ensure that the conversions are as expected? I
wouldn't put it past IBM to use a strange variant of some of these
otherwise-familiar encodings...

> Omitting identical names on both sides (e.g. BIG5 BIG5), I arrive at
> the two attached patches.

Git 3f7d8da2 gives me this build error:

make[3]: Entering directory `/tmp/gnulib-build/gllib'
source='/tmp/testdir/gllib/iconv_open.c' object='iconv_open.o' libtool=no \
DEPDIR=.deps depmode=aix /bin/sh /tmp/testdir/build-aux/depcomp \
xlc-wrap -DHAVE_CONFIG_H -I. -I/tmp/testdir/gllib -I..  -DGNULIB_STRICT_CHECKING=1 -D_UNIX95_THREADS -D_XOPEN_SOURCE=600 -DNSIG=39 -qhaltonmsg=CCN3296  -g -q64 -qfloat=ieee -qlanglvl=extc99 -qenumsize=4  -c -o iconv_open.o /tmp/testdir/gllib/iconv_open.c
ERROR CCN3205 /tmp/testdir/gllib/iconv_open-zos.h:29    "gperf generated tables don't work with this execution character set. Please report a bug to <bug-gperf@gnu.org>."
CCN0793(I) Compilation failed for file /tmp/testdir/gllib/iconv_open.c.  Object file not created.
make[3]: *** [iconv_open.o] Error 12

Normally, everything builds using EBCDIC on this system. (There are ways
of compiling ASCII source, but that's not the usual way of working.)

There isn't a way to compile gperf tables in an encoding-agnostic
manner?

--Daniel

-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


* Re: z/OS, iconv, and gperf
  2019-12-20 18:23                                       ` Daniel Richard G.
@ 2019-12-21  5:49                                         ` Bruno Haible
  2020-01-09  5:48                                           ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Bruno Haible @ 2019-12-21  5:49 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Hi Daniel,

> > Thanks. From this, I think we can equate the following vendor names
> > with GNU canonical names:
> 
> Is there a good test to ensure that the conversions are as expected? I
> wouldn't put it past IBM to use a strange variant of some of these
> otherwise-familiar encodings...

Oh, certainly: many of the IBM-nnn encodings are variants of what Microsoft
and the rest of the world call codepage nnn. You can find an extensive
comparison at https://haible.de/bruno/charsets/conversion-tables/index.html .

You can find the tools to extract and compare the conversion tables here:
https://haible.de/bruno/charsets/conversion-tables/tools.html

> > Omitting identical names on both sides (e.g. BIG5 BIG5), I arrive at
> > the two attached patches.
> 
> Git 3f7d8da2 gives me this build error:
> 
> make[3]: Entering directory `/tmp/gnulib-build/gllib'
> source='/tmp/testdir/gllib/iconv_open.c' object='iconv_open.o' libtool=no \
> DEPDIR=.deps depmode=aix /bin/sh /tmp/testdir/build-aux/depcomp \
> xlc-wrap -DHAVE_CONFIG_H -I. -I/tmp/testdir/gllib -I..  -DGNULIB_STRICT_CHECKING=1 -D_UNIX95_THREADS -D_XOPEN_SOURCE=600 -DNSIG=39 -qhaltonmsg=CCN3296  -g -q64 -qfloat=ieee -qlanglvl=extc99 -qenumsize=4  -c -o iconv_open.o /tmp/testdir/gllib/iconv_open.c
> ERROR CCN3205 /tmp/testdir/gllib/iconv_open-zos.h:29    "gperf generated tables don't work with this execution character set. Please report a bug to <bug-gperf@gnu.org>."
> CCN0793(I) Compilation failed for file /tmp/testdir/gllib/iconv_open.c.  Object file not created.
> make[3]: *** [iconv_open.o] Error 12
> 
> Normally, everything builds using EBCDIC on this system. (There are ways
> of compiling ASCII source, but that's not the usual way of working.)
> 
> There isn't a way to compile gperf tables in an encoding-agnostic
> manner?

No. gperf works by using character values as indices into arrays; the
arrays are filled by gperf at code generation time.
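
To make this concrete, here is a toy sketch (not actual gperf output) of a
table indexed by a byte value. The slots are chosen for ASCII byte values,
so the same keyword encoded in EBCDIC selects a different, empty slot. The
names toy_asso_values and toy_slot are hypothetical; the EBCDIC bytes are
those of "FOO" in a code page such as IBM-1047.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a gperf association table: indexed by the first byte of
   the key.  The slots were chosen for ASCII byte values ('B' = 0x42,
   'F' = 0x46); 0 means "no keyword starts with this byte".  */
static const unsigned char toy_asso_values[256] =
{
  [0x42] = 1,
  [0x46] = 2,
};

/* Hypothetical helper: return the table slot selected by KEY's first byte.  */
static unsigned int
toy_slot (const unsigned char *key, size_t len)
{
  assert (key != NULL && len > 0);
  return toy_asso_values[key[0]];
}

/* "FOO" as bytes in ASCII and in EBCDIC.  */
static const unsigned char foo_ascii[]  = { 0x46, 0x4F, 0x4F };
static const unsigned char foo_ebcdic[] = { 0xC6, 0xD6, 0xD6 };
```

With these definitions, toy_slot (foo_ascii, 3) finds the slot reserved for
'F', while toy_slot (foo_ebcdic, 3) lands on an unused slot, which is the
kind of mismatch the #error in the generated header guards against.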

Can you experiment with the pragmas to resolve this? For this, it is
best to take the gperf source distribution, remove the part that emits
the error message in gperf/src/output.cc:2103, and then work with
"make check" to get things working.

Bruno




* Re: z/OS, iconv, and gperf
  2019-12-21  5:49                                         ` z/OS, iconv, and gperf Bruno Haible
@ 2020-01-09  5:48                                           ` Daniel Richard G.
  2020-01-19 21:52                                             ` Bruno Haible
  2020-01-19 21:59                                             ` Bruno Haible
  0 siblings, 2 replies; 49+ messages in thread
From: Daniel Richard G. @ 2020-01-09  5:48 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

On Sat, 2019 Dec 21 00:49-05:00, Bruno Haible wrote:
> 
> Oh, certainly many of the IBM-nnn encodings are variants of what
> Microsoft and the rest of the world do regarding codepage nnn. Find an
> extensive comparison at
> https://haible.de/bruno/charsets/conversion-tables/index.html .
>
> You find the tools to extract the conversion tables and compare
> them here:
> https://haible.de/bruno/charsets/conversion-tables/tools.html

I downloaded the tools, and gave them a try. I will discuss sending you
the resulting information in a private message, as it is fairly large.

> > There isn't a way to compile gperf tables in an encoding-agnostic
> > manner?
>
> No. gperf works by using character values as indices into arrays; the
> arrays are filled by gperf at code generation time.
>
> Can you experiment with the pragmas to resolve this? For this, you
> best take the gperf source distribution, remove the part that emits
> the error message in gperf/src/output.cc:2103, and then work with
> "make check" to get things working.

What is the intended outcome, however? There are pragmas to change the
encoding assumed by the compiler in character/string literals, but if
that is set to ASCII, then the compiled code will also assume ASCII
input, which would typically not be the case on this system.

I suppose in theory, gperf could be given an option to generate
code that expects EBCDIC instead of ASCII, and that source could be
used on this system. However, gperf has no such encoding-related
option, probably because anything other than ASCII is too niche for
their purposes.

--Daniel

-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


* Re: z/OS, iconv, and gperf
  2020-01-09  5:48                                           ` Daniel Richard G.
@ 2020-01-19 21:52                                             ` Bruno Haible
  2020-01-19 21:59                                             ` Bruno Haible
  1 sibling, 0 replies; 49+ messages in thread
From: Bruno Haible @ 2020-01-19 21:52 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Hi Daniel,

> > Oh, certainly many of the IBM-nnn encodings are variants of what
> > Microsoft and the rest of the world do regarding codepage nnn. Find an
> > extensive comparison at
> > https://haible.de/bruno/charsets/conversion-tables/index.html .
> >
> > You find the tools to extract the conversion tables and compare
> > them here:
> > https://haible.de/bruno/charsets/conversion-tables/tools.html
> 
> I downloaded the tools, and gave them a try. I will discuss sending you
> the resulting information in a private message, as it is fairly large.

Thank you. With this information, I updated the charsets comparison
site at https://haible.de/bruno/charsets/conversion-tables/ . It turns
out that z/OS provides a couple of encodings under names that we did not
guess. Also, for some encodings, a non-intuitive vendor name gives a
conversion that is closer to what one would expect. For example, "04962"
is better than "IBM-866"
(see https://haible.de/bruno/charsets/conversion-tables/CP866.html).
Finally, for EUC-TW there is no really suitable z/OS encoding; "IBM-eucTW"
differs too much from the standard (as measured by 'table-diff').


2020-01-19  Bruno Haible  <bruno@clisp.org>

	iconv_open: Improve z/OS support.
	* lib/iconv_open-zos.gperf: Choose better aliases. Add mapping for
	ISO-8859-3, KOI8-R, KOI8-U, CP775, CP857, CP865, CP1129, CP1131, CP1257.
	Remove mapping for EUC-TW.

diff --git a/lib/iconv_open-zos.gperf b/lib/iconv_open-zos.gperf
index 00e696e..918fdb9 100644
--- a/lib/iconv_open-zos.gperf
+++ b/lib/iconv_open-zos.gperf
@@ -28,41 +28,49 @@ struct mapping { int standard_name; const char vendor_name[10 + 1]; };
 ASCII, "00367"
 ISO-8859-1, "ISO8859-1"
 ISO-8859-2, "ISO8859-2"
+ISO-8859-3, "00913"
 ISO-8859-4, "ISO8859-4"
 ISO-8859-5, "ISO8859-5"
 ISO-8859-6, "ISO8859-6"
 ISO-8859-7, "ISO8859-7"
-ISO-8859-8, "ISO8859-8"
+ISO-8859-8, "05012"
 ISO-8859-9, "ISO8859-9"
 ISO-8859-13, "ISO8859-13"
 ISO-8859-15, "ISO8859-15"
+KOI8-R, "00878"
+KOI8-U, "01168"
 CP437, "IBM-437"
-CP850, "IBM-850"
+CP775, "00775"
+CP850, "09042"
 CP852, "IBM-852"
-CP855, "IBM-855"
+CP855, "13143"
 CP856, "IBM-856"
+CP857, "00857"
 CP861, "IBM-861"
 CP862, "IBM-862"
 CP864, "IBM-864"
-CP866, "IBM-866"
+CP865, "00865"
+CP866, "04962"
 CP869, "IBM-869"
 CP874, "TIS-620"
 CP922, "IBM-922"
-CP932, "IBM-eucJC"
+CP932, "IBM-943"
 CP943, "IBM-943"
-CP949, "IBM-949"
+CP949, "IBM-1363"
 CP1046, "IBM-1046"
 CP1124, "IBM-1124"
 CP1125, "IBM-1125"
-CP1250, "IBM-1250"
-CP1251, "IBM-1251"
-CP1252, "IBM-1252"
-CP1253, "IBM-1253"
-CP1254, "IBM-1254"
-CP1255, "IBM-1255"
-CP1256, "IBM-1256"
+CP1129, "01129"
+CP1131, "01131"
+CP1250, "IBM-5346"
+CP1251, "IBM-5347"
+CP1252, "IBM-5348"
+CP1253, "IBM-5349"
+CP1254, "IBM-5350"
+CP1255, "09447"
+CP1256, "09448"
+CP1257, "09449"
 GB2312, "IBM-eucCN"
-EUC-JP, "EUCJP"
+EUC-JP, "01350"
 EUC-KR, "IBM-eucKR"
-EUC-TW, "IBM-eucTW"
-GBK, "IBM-936"
+GBK, "IBM-1386"




* Re: z/OS, iconv, and gperf
  2020-01-09  5:48                                           ` Daniel Richard G.
  2020-01-19 21:52                                             ` Bruno Haible
@ 2020-01-19 21:59                                             ` Bruno Haible
  2020-01-19 22:32                                               ` Daniel Richard G.
  1 sibling, 1 reply; 49+ messages in thread
From: Bruno Haible @ 2020-01-19 21:59 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Hi Daniel,

> > > There isn't a way to compile gperf tables in an encoding-agnostic
> > > manner?
> >
> > No. gperf works by using character values as indices into arrays; the
> > arrays are filled by gperf at code generation time.
> >
> > Can you experiment with the pragmas to resolve this? For this, you
> > best take the gperf source distribution, remove the part that emits
> > the error message in gperf/src/output.cc:2103, and then work with
> > "make check" to get things working.
> 
> What is the intended outcome, however? There are pragmas to change the
> encoding assumed by the compiler in character/string literals, but if
> that is set to ASCII, then the compiled code will also assume ASCII
> input, which would typically not be the case on this system.
> 
> I suppose in theory, gperf could be given an option to generate
> code that expects EBCDIC instead of ASCII, and that source could be
> used on this system. However, gperf has no such encoding-related
> option, probably because anything other than ASCII is too niche for
> their purposes.

The intended outcome is that a gperf-generated mapping function, say,
for
    FOO, "BAR"
performs equivalently to
    if (strcmp (arg, "FOO") == 0)
      return "BAR";
just faster. Can you find the suitable compiler settings and #pragmas
- inside and outside the gperf-generated code - to make this happen?
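
For reference, the behaviour described above can be sketched as a plain
linear search. This is only an illustration: the names mapping_ref,
table_ref and lookup_ref are hypothetical and are not gperf output, and
the two entries are copied from the z/OS table earlier in this thread.
Because it compares whole strings with strcmp, it gives the same answers
under any execution character set, as long as the table literals and the
argument use the same encoding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reference table; a gperf-generated lookup is meant to
   return the same answers, only faster.  */
struct mapping_ref { const char *standard_name; const char *vendor_name; };

static const struct mapping_ref table_ref[] = {
  { "ISO-8859-1", "ISO8859-1" },
  { "CP1252", "IBM-5348" },
};

/* Return the vendor name for ARG, or NULL if there is no mapping.  */
static const char *
lookup_ref (const char *arg)
{
  size_t i;
  for (i = 0; i < sizeof table_ref / sizeof table_ref[0]; i++)
    if (strcmp (arg, table_ref[i].standard_name) == 0)
      return table_ref[i].vendor_name;
  return NULL;
}
```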

In theory, it would be possible to introduce an option to gperf that
makes it generate ASCII- and EBCDIC-based tables in the same output
file, and let the compiler pick the right one at compile-time. But
this is a *lot* of work. Therefore, if you can get the same result
through compiler settings and #pragmas, that will be the way to go.

Bruno




* Re: z/OS, iconv, and gperf
  2020-01-19 21:59                                             ` Bruno Haible
@ 2020-01-19 22:32                                               ` Daniel Richard G.
  2020-01-20  0:13                                                 ` Bruno Haible
  0 siblings, 1 reply; 49+ messages in thread
From: Daniel Richard G. @ 2020-01-19 22:32 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

On Sun, 2020 Jan 19 16:59-05:00, Bruno Haible wrote:
> 
> The intended outcome is that a gperf-generated mapping function, say,
> for
>     FOO, "BAR"
> performs equivalently to
>     if (strcmp (arg, "FOO") == 0)
>       return "BAR";
> just faster. Can you find the suitable compiler settings and #pragmas
> - inside and outside the gperf-generated code - to make this happen?

But what good will that do, if the (ASCII-consuming) gperf code receives
e.g. the EBCDIC form of "ISO-8859-1"? "#pragma convert" only works at
compile time, not run time.

In order for a literal string in a user program to get passed in as
ASCII, the user program itself would also need to be compiled with this
#pragma surrounding the string. (I don't think that is a reasonable thing
to ask of user programs, and it doesn't address non-literal strings in
any event.)

There would need to be an additional step of explicit run-time EBCDIC-to-
ASCII conversion of the input in order for ASCII-based gperf code to
work on this platform. Is that feasible?

> In theory, it would be possible to introduce an option to gperf that
> makes it generate ASCII- and EBCDIC-based tables in the same output
> file, and let the compiler pick the right one at compile-time. But
> this is a *lot* of work. Therefore, if you can get the same result
> through compiler settings and #pragmas, that will be the way to go.

It's not the same result. Only the former would allow "char c = 0xC1" to
be recognized as a letter "A". The latter just makes 'A' == 0x41.
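
To make the distinction concrete, here is a small sketch. It writes the
two byte values mentioned above as numbers (0xC1 for EBCDIC 'A', 0x41
for ASCII 'A') instead of character literals, so it compiles the same
way everywhere. The function names are made up for illustration. A test
compiled against ASCII literals rejects EBCDIC input that arrives at run
time; it only matches after an explicit run-time translation step.

```c
/* Byte values from the discussion above, written as numbers so that the
   sketch does not depend on the compiler's execution character set.  */
enum { ASCII_A = 0x41, EBCDIC_A = 0xC1 };

/* What code compiled with ASCII literals effectively tests.  */
static int
is_letter_a_ascii_literal (unsigned char c)
{
  return c == ASCII_A;
}

/* Hypothetical run-time step: translate one EBCDIC byte to ASCII.
   Only the letter 'A' is handled here; a real table has 256 entries.  */
static unsigned char
ebcdic_to_ascii_demo (unsigned char c)
{
  return c == EBCDIC_A ? ASCII_A : c;
}
```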

--Daniel

-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.


* Re: z/OS, iconv, and gperf
  2020-01-19 22:32                                               ` Daniel Richard G.
@ 2020-01-20  0:13                                                 ` Bruno Haible
  2020-01-22  6:38                                                   ` Daniel Richard G.
  0 siblings, 1 reply; 49+ messages in thread
From: Bruno Haible @ 2020-01-20  0:13 UTC (permalink / raw)
  To: Daniel Richard G.; +Cc: bug-gnulib

Daniel Richard G. wrote:
> But what good will that do, if the (ASCII-consuming) gperf code receives
> e.g. the EBCDIC form of "ISO-8859-1"? "#pragma convert" only works at
> compile time, not run time.

OK, then we'll need
  a) for the short-term: in lib/iconv_open.c, apply an EBCDIC -> ASCII
     conversion to the 'from' and the 'to' strings. Can you implement that?
     And also a rule that removes the anti-EBCDIC guard from the gperf
     generated output (in modules/iconv_open).
  b) a feature request for the 'gperf' program, to generate two code
     bodies, one for ASCII and one for EBCDIC.

Bruno




* Re: z/OS, iconv, and gperf
  2020-01-20  0:13                                                 ` Bruno Haible
@ 2020-01-22  6:38                                                   ` Daniel Richard G.
  0 siblings, 0 replies; 49+ messages in thread
From: Daniel Richard G. @ 2020-01-22  6:38 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 2112 bytes --]

Hi Bruno,

On Sun, 2020 Jan 19 19:13-05:00, Bruno Haible wrote:
> 
> OK, then we'll need
>   a) for the short-term: in lib/iconv_open.c, apply an EBCDIC -> ASCII
>      conversion to the 'from' and the 'to' strings. Can you implement that?
>      And also a rule that removes the anti-EBCDIC guard from the gperf
>      generated output (in modules/iconv_open).

Please see the attached patch to iconv_open.c. I'll leave the makefile
rule to you, as that is less straightforward for me. The patch, plus a
disabled #error in iconv_open-zos.h, gets test-iconv to build and pass.

However, the following test failures are new to me:

    $ /tmp/testdir/gltests/test-btoc32-1.sh
    /tmp/testdir/gltests/test-btoc32.c:49: assertion 'btoc32 (c) == c' failed
    CEE5207E The signal SIGABRT was received.
    
    $ /tmp/testdir/gltests/test-mbrtoc32-1.sh
    /tmp/testdir/gltests/test-mbrtoc32.c:108: assertion 'wc == c' failed
    CEE5207E The signal SIGABRT was received.
    
    $ /tmp/testdir/gltests/test-mbrtoc32-5.sh 
    /tmp/testdir/gltests/test-mbrtoc32.c:115: assertion 'mbsinit (&state)' failed
    CEE5207E The signal SIGABRT was received.

I tested using Git 49c6f78c. Poking a bit into the
test-btoc32-1.sh failure, I saw that it occurred when btoc32(4) yielded
156, which seems consistent with an IBM-1047-to-ASCII mapping. (Per
Wikipedia, 0-3 is the same as ASCII, but 4 is a "SEL" character. And
btoc32(5) returns 9.)
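
For illustration, the two values above match the start of the usual
IBM-1047 to ISO-8859-1 table. The sketch below lists only the first
eight entries, and they are quoted from memory of the published code
page tables, so treat them as an assumption to be checked against IBM's
own tables. The names ibm1047_low and ibm1047_low_to_latin1 are
hypothetical.

```c
/* First eight entries of the usual IBM-1047 -> ISO-8859-1 mapping.
   Assumption: quoted from memory, not taken from this thread.  */
static const unsigned char ibm1047_low[8] = {
  0x00, 0x01, 0x02, 0x03, 0x9C, 0x09, 0x86, 0x7F
};

/* Map code points 0..7; return 0xFFFFFFFF for anything outside the
   part of the table reproduced here.  */
static unsigned int
ibm1047_low_to_latin1 (unsigned int c)
{
  return c < 8 ? ibm1047_low[c] : 0xFFFFFFFFu;
}
```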

>   b) a feature request for the 'gperf' program, to generate two code
>      bodies, one for ASCII and one for EBCDIC.

What about generating a translation table at compile/run time, that is
used if ASCII is unavailable? Something like

    xlate['A'] = 65;
    xlate['B'] = 66;
    ...
    xlate['Z'] = 90;

    ...

    c = xlate[c];

As I recall, there are EBCDIC variants with minor differences in the
positions of certain punctuation marks, and while they may or may not be
commonly used on z/OS, it would be desirable to remain robust against
that possibility.
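
A minimal sketch of that idea follows, filled in at run time from
character literals so that it follows whatever EBCDIC variant the
compiler uses. The names xlate_init and xlate_string are hypothetical,
and the character set covered (upper-case letters, digits, '-' and '_')
is an assumption about what encoding names contain after upper-casing.

```c
#include <string.h>

/* Hypothetical table: the index is a character in the execution
   character set, the value is its ASCII code (0 = no mapping).  */
static unsigned char xlate[256];

static void
xlate_init (void)
{
  static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  static const char digits[] = "0123456789";
  size_t i;

  memset (xlate, 0, sizeof xlate);
  for (i = 0; upper[i] != '\0'; i++)
    xlate[(unsigned char) upper[i]] = (unsigned char) (65 + i);
  for (i = 0; digits[i] != '\0'; i++)
    xlate[(unsigned char) digits[i]] = (unsigned char) (48 + i);
  xlate[(unsigned char) '-'] = 45;
  xlate[(unsigned char) '_'] = 95;
}

/* Translate S in place to ASCII.  Return 0 on success, or -1 at the
   first character that has no entry in the table.  */
static int
xlate_string (char *s)
{
  for (; *s != '\0'; s++)
    {
      unsigned char a = xlate[(unsigned char) *s];
      if (a == 0)
        return -1;
      *s = (char) a;
    }
  return 0;
}
```

On an ASCII system the table is the identity for the covered characters,
so the same code could be compiled everywhere.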


--Daniel


-- 
Daniel Richard G. || skunk@iSKUNK.ORG
My ASCII-art .sig got a bad case of Times New Roman.

[-- Attachment #2: zos-iconv-fix.patch.txt --]
[-- Type: text/plain, Size: 2198 bytes --]

diff --git a/lib/iconv_open.c b/lib/iconv_open.c
index 989bd9d57..72276b1c3 100644
--- a/lib/iconv_open.c
+++ b/lib/iconv_open.c
@@ -38,10 +38,25 @@
 #define ICONV_FLAVOR_SOLARIS "iconv_open-solaris.h"
 #define ICONV_FLAVOR_ZOS "iconv_open-zos.h"
 
+#if defined __MVS__ && defined __IBMC__ && 'A' != 0x41
+/* On IBM z/OS, the encoding names are in EBCDIC, but the gperf source still
+   expects and returns ASCII.  We need to convert between the two. */
+# define EBCDIC_CONVERT
+#endif
+
+#ifdef EBCDIC_CONVERT
+/* Ensure that the gperf source is compiled as ASCII.  */
+# pragma convert("ISO8859-1")
+#endif
+
 #ifdef ICONV_FLAVOR
 # include ICONV_FLAVOR
 #endif
 
+#ifdef EBCDIC_CONVERT
+# pragma convert(pop)
+#endif
+
 iconv_t
 rpl_iconv_open (const char *tocode, const char *fromcode)
 #undef iconv_open
@@ -50,6 +65,10 @@ rpl_iconv_open (const char *tocode, const char *fromcode)
   char tocode_upper[32];
   char *fromcode_upper_end;
   char *tocode_upper_end;
+#ifdef EBCDIC_CONVERT
+  char fromcode_ae[32];
+  char tocode_ae[32];
+#endif
 
 #if REPLACE_ICONV_UTF
   /* Special handling of conversion between UTF-8 and UTF-{16,32}{BE,LE}.
@@ -150,6 +169,15 @@ rpl_iconv_open (const char *tocode, const char *fromcode)
     tocode_upper_end = q;
   }
 
+#ifdef EBCDIC_CONVERT
+  /* Convert the encodings from EBCDIC to ASCII, as gperf expects the latter.  */
+  if (__etoa (fromcode_upper) < 0 || __etoa (tocode_upper) < 0)
+    {
+      errno = EINVAL;
+      return (iconv_t)(-1);
+    }
+#endif
+
 #ifdef ICONV_FLAVOR
   /* Apply the mappings.  */
   {
@@ -169,5 +197,20 @@ rpl_iconv_open (const char *tocode, const char *fromcode)
   tocode = tocode_upper;
 #endif
 
+#ifdef EBCDIC_CONVERT
+  /* Convert the encodings back to EBCDIC for iconv_open().  */
+  strncpy (fromcode_ae, fromcode, sizeof(fromcode_ae));
+  strncpy (tocode_ae, tocode, sizeof(tocode_ae));
+  fromcode_ae[SIZEOF (fromcode_ae) - 1] = '\0';
+  tocode_ae[SIZEOF (tocode_ae) - 1] = '\0';
+  if (__atoe (fromcode_ae) < 0 || __atoe (tocode_ae) < 0)
+    {
+      errno = EINVAL;
+      return (iconv_t)(-1);
+    }
+  fromcode = fromcode_ae;
+  tocode = tocode_ae;
+#endif
+
   return iconv_open (tocode, fromcode);
 }



Thread overview: 49+ messages
2015-09-22  2:28 [PATCH] IBM z/OS + EBCDIC support Daniel Richard G.
2015-09-22 15:23 ` Eric Blake
2015-09-22 19:27   ` Daniel Richard G.
2015-09-22 20:00     ` Paul Eggert
2015-09-22 20:08       ` Eric Blake
2015-09-22 20:51         ` Daniel Richard G.
2015-09-22 19:32 ` Paul Eggert
2015-09-22 19:46   ` Paul Eggert
2015-09-22 20:37   ` Daniel Richard G.
2015-09-22 22:03     ` Paul Eggert
2015-09-22 23:44       ` Daniel Richard G.
2015-09-23  2:02         ` Paul Eggert
2015-09-23  6:58           ` Daniel Richard G.
2015-09-23 19:05             ` Paul Eggert
2015-09-23 19:29             ` Paul Eggert
2015-09-23 21:57               ` Daniel Richard G.
2015-09-25  7:29                 ` Paul Eggert
2015-09-26  0:25                   ` Daniel Richard G.
2015-09-26  2:49                     ` Paul Eggert
2015-09-26  4:39                       ` Daniel Richard G.
2015-09-26 16:08                         ` Ben Pfaff
2015-09-27  6:31                           ` Daniel Richard G.
2015-09-27  6:59                             ` Paul Eggert
2015-09-28  2:09                               ` Daniel Richard G.
2015-10-15  4:49                               ` Daniel Richard G.
2016-08-18  0:47                                 ` Paul Eggert
2016-08-18  8:24                                   ` Daniel Richard G.
2016-08-18  8:53                                     ` Paul Eggert
2016-08-19  8:20                                       ` Daniel Richard G.
2016-08-19 11:03                                         ` Bruno Haible
2016-08-19 19:28                                         ` Paul Eggert
2016-08-19 20:38                                           ` Daniel Richard G.
2019-12-19  4:57                                 ` z/OS configure triple Bruno Haible
2019-12-20  0:22                                   ` Daniel Richard G.
2019-12-20  6:29                                     ` Bruno Haible
2019-12-19  5:16                                 ` z/OS, iconv, and charset aliases Bruno Haible
2019-12-19  5:21                                   ` Bruno Haible
2019-12-20  4:38                                   ` Daniel Richard G.
2019-12-20  8:19                                     ` Bruno Haible
2019-12-20 18:23                                       ` Daniel Richard G.
2019-12-21  5:49                                         ` z/OS, iconv, and gperf Bruno Haible
2020-01-09  5:48                                           ` Daniel Richard G.
2020-01-19 21:52                                             ` Bruno Haible
2020-01-19 21:59                                             ` Bruno Haible
2020-01-19 22:32                                               ` Daniel Richard G.
2020-01-20  0:13                                                 ` Bruno Haible
2020-01-22  6:38                                                   ` Daniel Richard G.
2015-09-22 19:50 ` [PATCH] IBM z/OS + EBCDIC support Paul Eggert
2015-09-22 20:47   ` Daniel Richard G.
