bug-gnulib@gnu.org mirror (unofficial)
 help / color / mirror / Atom feed
* iconv: document not-fixed bugs, using POSIX terminology
@ 2020-12-20  0:35 Noah Misch
  2021-01-03 19:48 ` Bruno Haible
  0 siblings, 1 reply; 4+ messages in thread
From: Noah Misch @ 2020-12-20  0:35 UTC (permalink / raw
  To: bug-gnulib

iconv.m4 clears HAVE_ICONV if it detects iconv() bugs.  Docs list those bugs
under "Portability problems fixed by Gnulib:".  If changing HAVE_ICONV were
enough to qualify as a fix, then "This function is missing on some platforms"
would also belong under that heading.  I'm proposing to move those bugs under
"not fixed by Gnulib".  It might have been better to add a third section, like
"Portability problems reported by Gnulib preprocessor macros:".

Two of those iconv() bugs involve what POSIX calls "non-identical conversion".
(GNU libc calls it "non-reversible conversion".)  The gnulib docs and code
comments use terms "failures" and "conversion errors", but these bugs don't
entail the distinct POSIX concept of "error" or failure.  Hence, I propose
standardizing on the term "non-identical conversion".  (AIX places 0x1A
https://en.wikipedia.org/wiki/Substitute_character in the output buffer.
Solaris uses an underscore.)

The following patch has those changes.  It also updates modules/iconv_open to
check HAVE_ICONV where modules/iconv checks it.  (Neither module provides
functions to !HAVE_ICONV configurations.)

Incidentally, the AIX iconv() buffer overrun was gone no later than AIX
7100-03-02-1412.  The non-POSIX-conforming return value remains as of AIX
7100-05-06-2028 and AIX 7200-04-01-1939.  Moreover, IBM documents it:
https://www.ibm.com/support/knowledgecenter/ssw_aix_72/i_bostechref/iconv.html

Thanks,
nm


From 3b99e619bec2f9b0222da05a2bb58ca3e83a258e Mon Sep 17 00:00:00 2001
From: Noah Misch <nmisch@google.com>
Date: Sat, 19 Dec 2020 16:17:45 -0800
Subject: [PATCH] iconv, iconv_open: document not-fixed bugs, using POSIX
 terminology.

* doc/posix-functions/iconv.texi: Move the AIX and Solaris bugs to "not
fixed", because the module only declines to set HAVE_ICONV.  Use term
"non-identical conversions" for events POSIX names that way.
* doc/posix-functions/iconv_open.texi: Move HP-UX EUC-JP bug to "not
fixed", because the module only declines to set HAVE_ICONV.
* m4/iconv.m4 (AM_ICONV_LINK): Use term "non-identical conversions" for
events POSIX names that way.
* modules/iconv_open (Include): Check HAVE_ICONV, like modules/iconv.
---
 ChangeLog                           | 12 ++++++++++++
 doc/posix-functions/iconv.texi      | 17 +++++++++--------
 doc/posix-functions/iconv_open.texi |  6 +++---
 m4/iconv.m4                         |  8 ++++----
 modules/iconv_open                  |  4 +++-
 5 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 0d8c2ee..bed3e3b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2020-12-19  Noah Misch  <nmisch@google.com>
+
+	iconv, iconv_open: document not-fixed bugs, using POSIX terminology.
+	* doc/posix-functions/iconv.texi: Move the AIX and Solaris bugs to "not
+	fixed", because the module only declines to set HAVE_ICONV.  Use term
+	"non-identical conversions" for events POSIX names that way.
+	* doc/posix-functions/iconv_open.texi: Move HP-UX EUC-JP bug to "not
+	fixed", because the module only declines to set HAVE_ICONV.
+	* m4/iconv.m4 (AM_ICONV_LINK): Use term "non-identical conversions" for
+	events POSIX names that way.
+	* modules/iconv_open (Include): Check HAVE_ICONV, like modules/iconv.
+
 2020-12-19  Bruno Haible  <bruno@clisp.org>
 
 	free-posix: Assume future POSIX compliance only on OpenBSD and Solaris.
diff --git a/doc/posix-functions/iconv.texi b/doc/posix-functions/iconv.texi
index 01e7fd1..895afb3 100644
--- a/doc/posix-functions/iconv.texi
+++ b/doc/posix-functions/iconv.texi
@@ -10,12 +10,6 @@ Portability problems fixed by Gnulib:
 @itemize
 @item
 GNU libiconv is not found if installed in @file{$PREFIX/lib}.
-@item
-Failures are not distinguishable from successful returns on some platforms:
-AIX 5.1, Solaris 10.
-@item
-A buffer overrun can occur on some platforms:
-AIX 6.1..7.1.
 @end itemize
 
 Portability problems not fixed by Gnulib:
@@ -32,9 +26,16 @@ cannot be converted to the output character set, glibc's and GNU libiconv's
 @code{iconv} stop the conversion.  Some other implementations put an
 implementation-defined character into the output buffer.
 Gnulib provides higher-level facilities @code{striconv} and @code{striconveh}
-(wrappers around @code{iconv}) that deal with conversion errors in a platform
-independent way.
+(wrappers around @code{iconv}) that deal with non-identical conversions in a
+platform independent way.
 @item
 This function returns a positive return value, instead of zero, when
 converting from ISO-8859-1 to UTF-8 on HP-UX 11.
+@item
+This function returns zero, instead of a positive return value, for
+non-identical conversions on some platforms:
+AIX 5.1, Solaris 10.
+@item
+A buffer overrun can occur on some platforms:
+AIX 6.1..7.1.
 @end itemize
diff --git a/doc/posix-functions/iconv_open.texi b/doc/posix-functions/iconv_open.texi
index f46cf98..3af3e3b 100644
--- a/doc/posix-functions/iconv_open.texi
+++ b/doc/posix-functions/iconv_open.texi
@@ -10,9 +10,6 @@ Portability problems fixed by either Gnulib module @code{iconv} or @code{iconv_o
 @itemize
 @item
 GNU libiconv is not found if installed in @file{$PREFIX/lib}.
-@item
-No converter from EUC-JP to UTF-8 is provided on some platforms:
-HP-UX 11.
 @end itemize
 
 Portability problems fixed by Gnulib module @code{iconv_open}:
@@ -45,4 +42,7 @@ facility @code{striconveh} (a wrapper around @code{iconv}) that deals with
 this problem.
 @item
 The set of supported encodings and conversions is system dependent.
+@item
+No converter from EUC-JP to UTF-8 is provided on some platforms:
+HP-UX 11.
 @end itemize
diff --git a/m4/iconv.m4 b/m4/iconv.m4
index f8a536b..d205a62 100644
--- a/m4/iconv.m4
+++ b/m4/iconv.m4
@@ -92,8 +92,8 @@ AC_DEFUN([AM_ICONV_LINK],
 #endif
              ]],
              [[int result = 0;
-  /* Test against AIX 5.1 bug: Failures are not distinguishable from successful
-     returns.  */
+  /* Test against AIX 5.1 bug: Zero return despite non-identical
+     (non-reversible) conversion of one character.  */
   {
     iconv_t cd_utf8_to_88591 = iconv_open ("ISO8859-1", "UTF-8");
     if (cd_utf8_to_88591 != (iconv_t)(-1))
@@ -112,8 +112,8 @@ AC_DEFUN([AM_ICONV_LINK],
         iconv_close (cd_utf8_to_88591);
       }
   }
-  /* Test against Solaris 10 bug: Failures are not distinguishable from
-     successful returns.  */
+  /* Test against Solaris 10 bug: Zero return despite non-identical
+     (non-reversible) conversion of one character.  */
   {
     iconv_t cd_ascii_to_88591 = iconv_open ("ISO8859-1", "646");
     if (cd_ascii_to_88591 != (iconv_t)(-1))
diff --git a/modules/iconv_open b/modules/iconv_open
index 3486901..dba4592 100644
--- a/modules/iconv_open
+++ b/modules/iconv_open
@@ -58,7 +58,9 @@ MAINTAINERCLEANFILES += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h ico
 EXTRA_DIST           += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h iconv_open-zos.h
 
 Include:
-<iconv.h>
+#if HAVE_ICONV
+# include <iconv.h>
+#endif
 
 Link:
 $(LTLIBICONV) when linking with libtool, $(LIBICONV) otherwise
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: iconv: document not-fixed bugs, using POSIX terminology
  2020-12-20  0:35 iconv: document not-fixed bugs, using POSIX terminology Noah Misch
@ 2021-01-03 19:48 ` Bruno Haible
  2021-01-04  0:24   ` Noah Misch
  0 siblings, 1 reply; 4+ messages in thread
From: Bruno Haible @ 2021-01-03 19:48 UTC (permalink / raw
  To: bug-gnulib; +Cc: Noah Misch

[-- Attachment #1: Type: text/plain, Size: 2162 bytes --]

Hi Noah,

Thanks for the patch and explanations from
<https://lists.gnu.org/archive/html/bug-gnulib/2020-12/msg00172.html>.

> iconv.m4 clears HAVE_ICONV if it detects iconv() bugs.  Docs list those bugs
> under "Portability problems fixed by Gnulib:".  If changing HAVE_ICONV were
> enough to qualify as a fix, then "This function is missing on some platforms"
> would also belong under that heading.  I'm proposing to move those bugs under
> "not fixed by Gnulib".

Good point. Yes, it doesn't belong under "fixed by Gnulib". But "not fixed by
Gnulib" is also not the right category. I'll list these under "handled by
Gnulib", with an extra explanation.

> Two of those iconv() bugs involve what POSIX calls "non-identical conversion".
> (GNU libc calls it "non-reversible conversion".)  The gnulib docs and code
> comments use terms "failures" and "conversion errors", but these bugs don't
> entail the distinct POSIX concept of "error" or failure.  Hence, I propose
> standardizing on the term "non-identical conversion".

"non-identical conversion" is a nonsensical term. Therefore it's better not
to use it. (Two objects can be identical if they are in the same mathematical
set. But when we use iconv, we are doing a mapping between the minimal byte
sequences of the input character set and the minimal byte sequences of the
destination character set; these are two different sets. If anyone uses
the term "identical" here, it would mean that the sets have an intersection,
and that the corresponding byte sequences are the same (e.g. like ISO-8859-1
and UTF-8 have, as intersection, the set of 1-byte sequences with values >= 0,
<= 0x7F).)

> The following patch has those changes.  It also updates modules/iconv_open to
> check HAVE_ICONV where modules/iconv checks it.

Good point, yes.

> Incidentally, the AIX iconv() buffer overrun was gone no later than AIX
> 7100-03-02-1412.  The non-POSIX-conforming return value remains as of AIX
> 7100-05-06-2028 and AIX 7200-04-01-1939.  Moreover, IBM documents it:
> https://www.ibm.com/support/knowledgecenter/ssw_aix_72/i_bostechref/iconv.html

Thanks for the info. Reflected in the patches below.

Bruno

[-- Attachment #2: 0001-iconv_open-Fix-module-description.patch --]
[-- Type: text/x-patch, Size: 1293 bytes --]

From afae962fedb0e0fb10c855435922a33f7a20c597 Mon Sep 17 00:00:00 2001
From: Noah Misch <noah@leadboat.com>
Date: Sun, 3 Jan 2021 19:17:52 +0100
Subject: [PATCH 1/3] iconv_open: Fix module description.

* modules/iconv_open (Include): Check HAVE_ICONV, like modules/iconv.
---
 ChangeLog          | 5 +++++
 modules/iconv_open | 4 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 82e4a10..78fa7be 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2021-01-03  Noah Misch  <noah@leadboat.com>
+
+	iconv_open: Fix module description.
+	* modules/iconv_open (Include): Check HAVE_ICONV, like modules/iconv.
+
 2021-01-03  Bruno Haible  <bruno@clisp.org>
 
 	stddef: Override wrong max_align_t on AIX 7 with xlc in 64-bit mode.
diff --git a/modules/iconv_open b/modules/iconv_open
index 3486901..dba4592 100644
--- a/modules/iconv_open
+++ b/modules/iconv_open
@@ -58,7 +58,9 @@ MAINTAINERCLEANFILES += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h ico
 EXTRA_DIST           += iconv_open-aix.h iconv_open-hpux.h iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h iconv_open-zos.h
 
 Include:
-<iconv.h>
+#if HAVE_ICONV
+# include <iconv.h>
+#endif
 
 Link:
 $(LTLIBICONV) when linking with libtool, $(LIBICONV) otherwise
-- 
2.7.4


[-- Attachment #3: 0002-iconv-h-Fix-module-description.patch --]
[-- Type: text/x-patch, Size: 975 bytes --]

From 00932fa5644e4a0da2214fe89bdc14663444921a Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@clisp.org>
Date: Sun, 3 Jan 2021 19:19:01 +0100
Subject: [PATCH 2/3] iconv-h: Fix module description.

* modules/iconv-h (Include): Check HAVE_ICONV_H.
---
 ChangeLog       | 5 +++++
 modules/iconv-h | 4 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 78fa7be..4b9f083 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2021-01-03  Bruno Haible  <bruno@clisp.org>
+
+	iconv-h: Fix module description.
+	* modules/iconv-h (Include): Check HAVE_ICONV_H.
+
 2021-01-03  Noah Misch  <noah@leadboat.com>
 
 	iconv_open: Fix module description.
diff --git a/modules/iconv-h b/modules/iconv-h
index 08371a4..c5f353a 100644
--- a/modules/iconv-h
+++ b/modules/iconv-h
@@ -46,7 +46,9 @@ endif
 MOSTLYCLEANFILES += iconv.h iconv.h-t
 
 Include:
-<iconv.h>
+#if HAVE_ICONV_H
+# include <iconv.h>
+#endif
 
 License:
 LGPLv2+
-- 
2.7.4


[-- Attachment #4: 0003-iconv-iconv_open-Improve-documentation.patch --]
[-- Type: text/x-patch, Size: 3754 bytes --]

From 5886f0967c83e27bf4cccfef13a1528195e3651a Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@clisp.org>
Date: Sun, 3 Jan 2021 20:46:04 +0100
Subject: [PATCH 3/3] iconv, iconv_open: Improve documentation.

Reported by Noah Misch <noah@leadboat.com> in
<https://lists.gnu.org/archive/html/bug-gnulib/2020-12/msg00172.html>.

* doc/posix-functions/iconv_open.texi: Add new paragraph "Portability
problems handled by Gnulib".
* doc/posix-functions/iconv.texi: Likewise. Update info about AIX.
* m4/iconv.m4 (AM_ICONV_LINK): Improve comments.
---
 ChangeLog                           | 10 ++++++++++
 doc/posix-functions/iconv.texi      |  8 +++++++-
 doc/posix-functions/iconv_open.texi |  6 ++++++
 m4/iconv.m4                         |  7 ++++---
 4 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4b9f083..6c042f5 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,15 @@
 2021-01-03  Bruno Haible  <bruno@clisp.org>
 
+	iconv, iconv_open: Improve documentation.
+	Reported by Noah Misch <noah@leadboat.com> in
+	<https://lists.gnu.org/archive/html/bug-gnulib/2020-12/msg00172.html>.
+	* doc/posix-functions/iconv_open.texi: Add new paragraph "Portability
+	problems handled by Gnulib".
+	* doc/posix-functions/iconv.texi: Likewise. Update info about AIX.
+	* m4/iconv.m4 (AM_ICONV_LINK): Improve comments.
+
+2021-01-03  Bruno Haible  <bruno@clisp.org>
+
 	iconv-h: Fix module description.
 	* modules/iconv-h (Include): Check HAVE_ICONV_H.
 
diff --git a/doc/posix-functions/iconv.texi b/doc/posix-functions/iconv.texi
index 01e7fd1..4e1c182 100644
--- a/doc/posix-functions/iconv.texi
+++ b/doc/posix-functions/iconv.texi
@@ -10,9 +10,15 @@ Portability problems fixed by Gnulib:
 @itemize
 @item
 GNU libiconv is not found if installed in @file{$PREFIX/lib}.
+@end itemize
+
+Portability problems handled by Gnulib
+(in the sense that @code{HAVE_ICONV} does not get defined if the system's
+@code{iconv} function has this problem):
+@itemize
 @item
 Failures are not distinguishable from successful returns on some platforms:
-AIX 5.1, Solaris 10.
+AIX 5.1..7.2, Solaris 10.
 @item
 A buffer overrun can occur on some platforms:
 AIX 6.1..7.1.
diff --git a/doc/posix-functions/iconv_open.texi b/doc/posix-functions/iconv_open.texi
index f46cf98..98f22ae 100644
--- a/doc/posix-functions/iconv_open.texi
+++ b/doc/posix-functions/iconv_open.texi
@@ -10,6 +10,12 @@ Portability problems fixed by either Gnulib module @code{iconv} or @code{iconv_o
 @itemize
 @item
 GNU libiconv is not found if installed in @file{$PREFIX/lib}.
+@end itemize
+
+Portability problems handled by either Gnulib module @code{iconv} or @code{iconv_open}
+(in the sense that @code{HAVE_ICONV} does not get defined if the system's
+@code{iconv_open} function has this problem):
+@itemize
 @item
 No converter from EUC-JP to UTF-8 is provided on some platforms:
 HP-UX 11.
diff --git a/m4/iconv.m4 b/m4/iconv.m4
index 04e3bad..d0e61de 100644
--- a/m4/iconv.m4
+++ b/m4/iconv.m4
@@ -1,4 +1,4 @@
-# iconv.m4 serial 23
+# iconv.m4 serial 24
 dnl Copyright (C) 2000-2002, 2007-2014, 2016-2021 Free Software Foundation,
 dnl Inc.
 dnl This file is free software; the Free Software Foundation
@@ -92,8 +92,9 @@ AC_DEFUN([AM_ICONV_LINK],
 #endif
              ]],
              [[int result = 0;
-  /* Test against AIX 5.1 bug: Failures are not distinguishable from successful
-     returns.  */
+  /* Test against AIX 5.1...7.2 bug: Failures are not distinguishable from
+     successful returns.  This is even documented in
+     <https://www.ibm.com/support/knowledgecenter/ssw_aix_72/i_bostechref/iconv.html> */
   {
     iconv_t cd_utf8_to_88591 = iconv_open ("ISO8859-1", "UTF-8");
     if (cd_utf8_to_88591 != (iconv_t)(-1))
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: iconv: document not-fixed bugs, using POSIX terminology
  2021-01-03 19:48 ` Bruno Haible
@ 2021-01-04  0:24   ` Noah Misch
  2021-01-04  0:57     ` Bruno Haible
  0 siblings, 1 reply; 4+ messages in thread
From: Noah Misch @ 2021-01-04  0:24 UTC (permalink / raw
  To: Bruno Haible; +Cc: bug-gnulib

On Sun, Jan 03, 2021 at 08:48:20PM +0100, Bruno Haible wrote:
> > iconv.m4 clears HAVE_ICONV if it detects iconv() bugs.  Docs list those bugs
> > under "Portability problems fixed by Gnulib:".  If changing HAVE_ICONV were
> > enough to qualify as a fix, then "This function is missing on some platforms"
> > would also belong under that heading.  I'm proposing to move those bugs under
> > "not fixed by Gnulib".
> 
> Good point. Yes, it doesn't belong under "fixed by Gnulib". But "not fixed by
> Gnulib" is also not the right category. I'll list these under "handled by
> Gnulib", with an extra explanation.

I like having the separate section.  Since you're doing that, "This function
is missing on some platforms:" also belongs under "handled by", not under "not
fixed".

> > Two of those iconv() bugs involve what POSIX calls "non-identical conversion".
> > (GNU libc calls it "non-reversible conversion".)  The gnulib docs and code
> > comments use terms "failures" and "conversion errors", but these bugs don't
> > entail the distinct POSIX concept of "error" or failure.  Hence, I propose
> > standardizing on the term "non-identical conversion".
> 
> "non-identical conversion" is a nonsensical term. Therefore it's better not
> to use it. (Two objects can be identical if they are in the same mathematical
> set. But when we use iconv, we are doing a mapping between the minimal byte
> sequences of the input character set and the minimal byte sequences of the
> destination character set; these are two different sets. If anyone uses
> the term "identical" here, it would mean that the sets have an intersection,
> and that the corresponding byte sequences are the same (e.g. like ISO-8859-1
> and UTF-8 have, as intersection, the set of 1-byte sequences with values >= 0,
> <= 0x7F).)

If https://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html used
"identical" to mean "same byte sequence", it would be requiring an ISO-8859-1 =>
UTF-8 conversion to increment the iconv() return value upon converting 0xA1 to
0xC2 0xA1.  Most implementations don't do that.  (HP-UX does, and Gnulib calls
it a bug.)

https://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html also
defines what it means to "fail" or experience an "error", and it classifies
"non-identical conversion" as neither of those things.  Using a term unattested
in POSIX, like "non-reversible conversion", would solve that problem.

> --- a/modules/iconv-h
> +++ b/modules/iconv-h
> @@ -46,7 +46,9 @@ endif
>  MOSTLYCLEANFILES += iconv.h iconv.h-t
>  
>  Include:
> -<iconv.h>
> +#if HAVE_ICONV_H
> +# include <iconv.h>
> +#endif

Doesn't this module cause the header to exist?


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: iconv: document not-fixed bugs, using POSIX terminology
  2021-01-04  0:24   ` Noah Misch
@ 2021-01-04  0:57     ` Bruno Haible
  0 siblings, 0 replies; 4+ messages in thread
From: Bruno Haible @ 2021-01-04  0:57 UTC (permalink / raw
  To: Noah Misch; +Cc: bug-gnulib

Noah Misch wrote:
> If https://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html used
> "identical" to mean "same byte sequence", it would be requiring an ISO-8859-1 =>
> UTF-8 conversion to increment the iconv() return value upon converting 0xA1 to
> 0xC2 0xA1.  Most implementations don't do that.

Yes. This proves that when this part of POSIX says "identical", they mean
"equivalent".

It's nonsense if a standard attempts to redefine terms, that are defined with a
certain meaning across mathematics and computer science, in a different way.

> > --- a/modules/iconv-h
> > +++ b/modules/iconv-h
> > @@ -46,7 +46,9 @@ endif
> >  MOSTLYCLEANFILES += iconv.h iconv.h-t
> >  
> >  Include:
> > -<iconv.h>
> > +#if HAVE_ICONV_H
> > +# include <iconv.h>
> > +#endif
> 
> Doesn't this module cause the header to exist?

No. I checked the code.

Bruno



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-01-04  0:58 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-12-20  0:35 iconv: document not-fixed bugs, using POSIX terminology Noah Misch
2021-01-03 19:48 ` Bruno Haible
2021-01-04  0:24   ` Noah Misch
2021-01-04  0:57     ` Bruno Haible

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).