* gnulib does not always detect need for iconv() hack on musl
@ 2021-10-17 14:14 Sergei Trofimovich
2021-10-17 17:18 ` Bruno Haible
0 siblings, 1 reply; 6+ messages in thread
From: Sergei Trofimovich @ 2021-10-17 14:14 UTC (permalink / raw)
To: bug-gnulib; +Cc: Bruno Haible
Hi gnulib! The problem:
The following fails bison-3.8.2 tests:
$ ./configure && make && make check
The following succeeds:
$ ./configure --host=x86_64-unknown-linux-musl && make && make check
The failure happens due to unexpected '*' output in report logs instead
of '%empty' on 'ASCII' locales.
These unexpected '*' pop back again because gnulib relies on '--host='
parameter for './configure' to detect musl target (for lack of better
signal?):
https://git.savannah.gnu.org/cgit/gnulib.git/tree/m4/musl.m4#n16
case "$host_os" in
*-musl*) AC_DEFINE([MUSL_LIBC], [1], [Define to 1 on musl libc.]) ;;
https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/unicodeio.c#n151
/* FreeBSD iconv(), NetBSD iconv(), and Solaris 11 iconv() insert
a '?' if they cannot convert. */
# if !defined _LIBICONV_VERSION
|| (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
# endif
/* musl libc iconv() inserts a '*' if it cannot convert. */
# if !defined _LIBICONV_VERSION && MUSL_LIBC
|| (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
# endif
)
return failure (code, NULL, callback_arg);
What do you think of enabling the workaround regardless of MUSL_LIBC
define?
Or perhaps gnulib should perform runtime testing to detect the need for
a hack? Here is how musl mangles symbols:
https://git.musl-libc.org/cgit/musl/tree/src/locale/iconv.c#n545
case US_ASCII:
if (c > 0x7f) subst: x++, c='*';
The patch below implements an unconditional workaround.
Thank you!
--- a/lib/unicodeio.c
+++ b/lib/unicodeio.c
@@ -148,7 +148,7 @@ unicode_to_mb (unsigned int code,
|| (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
# endif
/* musl libc iconv() inserts a '*' if it cannot convert. */
-# if !defined _LIBICONV_VERSION && MUSL_LIBC
+# if !defined _LIBICONV_VERSION
|| (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
# endif
)
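For the runtime-testing idea mentioned above, here is a rough sketch (not gnulib code): convert one non-ASCII character to ASCII and look at what iconv() produces. The helper name probe_iconv_substitute and its return convention are made up for illustration; the check mirrors the "res > 0 && outptr - outbuf == 1" condition from unicodeio.c.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not part of gnulib.
   Returns the single byte iconv() substitutes for a character that has
   no ASCII equivalent (e.g. '*' or '?'), 0 if iconv() reports an error
   instead of substituting, or -1 if the probe could not be run or its
   result was inconclusive.  */
int probe_iconv_substitute(void)
{
    char in[] = "\xc3\xa9";          /* U+00E9 encoded as UTF-8 */
    char out[8];
    char *inptr = in;
    char *outptr = out;
    size_t inleft = sizeof in - 1;   /* exclude the terminating NUL */
    size_t outleft = sizeof out;

    assert(inleft == 2);             /* U+00E9 is two bytes in UTF-8 */

    iconv_t cd = iconv_open("ASCII", "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    size_t res = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    iconv_close(cd);

    if (res == (size_t)-1)
        return 0;                    /* conversion rejected, no hack needed */
    if (res > 0 && outptr - out == 1)
        return (unsigned char)out[0];
    return -1;
}
```

Going by the musl excerpt above, this should return '*' on musl, while an iconv() that rejects the character with an error gives 0. I have not measured what such a probe would cost compared to the compile-time check.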
* Re: gnulib does not always detect need for iconv() hack on musl
2021-10-17 14:14 gnulib does not always detect need for iconv() hack on musl Sergei Trofimovich
@ 2021-10-17 17:18 ` Bruno Haible
2021-10-17 18:13 ` Sergei Trofimovich
0 siblings, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-10-17 17:18 UTC (permalink / raw)
To: bug-gnulib, Sergei Trofimovich
Hello Sergei,
Sergei Trofimovich wrote:
> The following fails bison-3.8.2 tests:
> $ ./configure && make && make check
> The following succeeds:
> $ ./configure --host=x86_64-unknown-linux-musl && make && make check
>
> The failure happens due to unexpected '*' output in report logs instead
> of '%empty' on 'ASCII' locales.
>
> These unexpected '*' pop back again because gnulib relies on '--host='
> parameter for './configure' to detect musl target (for lack of better
> signal?):
>
> https://git.savannah.gnu.org/cgit/gnulib.git/tree/m4/musl.m4#n16
>
> case "$host_os" in
> *-musl*) AC_DEFINE([MUSL_LIBC], [1], [Define to 1 on musl libc.]) ;;
>
> https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/unicodeio.c#n151
>
> /* FreeBSD iconv(), NetBSD iconv(), and Solaris 11 iconv() insert
> a '?' if they cannot convert. */
> # if !defined _LIBICONV_VERSION
> || (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
> # endif
> /* musl libc iconv() inserts a '*' if it cannot convert. */
> # if !defined _LIBICONV_VERSION && MUSL_LIBC
> || (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
> # endif
> )
> return failure (code, NULL, callback_arg);
>
> What do you think of enabling the workaround regardless of MUSL_LIBC
> define?
The MUSL_LIBC symbol is supposed to be set on musl platforms; this is
what musl.m4 is for. The difference between your two invocations is that
in the first case, it used a $host triple inferred by config.guess,
while in the second case, it used the $host that you specified on the
command line.
When I try your two commands (just the configure step), the first one
prints
checking for host system type... x86_64-pc-linux-musl
while the second one prints
checking for host system type... x86_64-unknown-linux-musl
The next steps of the investigation are: In the first case,
- What did the "checking for host system type..." line look like?
- Which of the environment variables CC_FOR_BUILD, HOST_CC, CC,
CONFIG_SITE did you have defined, and to which values?
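To make the dependency on the triple concrete, here is a small C sketch of what the "case $host_os in *-musl*" test amounts to. It assumes the triple is split into cpu, vendor, and os fields, with os being everything after the second '-'; the function names host_os_of and host_is_musl are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the OS part of a cpu-vendor-os triple,
   i.e. everything after the second '-', or NULL if the triple has
   fewer than three fields.  */
const char *host_os_of(const char *host)
{
    assert(host != NULL);
    const char *p = strchr(host, '-');      /* end of the cpu field */
    if (p == NULL)
        return NULL;
    p = strchr(p + 1, '-');                 /* end of the vendor field */
    return p ? p + 1 : NULL;
}

/* Mirrors the shell pattern  *-musl*  applied to $host_os.  */
int host_is_musl(const char *host)
{
    const char *os = host_os_of(host);
    return os != NULL && fnmatch("*-musl*", os, 0) == 0;
}
```

Both x86_64-pc-linux-musl and x86_64-unknown-linux-musl satisfy this test, so the vendor field does not matter; a triple ending in linux-gnu does not.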
> Or perhaps gnulib should perform runtime testing to detect the need for
> a hack? Here is how musl mangles symbols:
>
> https://git.musl-libc.org/cgit/musl/tree/src/locale/iconv.c#n545
>
> case US_ASCII:
> if (c > 0x7f) subst: x++, c='*';
>
> The patch below implements an unconditional workaround.
Thanks for the suggestion. But we try to limit the performance implications
of hacks/workarounds needed for one platform (here: musl) on other platforms
(especially glibc platforms).
Bruno
* Re: gnulib does not always detect need for iconv() hack on musl
2021-10-17 17:18 ` Bruno Haible
@ 2021-10-17 18:13 ` Sergei Trofimovich
2021-10-17 19:27 ` Bruno Haible
0 siblings, 1 reply; 6+ messages in thread
From: Sergei Trofimovich @ 2021-10-17 18:13 UTC (permalink / raw)
To: Bruno Haible, bug-gnulib
On Sun, Oct 17, 2021 at 07:18:51PM +0200, Bruno Haible wrote:
> Hello Sergei,
>
> Sergei Trofimovich wrote:
> > The following fails bison-3.8.2 tests:
> > $ ./configure && make && make check
> > The following succeeds:
> > $ ./configure --host=x86_64-unknown-linux-musl && make && make check
> >
> > The failure happens due to unexpected '*' output in report logs instead
> > of '%empty' on 'ASCII' locales.
> >
> > These unexpected '*' pop back again because gnulib relies on '--host='
> > parameter for './configure' to detect musl target (for lack of better
> > signal?):
> >
> > https://git.savannah.gnu.org/cgit/gnulib.git/tree/m4/musl.m4#n16
> >
> > case "$host_os" in
> > *-musl*) AC_DEFINE([MUSL_LIBC], [1], [Define to 1 on musl libc.]) ;;
> >
> > https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/unicodeio.c#n151
> >
> > /* FreeBSD iconv(), NetBSD iconv(), and Solaris 11 iconv() insert
> > a '?' if they cannot convert. */
> > # if !defined _LIBICONV_VERSION
> > || (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
> > # endif
> > /* musl libc iconv() inserts a '*' if it cannot convert. */
> > # if !defined _LIBICONV_VERSION && MUSL_LIBC
> > || (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
> > # endif
> > )
> > return failure (code, NULL, callback_arg);
> >
> > What do you think of enabling the workaround regardless of MUSL_LIBC
> > define?
>
> The MUSL_LIBC symbol is supposed to be set on musl platforms; this is
> what musl.m4 is for. The difference between your two invocations is that
> in the first case, it used a $host triple inferred by config.guess,
> while in the second case, it used the $host that you specified on the
> command line.
>
> When I try your two commands (just the configure step), the first one
> prints
> checking for host system type... x86_64-pc-linux-musl
> while the second one prints
> checking for host system type... x86_64-unknown-linux-musl
>
> The next steps of the investigation are: In the first case,
> - What did the "checking for host system type..." line look like?
> - Which of the environment variables CC_FOR_BUILD, HOST_CC, CC,
> CONFIG_SITE did you have defined, and to which values?
Aha, 'config.guess' clearly detects wrong libc here:
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
I did not realize 'config.guess' has the code to detect libc but it
clearly does. I'll dig from there and complain elsewhere.
Thank you!
--
Sergei
* Re: gnulib does not always detect need for iconv() hack on musl
2021-10-17 18:13 ` Sergei Trofimovich
@ 2021-10-17 19:27 ` Bruno Haible
2021-10-18 0:27 ` Bruno Haible
0 siblings, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-10-17 19:27 UTC (permalink / raw)
To: bug-gnulib, Sergei Trofimovich
Sergei Trofimovich wrote:
> Aha, 'config.guess' clearly detects wrong libc here:
>
> checking build system type... x86_64-pc-linux-gnu
> checking host system type... x86_64-pc-linux-gnu
Yes, for a musl system, that's wrong.
The problem may come from your environment. Which of the environment
variables CC_FOR_BUILD, HOST_CC, CC, CONFIG_SITE did you have defined,
and to which values?
> I did not realize 'config.guess' has the code to detect libc but it
> clearly does. I'll dig from there and complain elsewhere.
The mailing list is https://lists.gnu.org/mailman/listinfo/config-patches .
The current code in config.guess is a heuristic (that has been working
on Alpine Linux up to 3.13), because the musl libc people refuse to have
their libc identify itself. [1]
Bruno
[1] https://wiki.musl-libc.org/faq.html#Q:-Why-is-there-no-%3Ccode%3E__MUSL__%3C/code%3E-macro?
* Re: gnulib does not always detect need for iconv() hack on musl
2021-10-17 19:27 ` Bruno Haible
@ 2021-10-18 0:27 ` Bruno Haible
2021-10-18 8:16 ` Sergei Trofimovich
0 siblings, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-10-18 0:27 UTC (permalink / raw)
To: Sergei Trofimovich; +Cc: bug-gnulib
> The current code in config.guess is a heuristic (that has been working
> on Alpine Linux up to 3.13)
It works also in Alpine Linux 3.14.2. Which distro are you using?
Bruno
* Re: gnulib does not always detect need for iconv() hack on musl
2021-10-18 0:27 ` Bruno Haible
@ 2021-10-18 8:16 ` Sergei Trofimovich
0 siblings, 0 replies; 6+ messages in thread
From: Sergei Trofimovich @ 2021-10-18 8:16 UTC (permalink / raw)
To: Bruno Haible; +Cc: bug-gnulib
On Mon, Oct 18, 2021 at 02:27:38AM +0200, Bruno Haible wrote:
> > The current code in config.guess is a heuristic (that has been working
> > on Alpine Linux up to 3.13)
>
> It works also in Alpine Linux 3.14.2. Which distro are you using?
I'm trying to use it on NixOS. I think I tracked it down to an infelicity
of the bootstrap environment.
For most packages config.guess returns the correct value:
$ nix develop -f. pkgsMusl.bison
$ unpackPhase
$ cd bison-3.7.6
$ ./build-aux/config.guess
x86_64-pc-linux-musl
But for packages that use the bootstrap toolchain the detection fails:
# I don't know a better way to get an environment with the bootstrap toolchain
$ nix develop /nix/store/iwlhpwbfmr6v5mh0g6iabl3161am5gdd-bison-3.8.2.drv
$ unpackPhase
$ cd bison-3.8.2
$ ./build-aux/config.guess
x86_64-pc-linux-gnu
When I compare the two, the difference is in the expansion of
#include <stdarg.h>
(exactly what 'config.guess' probes).
In the good case, 'stdarg.h' from musl is used:
$ echo '#include <stdarg.h>' | gcc -E - | unnix
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/<<NIX>>/musl-1.2.2-dev/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "<stdin>"
# 1 "/<<NIX>>/musl-1.2.2-dev/include/stdarg.h" 1 3 4
# 10 "/<<NIX>>/musl-1.2.2-dev/include/stdarg.h" 3 4
# 1 "/<<NIX>>/musl-1.2.2-dev/include/bits/alltypes.h" 1 3 4
# 326 "/<<NIX>>/musl-1.2.2-dev/include/bits/alltypes.h" 3 4
# 326 "/<<NIX>>/musl-1.2.2-dev/include/bits/alltypes.h" 3 4
typedef __builtin_va_list va_list;
# 11 "/<<NIX>>/musl-1.2.2-dev/include/stdarg.h" 2 3 4
# 2 "<stdin>" 2
In the bad case, gcc's own 'stdarg.h' is used:
$ echo '#include <stdarg.h>' | gcc -E - | unnix
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/<<NIX>>/bootstrap-tools/include-libc/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "<stdin>"
# 1 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 1 3
# 40 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 3
# 40 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 3
typedef __builtin_va_list __gnuc_va_list;
# 99 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 3
typedef __gnuc_va_list va_list;
# 1 "<stdin>" 2
I think it happens because NixOS's bootstrap toolchain uses a slightly
different include order:
Good case:
/<<NIX>>/gcc-10.3.0/include
/<<NIX>>/musl-1.2.2-dev/include
# ^ <<<- picked 'stdarg.h' from here, musl version
/<<NIX>>/gcc-10.3.0/lib/gcc/x86_64-unknown-linux-musl/10.3.0/include
/<<NIX>>/gcc-10.3.0/lib/gcc/x86_64-unknown-linux-musl/10.3.0/include-fixed
Bad case:
/<<NIX>>/bootstrap-tools/bin/../lib/gcc/x86_64-unknown-linux-musl/7.3.0/include
# ^ <<<- picked stdarg from here, gcc version
/<<NIX>>/bootstrap-tools/bin/../lib/gcc/../../include
/<<NIX>>/bootstrap-stage0-musl-bootstrap/include
/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include-fixed
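The effect of the ordering can be sketched in C: the first directory in the list that contains the header wins. This is only an illustration of the lookup rule, not how gcc is implemented; first_dir_with is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns the index of the first directory in
   dirs[0..n-1] that contains `name`, or -1 if none of them does.  */
int first_dir_with(const char *const dirs[], size_t n, const char *name)
{
    char path[4096];

    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        int len = snprintf(path, sizeof path, "%s/%s", dirs[i], name);
        if (len < 0 || (size_t)len >= sizeof path)
            continue;                 /* path too long, skip this entry */
        if (access(path, F_OK) == 0)
            return (int)i;            /* earlier entries shadow later ones */
    }
    return -1;
}
```

Called with the "Bad case" list and "stdarg.h", this would return 0, the gcc directory, because that directory comes before the musl include directory; with the "Good case" list the musl directory is reached first.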
Perhaps the "Bad case" is the more natural include order, as 'gcc' nowadays
tries hard to isolate standard headers from accidental namespace
pollution (like the '__DEFINED_va_list' define that config.guess searches for).
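For reference, the kind of probe involved can be sketched as a C function. This is my approximation of the heuristic, not the actual config.guess text: it only assumes that musl's 'stdarg.h' ends up defining '__DEFINED_va_list' and that glibc headers define '__GLIBC__'. The name guess_libc and the "unknown" fallback are made up for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

/* Hypothetical sketch of a preprocessor-based libc guess.
   uClibc is checked first because it also defines __GLIBC__.  */
const char *guess_libc(void)
{
    const char *libc;
#if defined(__UCLIBC__)
    libc = "uclibc";
#elif defined(__GLIBC__)
    libc = "gnu";
#elif defined(__DEFINED_va_list)
    libc = "musl";      /* only seen if musl's own stdarg.h was included */
#else
    libc = "unknown";   /* e.g. musl shadowed by gcc's stdarg.h */
#endif
    assert(strlen(libc) > 0);
    return libc;
}
```

With the bootstrap toolchain's include order, gcc's 'stdarg.h' is found first, '__DEFINED_va_list' is never defined, and a probe like this cannot tell that the libc is musl. Judging by the output above, config.guess ends up with 'gnu' in that situation.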
I'll bring it to NixOS first to find out what the intended order is here.
--
Sergei