bug-gnulib@gnu.org mirror (unofficial)
* gnulib does not always detect need for iconv() hack on musl
@ 2021-10-17 14:14 Sergei Trofimovich
  2021-10-17 17:18 ` Bruno Haible
  0 siblings, 1 reply; 6+ messages in thread
From: Sergei Trofimovich @ 2021-10-17 14:14 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Bruno Haible

Hi gnulib! The problem:

The following fails bison-3.8.2 tests:
    $ ./configure && make && make check
The following succeeds:
    $ ./configure --host=x86_64-unknown-linux-musl && make && make check

The failure happens due to unexpected '*' output in report logs instead
of '%empty' on 'ASCII' locales.

These unexpected '*' characters show up because gnulib relies on the
'--host=' parameter of './configure' to detect a musl target (for lack
of a better signal?):

  https://git.savannah.gnu.org/cgit/gnulib.git/tree/m4/musl.m4#n16

    case "$host_os" in
      *-musl*) AC_DEFINE([MUSL_LIBC], [1], [Define to 1 on musl libc.]) ;;

  https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/unicodeio.c#n151

    /* FreeBSD iconv(), NetBSD iconv(), and Solaris 11 iconv() insert
       a '?' if they cannot convert.  */
    # if !defined _LIBICONV_VERSION
              || (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
    # endif
      /* musl libc iconv() inserts a '*' if it cannot convert.  */
    # if !defined _LIBICONV_VERSION && MUSL_LIBC
              || (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
    # endif
         )
        return failure (code, NULL, callback_arg);

What do you think of enabling the workaround regardless of MUSL_LIBC
define?

Or perhaps gnulib should test at run time whether the hack is needed?
Here is how musl substitutes characters it cannot convert:

  https://git.musl-libc.org/cgit/musl/tree/src/locale/iconv.c#n545

    case US_ASCII:
        if (c > 0x7f) subst: x++, c='*';

The patch below implements the unconditional workaround.

Thank you!

--- a/lib/unicodeio.c
+++ b/lib/unicodeio.c
@@ -148,7 +148,7 @@ unicode_to_mb (unsigned int code,
           || (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
 # endif
           /* musl libc iconv() inserts a '*' if it cannot convert.  */
-# if !defined _LIBICONV_VERSION && MUSL_LIBC
+# if !defined _LIBICONV_VERSION
           || (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
 # endif
          )
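
The run-time testing alternative mentioned above could be approximated
with the iconv command-line tool. This is only a rough sketch: the
function names are hypothetical, it probes the 'iconv' utility rather
than the iconv() call inside unicodeio.c, and it assumes the utility
behaves like the library it ships with.

```shell
# Hypothetical helper: classify what a converter did with one character
# it cannot represent. $1 is the exit status, $2 is the output it wrote.
classify_substitution() {
  status=$1
  output=$2
  if [ "$status" -ne 0 ]; then
    echo error            # conversion was rejected outright
  elif [ "$output" = '?' ]; then
    echo question-mark    # '?' substitution (the FreeBSD/NetBSD/Solaris case)
  elif [ "$output" = '*' ]; then
    echo asterisk         # '*' substitution (the musl case)
  else
    echo other
  fi
}

# Hypothetical probe: feed U+03B5 (UTF-8 bytes 0xCE 0xB5, written as
# octal escapes) to 'iconv' and classify the ASCII result.
probe_iconv() {
  output=$(printf '\316\265' | iconv -f UTF-8 -t ASCII 2>/dev/null)
  classify_substitution "$?" "$output"
}
```

Calling 'probe_iconv' prints one of the four labels; only 'asterisk'
would indicate that the '*' check is needed.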



* Re: gnulib does not always detect need for iconv() hack on musl
  2021-10-17 14:14 gnulib does not always detect need for iconv() hack on musl Sergei Trofimovich
@ 2021-10-17 17:18 ` Bruno Haible
  2021-10-17 18:13   ` Sergei Trofimovich
  0 siblings, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-10-17 17:18 UTC (permalink / raw)
  To: bug-gnulib, Sergei Trofimovich

Hello Sergei,

Sergei Trofimovich wrote:
> The following fails bison-3.8.2 tests:
>     $ ./configure && make && make check
> The following succeeds:
>     $ ./configure --host=x86_64-unknown-linux-musl && make && make check
> 
> The failure happens due to unexpected '*' output in report logs instead
> of '%empty' on 'ASCII' locales.
> 
> These unexpected '*' pop back again because gnulib relies on '--host='
> parameter for './configure' to detect musl target (for lack of better
> signal?):
> 
>   https://git.savannah.gnu.org/cgit/gnulib.git/tree/m4/musl.m4#n16
> 
>     case "$host_os" in
>       *-musl*) AC_DEFINE([MUSL_LIBC], [1], [Define to 1 on musl libc.]) ;;
> 
>   https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/unicodeio.c#n151
> 
>     /* FreeBSD iconv(), NetBSD iconv(), and Solaris 11 iconv() insert
>        a '?' if they cannot convert.  */
>     # if !defined _LIBICONV_VERSION
>               || (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
>     # endif
>       /* musl libc iconv() inserts a '*' if it cannot convert.  */
>     # if !defined _LIBICONV_VERSION && MUSL_LIBC
>               || (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
>     # endif
>          )
>         return failure (code, NULL, callback_arg);
> 
> What do you think of enabling the workaround regardless of MUSL_LIBC
> define?

The MUSL_LIBC symbol is supposed to be set on musl platforms; this is
what musl.m4 is for. The difference between your two invocations is that
in the first case, it used a $host triple inferred by config.guess,
while in the second case, it used the $host that you specified on the
command line.

When I try your two commands (just the configure step), the first one
prints
  checking for host system type... x86_64-pc-linux-musl
while the second one prints
  checking for host system type... x86_64-unknown-linux-musl
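
To illustrate why both of these triples enable MUSL_LIBC while
x86_64-pc-linux-gnu does not, here is a simplified sketch of the
musl.m4 pattern match. The function name is hypothetical, and the
splitting of the triple into cpu, vendor and OS fields is a
simplification of what configure really does.

```shell
# Hypothetical helper: report whether a canonical host triple would
# match the '*-musl*' pattern that musl.m4 applies to $host_os.
is_musl_host() {
  host=$1
  host_os=${host#*-}       # drop the cpu field, e.g. "x86_64-"
  host_os=${host_os#*-}    # drop the vendor field, e.g. "pc-"
  case "$host_os" in
    *-musl*) echo musl ;;
    *)       echo not-musl ;;
  esac
}
```

For example, 'is_musl_host x86_64-pc-linux-musl' prints "musl", while
'is_musl_host x86_64-pc-linux-gnu' prints "not-musl".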

The next steps of the investigation are: In the first case,
  - What did the "checking for host system type..." line look like?
  - Which of the environment variables CC_FOR_BUILD, HOST_CC, CC,
    CONFIG_SITE did you have defined, and to which values?

> Or perhaps gnulib should perform runtime testing to detect the need for
> a hack? Here is how musl mangles symbols:
> 
>   https://git.musl-libc.org/cgit/musl/tree/src/locale/iconv.c#n545
> 
>     case US_ASCII:
>         if (c > 0x7f) subst: x++, c='*';
> 
> Below implements unconditional workaround.

Thanks for the suggestion. But we try to limit the performance implications
of hacks/workarounds needed for one platform (here: musl) on other platforms
(especially glibc platforms).

Bruno

* Re: gnulib does not always detect need for iconv() hack on musl
  2021-10-17 17:18 ` Bruno Haible
@ 2021-10-17 18:13   ` Sergei Trofimovich
  2021-10-17 19:27     ` Bruno Haible
  0 siblings, 1 reply; 6+ messages in thread
From: Sergei Trofimovich @ 2021-10-17 18:13 UTC (permalink / raw)
  To: Bruno Haible, bug-gnulib

On Sun, Oct 17, 2021 at 07:18:51PM +0200, Bruno Haible wrote:
> Hello Sergei,
> 
> Sergei Trofimovich wrote:
> > The following fails bison-3.8.2 tests:
> >     $ ./configure && make && make check
> > The following succeeds:
> >     $ ./configure --host=x86_64-unknown-linux-musl && make && make check
> > 
> > The failure happens due to unexpected '*' output in report logs instead
> > of '%empty' on 'ASCII' locales.
> > 
> > These unexpected '*' pop back again because gnulib relies on '--host='
> > parameter for './configure' to detect musl target (for lack of better
> > signal?):
> > 
> >   https://git.savannah.gnu.org/cgit/gnulib.git/tree/m4/musl.m4#n16
> > 
> >     case "$host_os" in
> >       *-musl*) AC_DEFINE([MUSL_LIBC], [1], [Define to 1 on musl libc.]) ;;
> > 
> >   https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/unicodeio.c#n151
> > 
> >     /* FreeBSD iconv(), NetBSD iconv(), and Solaris 11 iconv() insert
> >        a '?' if they cannot convert.  */
> >     # if !defined _LIBICONV_VERSION
> >               || (res > 0 && outptr - outbuf == 1 && *outbuf == '?')
> >     # endif
> >       /* musl libc iconv() inserts a '*' if it cannot convert.  */
> >     # if !defined _LIBICONV_VERSION && MUSL_LIBC
> >               || (res > 0 && outptr - outbuf == 1 && *outbuf == '*')
> >     # endif
> >          )
> >         return failure (code, NULL, callback_arg);
> > 
> > What do you think of enabling the workaround regardless of MUSL_LIBC
> > define?
> 
> The MUSL_LIBC symbol is supposed to be set on musl platforms; this is
> what musl.m4 is for. The difference between your two invocations is that
> in the first case, it used a $host triple inferred by config.guess,
> while in the second case, it used the $host that you specified on the
> command line.
> 
> When I try your two commands (just the configure step), the first one
> prints
>   checking for host system type... x86_64-pc-linux-musl
> while the second one prints
>   checking for host system type... x86_64-unknown-linux-musl
> 
> The next steps of the investigation are: In the first case,
>   - What did the "checking for host system type..." line look like?
>   - Which of the environment variables CC_FOR_BUILD, HOST_CC, CC,
>     CONFIG_SITE did you have defined, and to which values?

Aha, 'config.guess' clearly detects wrong libc here:

  checking build system type... x86_64-pc-linux-gnu
  checking host system type... x86_64-pc-linux-gnu

I did not realize 'config.guess' has the code to detect libc but it
clearly does. I'll dig from there and complain elsewhere.

Thank you!

-- 

  Sergei



* Re: gnulib does not always detect need for iconv() hack on musl
  2021-10-17 18:13   ` Sergei Trofimovich
@ 2021-10-17 19:27     ` Bruno Haible
  2021-10-18  0:27       ` Bruno Haible
  0 siblings, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-10-17 19:27 UTC (permalink / raw)
  To: bug-gnulib, Sergei Trofimovich

Sergei Trofimovich wrote:
> Aha, 'config.guess' clearly detects wrong libc here:
> 
>   checking build system type... x86_64-pc-linux-gnu
>   checking host system type... x86_64-pc-linux-gnu

Yes, for a musl system, that's wrong.

The problem may come from your environment. Which of the environment
variables CC_FOR_BUILD, HOST_CC, CC, CONFIG_SITE did you have defined,
and to which values?

> I did not realize 'config.guess' has the code to detect libc but it
> clearly does. I'll dig from there and complain elsewhere.

The mailing list is https://lists.gnu.org/mailman/listinfo/config-patches .

The current code in config.guess is a heuristic (that has been working
on Alpine Linux up to 3.13), because the musl libc people refuse to have
their libc identify itself. [1]

Bruno

[1] https://wiki.musl-libc.org/faq.html#Q:-Why-is-there-no-%3Ccode%3E__MUSL__%3C/code%3E-macro?


* Re: gnulib does not always detect need for iconv() hack on musl
  2021-10-17 19:27     ` Bruno Haible
@ 2021-10-18  0:27       ` Bruno Haible
  2021-10-18  8:16         ` Sergei Trofimovich
  0 siblings, 1 reply; 6+ messages in thread
From: Bruno Haible @ 2021-10-18  0:27 UTC (permalink / raw)
  To: Sergei Trofimovich; +Cc: bug-gnulib

> The current code in config.guess is a heuristic (that has been working
> on Alpine Linux up to 3.13)

It works also in Alpine Linux 3.14.2. Which distro are you using?

Bruno


* Re: gnulib does not always detect need for iconv() hack on musl
  2021-10-18  0:27       ` Bruno Haible
@ 2021-10-18  8:16         ` Sergei Trofimovich
  0 siblings, 0 replies; 6+ messages in thread
From: Sergei Trofimovich @ 2021-10-18  8:16 UTC (permalink / raw)
  To: Bruno Haible; +Cc: bug-gnulib

On Mon, Oct 18, 2021 at 02:27:38AM +0200, Bruno Haible wrote:
> > The current code in config.guess is a heuristic (that has been working
> > on Alpine Linux up to 3.13)
> 
> It works also in Alpine Linux 3.14.2. Which distro are you using?

I'm trying to use it on NixOS. I think I tracked it down to an infelicity
of the bootstrap environment.

For most packages config.guess returns the correct value:

    $ nix develop -f. pkgsMusl.bison
    $ unpackPhase
    $ cd bison-3.7.6
    $ ./build-aux/config.guess
    x86_64-pc-linux-musl

But for packages that use the bootstrap toolchain the detection fails:

   # don't know how to get better environment against bootstrap toolchain
   $ nix develop /nix/store/iwlhpwbfmr6v5mh0g6iabl3161am5gdd-bison-3.8.2.drv
   $ unpackPhase
   $ cd bison-3.8.2
   $ ./build-aux/config.guess
   x86_64-pc-linux-gnu

When I compare the two, the difference is in the expansion of
    #include <stdarg.h>
(exactly what 'config.guess' probes).

In the good case 'stdarg.h' from musl is used:

    $ echo '#include <stdarg.h>' | gcc -E - | unnix
    # 1 "<stdin>"
    # 1 "<built-in>"
    # 1 "<command-line>"
    # 1 "/<<NIX>>/musl-1.2.2-dev/include/stdc-predef.h" 1 3 4
    # 1 "<command-line>" 2
    # 1 "<stdin>"
    # 1 "/<<NIX>>/musl-1.2.2-dev/include/stdarg.h" 1 3 4
    # 10 "/<<NIX>>/musl-1.2.2-dev/include/stdarg.h" 3 4
    # 1 "/<<NIX>>/musl-1.2.2-dev/include/bits/alltypes.h" 1 3 4
    # 326 "/<<NIX>>/musl-1.2.2-dev/include/bits/alltypes.h" 3 4
    
    # 326 "/<<NIX>>/musl-1.2.2-dev/include/bits/alltypes.h" 3 4
    typedef __builtin_va_list va_list;
    # 11 "/<<NIX>>/musl-1.2.2-dev/include/stdarg.h" 2 3 4
    # 2 "<stdin>" 2

In the bad case we use gcc's wrapper of 'stdarg.h':

    $ echo '#include <stdarg.h>' | gcc -E - | unnix
    # 1 "<stdin>"
    # 1 "<built-in>"
    # 1 "<command-line>"
    # 1 "/<<NIX>>/bootstrap-tools/include-libc/stdc-predef.h" 1 3 4
    # 1 "<command-line>" 2
    # 1 "<stdin>"
    # 1 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 1 3
    # 40 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 3
    
    # 40 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 3
    typedef __builtin_va_list __gnuc_va_list;
    # 99 "/<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include/stdarg.h" 3
    typedef __gnuc_va_list va_list;
    # 1 "<stdin>" 2

I think it happens because NixOS's bootstrap toolchain uses a slightly
different include order:

Good case:

  /<<NIX>>/gcc-10.3.0/include
  /<<NIX>>/musl-1.2.2-dev/include
    # ^ <<<- picked 'stdarg.h' from here, musl version
  /<<NIX>>/gcc-10.3.0/lib/gcc/x86_64-unknown-linux-musl/10.3.0/include
  /<<NIX>>/gcc-10.3.0/lib/gcc/x86_64-unknown-linux-musl/10.3.0/include-fixed

Bad case:

  /<<NIX>>/bootstrap-tools/bin/../lib/gcc/x86_64-unknown-linux-musl/7.3.0/include
    # ^ <<<- picked stdarg from here, gcc version
  /<<NIX>>/bootstrap-tools/bin/../lib/gcc/../../include
  /<<NIX>>/bootstrap-stage0-musl-bootstrap/include
  /<<NIX>>/bootstrap-tools/lib/gcc/x86_64-unknown-linux-musl/7.3.0/include-fixed
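
For anyone who wants to reproduce such lists: gcc prints its system
include search order on stderr when run with '-v'. A small sketch of a
filter for that output follows. The function name is hypothetical, and
it assumes the usual "#include <...> search starts here:" and
"End of search list." marker lines.

```shell
# Hypothetical helper: read verbose compiler output on stdin and print
# the <...> include search directories, one per line, in search order.
include_search_dirs() {
  awk '
    /^End of search list\./ { in_list = 0 }
    in_list { sub(/^ +/, ""); print }
    /^#include <\.\.\.> search starts here:/ { in_list = 1 }
  '
}
```

Typical use would be:

    $ gcc -E -v - </dev/null 2>&1 >/dev/null | include_search_dirs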

Perhaps the "Bad case" is the more natural include order, as 'gcc' tries
hard nowadays to isolate standard headers from accidental namespace
pollution (like the '__DEFINED_va_list' define config.guess searches for).
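
The probe discussed above can be approximated in a few lines of shell.
This is a sketch modeled loosely on the heuristic, not the actual
config.guess code: the function name and the 'cc' default are
assumptions, and the real script checks for other libcs first.

```shell
# Hypothetical helper: preprocess a snippet that includes <stdarg.h> and
# report whether the musl-specific __DEFINED_va_list macro became visible.
# $1 is the compiler to probe (assumed default: cc).
guess_libc() {
  cc=${1:-cc}
  result=$(
    printf '%s\n' \
      '#include <stdarg.h>' \
      '#ifdef __DEFINED_va_list' \
      'LIBC=musl' \
      '#else' \
      'LIBC=unknown' \
      '#endif' |
    "$cc" -E - 2>/dev/null | grep '^LIBC='
  )
  # Fall back to "unknown" if the compiler failed or printed nothing useful.
  echo "${result:-LIBC=unknown}"
}
```

With the include order of the "Bad case", gcc's own 'stdarg.h' is found
first, the macro is never defined, and a probe like this reports
"LIBC=unknown" even though the libc is musl.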

I'll bring it to NixOS first to find out what the intended order is here.

-- 

  Sergei

