From: Bruno Haible <bruno@clisp.org>
To: bug-gnulib@gnu.org
Cc: "Tim Rühsen" <tim.ruehsen@gmx.de>
Subject: Re: u8_strconv_to_locale() misbehaves on OSX (Travis CI runner)
Date: Thu, 08 Feb 2018 18:05:34 +0100
Message-ID: <2057454.6PEsBuzmPS@omega>
In-Reply-To: <bbbc3c3e-ff14-ad5f-5874-dba3d7fb3df6@gmx.de>

Hi Tim,

> locale_charset() returns with "UTF-8".

That is as it should be on Mac OS X.

> u8_strconv_to_locale() and u8_strconv_from_locale() seem not to work as
> expected:
> 
> 
> One problem seems to be that u8_strconv_to_locale() outputs decomposed
> characters, e.g. u8_strconv_to_locale(bücher.de) returns b"ucher.de.
> 
> Hex/u32:
> 
> Result: U+0062 U+0022 U+0075 U+0063 U+0068 U+0065 U+0072 U+002e U+0064
> U+0065
> 
> Expected: U+0062 U+00fc U+0063 U+0068 U+0065 U+0072 U+002e U+0064 U+0065

This would indicate that locale_charset() returns "ASCII".
What happens then is that u8_strconv_to_locale invokes
u8_strconv_to_encoding, which invokes mem_iconveha with transliterate=true,
which appends '//TRANSLIT' when invoking iconv_open. So you get the
transliteration, e.g. from 'ü' to '"u'.
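The effect can be reproduced outside of gnulib with plain iconv. What follows
is only a minimal sketch, not the mem_iconveha code: the helper name
utf8_to_charset_translit is made up for illustration, and the exact
replacement you get for 'ü' (for example '"u', 'u' or '?') depends on the
iconv implementation in use.

```c
#include <assert.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of gnulib or libunistring):
   convert the NUL-terminated UTF-8 string SRC to TO_CHARSET, asking
   iconv to transliterate characters the target cannot represent.
   Returns 0 on success, -1 on failure.  */
static int
utf8_to_charset_translit (const char *to_charset, const char *src,
                          char *dst, size_t dst_size)
{
  char tocode[64];
  int n;

  assert (dst_size > 0);

  /* Build e.g. "ASCII//TRANSLIT" or "ISO-8859-1//TRANSLIT".  */
  n = snprintf (tocode, sizeof tocode, "%s//TRANSLIT", to_charset);
  if (n < 0 || (size_t) n >= sizeof tocode)
    return -1;

  iconv_t cd = iconv_open (tocode, "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;

  char *in = (char *) src;
  size_t in_left = strlen (src);
  char *out = dst;
  size_t out_left = dst_size - 1;   /* keep room for the final NUL */

  size_t res = iconv (cd, &in, &in_left, &out, &out_left);
  if (res != (size_t) -1)
    /* Flush any pending shift state.  */
    res = iconv (cd, NULL, NULL, &out, &out_left);
  iconv_close (cd);

  if (res == (size_t) -1)
    return -1;
  *out = '\0';
  return 0;
}
```

Calling it with "ASCII" and the UTF-8 bytes of "bücher.de", then with
"ISO-8859-1" and a string containing characters beyond U+00FF, should show
whether the output matches what you observed. On Mac OS X this may need to be
linked with -liconv.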

> The second problem is that characters beyond 255 are translated into ?
> (U+003f).

This would indicate that locale_charset() returns "ISO-8859-1". The
question marks then come from the transliteration, again.

> Do you have any hints how to fix these problems ?

I would compile without -O and with -ggdb, then single-step through the code,
paying particular attention to the value of locale_charset() and to
the arguments of iconv_open().
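If a debugger is not at hand on the CI runner, a small check like the one
below can print what the C library reports. This is a sketch under
assumptions: it uses nl_langinfo (CODESET) as a stand-in for gnulib's
locale_charset(), which may map or canonicalize the name differently, and the
function names current_codeset and report_locale are invented here.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: the codeset name of the current locale, as
   reported by the C library.  This is not gnulib's locale_charset().  */
static const char *
current_codeset (void)
{
  const char *cs = nl_langinfo (CODESET);
  assert (cs != NULL);
  return cs;
}

/* Hypothetical helper: print the codeset before and after the locale
   is taken from the environment.  Before setlocale() is called, the
   program runs in the "C" locale.  */
static void
report_locale (FILE *out)
{
  fprintf (out, "codeset before setlocale: %s\n", current_codeset ());

  const char *name = setlocale (LC_ALL, "");
  fprintf (out, "setlocale (LC_ALL, \"\"): %s\n",
           name != NULL ? name : "(failed)");

  fprintf (out, "codeset after setlocale:  %s\n", current_codeset ());
}
```

Calling report_locale (stderr) at the start of the failing test shows whether
the runner's environment (LANG, LC_ALL, LC_CTYPE) really selects a UTF-8
locale for the process.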

> I would expect u8_strconv_to_locale() to work in a defined manner on
> UTF-8 locales

That's certainly how it is intended to be.

Bruno




Thread overview: 4+ messages
2018-02-08 15:59 u8_strconv_to_locale() misbehaves on OSX (Travis CI runner) Tim Rühsen
2018-02-08 17:05 ` Bruno Haible [this message]
2018-02-08 19:22   ` Tim Ruehsen
2024-02-23 18:52     ` Bruno Haible
