bug-gnulib@gnu.org mirror (unofficial)
* localcharset, nl_langinfo: fix return value for UTF-8 locales on MSVC
From: Bruno Haible @ 2019-07-02 18:56 UTC (permalink / raw)
  To: bug-gnulib

On MSVC (on Windows 10), I'm seeing this test failure:

FAIL: test-nl_langinfo.sh
=========================

C:\testdir-posix-msvc\gltests\test-nl_langinfo.c:57: assertion 'c_strcasecmp (codeset, "UTF-8") == 0 || c_strcasecmp (codeset, "UTF8") == 0' failed
FAIL test-nl_langinfo.sh (exit status: 1)

The test uses the return value of setlocale (.., NULL) in the locale named
French_France.65001. Apparently the locale name returned by setlocale is now
"French_France.utf8". This is the first sign of explicit support for a UTF-8
locale in the Microsoft runtime library!

nl_langinfo.c "canonicalizes" this result to "CPutf8", which is nonsense. It
should be "UTF-8".

Likewise in localcharset.c.
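
To illustrate the intended mapping outside of the gnulib sources, here is a
minimal, portable sketch. The helper name codeset_from_locale_name and its
buffer-passing interface are hypothetical and exist only for this example; the
real code works on a static buffer and falls back to GetACP (), which this
sketch leaves to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of gnulib: derive a codeset name from a
   Windows-style locale name such as "French_France.1252".
   Returns NULL if the name has no codeset suffix or BUF is too small;
   a real implementation would then fall back to the ANSI code page.  */
static const char *
codeset_from_locale_name (const char *locale_name, char *buf, size_t bufsize)
{
  assert (locale_name != NULL && buf != NULL && bufsize >= 3);

  /* The codeset suffix follows the last '.' in the locale name.  */
  const char *dot = strrchr (locale_name, '.');
  if (dot == NULL || dot[1] == '\0')
    return NULL;

  /* Both spellings denote UTF-8: "65001" is the code page number,
     "utf8" is what setlocale returns on Windows 10.  */
  if (strcmp (dot + 1, "65001") == 0 || strcmp (dot + 1, "utf8") == 0)
    return "UTF-8";

  /* Any other suffix is treated as a code page number: "1252" -> "CP1252".  */
  int n = snprintf (buf, bufsize, "CP%s", dot + 1);
  if (n < 0 || (size_t) n >= bufsize)
    return NULL;
  return buf;
}
```

With this sketch, both "French_France.65001" and "French_France.utf8" map to
"UTF-8", while "French_France.1252" still maps to "CP1252".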


2019-07-02  Bruno Haible  <bruno@clisp.org>

	localcharset, nl_langinfo: Fix return value for UTF-8 locales on MSVC.
	* lib/localcharset.c (locale_charset): Return "UTF-8" instead of
	"CPutf8".
	* lib/nl_langinfo.c (ctype_codeset): Likewise.

diff --git a/lib/localcharset.c b/lib/localcharset.c
index 80a20b1..173d116 100644
--- a/lib/localcharset.c
+++ b/lib/localcharset.c
@@ -787,7 +787,12 @@ locale_charset (void)
         encoding is the best bet.  */
       sprintf (buf, "CP%u", GetACP ());
     }
-  codeset = buf;
+  /* For a locale name such as "French_France.65001", in Windows 10,
+     setlocale now returns "French_France.utf8" instead.  */
+  if (strcmp (buf + 2, "65001") == 0 || strcmp (buf + 2, "utf8") == 0)
+    codeset = "UTF-8";
+  else
+    codeset = buf;
 
 # elif defined OS2
 
diff --git a/lib/nl_langinfo.c b/lib/nl_langinfo.c
index e8a5595..76579a8 100644
--- a/lib/nl_langinfo.c
+++ b/lib/nl_langinfo.c
@@ -76,9 +76,15 @@ ctype_codeset (void)
     memmove (buf + 2, codeset, codesetlen + 1);
   else
     sprintf (buf + 2, "%u", GetACP ());
-  codeset = memcpy (buf, "CP", 2);
-# endif
+  /* For a locale name such as "French_France.65001", in Windows 10,
+     setlocale now returns "French_France.utf8" instead.  */
+  if (strcmp (buf + 2, "65001") == 0 || strcmp (buf + 2, "utf8") == 0)
+    return "UTF-8";
+  else
+    return memcpy (buf, "CP", 2);
+# else
   return codeset;
+#endif
 }
 #endif
 


