From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS22989 209.51.188.0/24 X-Spam-Status: No, score=-3.6 required=3.0 tests=AWL,BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.6 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id C01101F47C for ; Mon, 16 Jan 2023 14:58:00 +0000 (UTC) Authentication-Results: dcvr.yhbt.net; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=clisp.org header.i=@clisp.org header.a=rsa-sha256 header.s=strato-dkim-0002 header.b=PNjoEHNt; dkim-atps=neutral Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHQvR-0006Cg-3H; Mon, 16 Jan 2023 09:57:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHQvP-0006BN-P0 for bug-gnulib@gnu.org; Mon, 16 Jan 2023 09:57:27 -0500 Received: from mo4-p00-ob.smtp.rzone.de ([81.169.146.217]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHQvM-0002nc-LH for bug-gnulib@gnu.org; Mon, 16 Jan 2023 09:57:27 -0500 ARC-Seal: i=1; a=rsa-sha256; t=1673881041; cv=none; d=strato.com; s=strato-dkim-0002; b=HQ+xasBOZ6E8f2eII8BxBRQVvUZmy3Hp27Ux1CwvEYJKDM7LQ0vg+Hx4Ml0cVXICSl BU4ZaS36GrxKEaFjd/Ju9WB6t+PcsZpeRh8EEKnBdspTvDxmKk+X3EvVZPxS0El6xCjz I3Lz0Yr3Wc6bYFSC5+pD2lQQ/67NMIVWx134XiR+3FRoMvQwZbnEwL9c51PnjWEGdpnL BMjpoJhb00Uxutyjjg4O4Hd+kqwxThPW/CJDPfrezLV1NGJTAicrGQsDJ0H02EIqkTCW l4m62aMdLhz11FMquTV7IHVNDmgqzRLztIfM3cMletcdtaCdg3vOvYQmO0JcH43p2TX/ j8gw== 
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; t=1673881041; s=strato-dkim-0002; d=strato.com; h=Message-ID:Date:Subject:To:From:Cc:Date:From:Subject:Sender; bh=yvibshEdbjJNwH+024xzycxYQZ38yiKqLQNtYUZH9tI=; b=srqzHgyyyW6FBjEHPBWT5O9wNrFnEwv7fpPpdEHVNm/c6yjrS9L3LJz4OGZ1ukVmRf CM7j16n16tRQMO4bcbJ0/wBDhMTA/BlpOOQukp73a45hgTnSWa0yZR2tMK1vJJ2DD9W6 tfXDTHKoZKgJKpdSASBouz6dFUO2nvzAJHekd5kz3EEHMwPqmLgA0Q4l4kC3jGPrfetA mz7cQExAxTE15jJD8iIeD9kjAI11PzMi09hz9eZdDAljf9vc5foxDj2bOS02T3cfrxeV rMgRH3+/tedT4xJBSvFzTrot1IQp+ZbeBZ2EK/UXmdv6BW5ADD6M/Ao7nHYMxW3hP/H6 pYKA== ARC-Authentication-Results: i=1; strato.com; arc=none; dkim=none X-RZG-CLASS-ID: mo00 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; t=1673881041; s=strato-dkim-0002; d=clisp.org; h=Message-ID:Date:Subject:To:From:Cc:Date:From:Subject:Sender; bh=yvibshEdbjJNwH+024xzycxYQZ38yiKqLQNtYUZH9tI=; b=PNjoEHNtGdy58md7vDV52akZWqvoLFIXDzjc4pq9nBLfBkJkheGoQTAa2LWLyziKsx Hbe0jf8LsQJNjlUAJ7D4HOTYHpm8Ug8/5n5+gWsLfgAokiDc3dQs4ZJK2/6gsnUEweqR fq6yvfknpzYieq+XYePAraQXCmQOos0IkpQw7lrgxGnQ5Gn/F4D/Ky7jC/k6t2IB1J3x Fh4qZLR+1Yrms7zAvw8zbaS7EXsFel7g7shLXB7qTi5zptGMhV8rdPE2f6tZ+JEky3oW NMoSa/QuZVkJ7PSXbcN2XBlZmq2xCO2Q1KQ7m/ivOwH+1/9GEssTSYCbcDIzb02VYmV/ 4acA== X-RZG-AUTH: ":Ln4Re0+Ic/6oZXR1YgKryK8brlshOcZlIWs+iCP5vnk6shH0WWb0LN8XZoH94zq68+3cfpOejvVUYts2exEnhvWSurYgn7u6IQ==" Received: from nimes.localnet by smtp.strato.de (RZmta 48.6.2 AUTH) with ESMTPSA id I8f358z0GEvLNnD (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256 bits)) (Client did not present a certificate); Mon, 16 Jan 2023 15:57:21 +0100 (CET) From: Bruno Haible To: bug-gnulib@gnu.org Subject: Android and the C locale Date: Mon, 16 Jan 2023 15:57:21 +0100 Message-ID: <1990639.uacIGzncQW@nimes> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="nextPart3204088.cIm3N6CGri" Content-Transfer-Encoding: 7Bit Received-SPF: none client-ip=81.169.146.217; envelope-from=bruno@clisp.org; helo=mo4-p00-ob.smtp.rzone.de X-Spam_score_int: -20 X-Spam_score: -2.1 
X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_PASS=-0.001, SPF_NONE=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: bug-gnulib@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gnulib discussion list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnulib-bounces+normalperson=yhbt.net@gnu.org Sender: bug-gnulib-bounces+normalperson=yhbt.net@gnu.org This is a multi-part message in MIME format. --nextPart3204088.cIm3N6CGri Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="UTF-8" Android < 5.0 had only dummy locales. Starting with Android 5.0 (according to the Android libc's git history), they have locales. But there are two problems: 1) The default locale (i.e. the locale in use when setlocale was not called) is the "C.UTF-8" locale, not the "C" locale. Test case: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D #include <locale.h> #include <stdio.h> #include <stdlib.h> int main () { printf ("Locale=3D|%s| LC_CTYPE=3D|%s| MB_CUR_MAX=3D%d\n", setlocale (LC_ALL, NULL), setlocale (LC_CTYPE, NULL), (int) MB_CU= R_MAX); } =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D prints Locale=3D|C.UTF-8| LC_CTYPE=3D|C.UTF-8| MB_CUR_MAX=3D4 rather than the expected Locale=3D|C| LC_CTYPE=3D|C| MB_CUR_MAX=3D1 POSIX says that the default locale should be the "C"/"POSIX" locale. 2) A setlocale call that is meant to set the "C" or "POSIX" locale actually sets a locale with UTF-8 encoding. 
Test case 1: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D #include <locale.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <wchar.h> int main () { mbstate_t state; if (setlocale (LC_ALL, "") =3D=3D NULL) return 1; memset (&state, '\0', sizeof (state)); printf ("Locale=3D|%s| LC_CTYPE=3D|%s| MB_CUR_MAX=3D%d mbrtowc(0xC0)=3D%d= \n", setlocale (LC_ALL, NULL), setlocale (LC_CTYPE, NULL), (int) MB_CU= R_MAX, (int) mbrtowc (NULL, "\xC0", 1, &state)); } =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D $ LC_ALL=3DC ./a.out and $ LC_ALL=3DPOSIX ./a.out print Locale=3D|C.UTF-8| LC_CTYPE=3D|C.UTF-8| MB_CUR_MAX=3D4 mbrtowc(0xC0)=3D-2 rather than the expected Locale=3D|C| LC_CTYPE=3D|C| MB_CUR_MAX=3D1 mbrtowc(0xC0)=3D-1 Test case 2: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D #include <locale.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <wchar.h> int main () { mbstate_t state; if (setlocale (LC_ALL, "C") =3D=3D NULL) return 1; memset (&state, '\0', sizeof (state)); printf ("Locale=3D|%s| LC_CTYPE=3D|%s| MB_CUR_MAX=3D%d mbrtowc(0xC0)=3D%d= \n", setlocale (LC_ALL, NULL), setlocale (LC_CTYPE, NULL), (int) MB_CU= R_MAX, (int) mbrtowc (NULL, "\xC0", 1, &state)); if (setlocale (LC_ALL, "POSIX") =3D=3D NULL) return 1; memset (&state, '\0', sizeof (state)); printf ("Locale=3D|%s| LC_CTYPE=3D|%s| MB_CUR_MAX=3D%d mbrtowc(0xC0)=3D%d= \n", setlocale (LC_ALL, NULL), setlocale (LC_CTYPE, NULL), (int) MB_CU= R_MAX, (int) mbrtowc (NULL, "\xC0", 1, &state)); 
} =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D prints Locale=3D|C| LC_CTYPE=3D|C| MB_CUR_MAX=3D4 mbrtowc(0xC0)=3D-2 Locale=3D|C| LC_CTYPE=3D|C| MB_CUR_MAX=3D4 mbrtowc(0xC0)=3D-2 rather than the expected Locale=3D|C| LC_CTYPE=3D|C| MB_CUR_MAX=3D1 mbrtowc(0xC0)=3D-1 Locale=3D|C| LC_CTYPE=3D|C| MB_CUR_MAX=3D1 mbrtowc(0xC0)=3D-1 One consequence is these two test failures: =46AIL: test-mbrtoc32-5.sh =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =2E./../gltests/test-mbrtoc32.c:105: assertion 'ret =3D=3D 1' failed Aborted =46AIL test-mbrtoc32-5.sh (exit status: 134) =46AIL: test-mbrtowc5.sh =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =2E./../gltests/test-mbrtowc.c:105: assertion 'ret =3D=3D 1' failed Aborted =46AIL test-mbrtowc5.sh (exit status: 134) As a workaround, I'm applying these two patches. 2023-01-16 Bruno Haible mbrtowc, mbrtoc32 tests: Avoid test failure on Android =E2=89=A5 5.0. * tests/test-mbrtowc.c (main): On Android 5.0 or newer, when testing the "C" locale, verify that the encoding is UTF-8. * tests/test-mbrtoc32.c (main): Likewise. * doc/posix-functions/setlocale.texi: Mention the Android problems. mbrtowc, mbrtoc32 tests: Refactor. * tests/test-mbrtowc.c (main): Straighten convoluted code. * tests/test-mbrtoc32.c (main): Likewise. --nextPart3204088.cIm3N6CGri Content-Disposition: attachment; filename="0001-mbrtowc-mbrtoc32-tests-Refactor.patch" Content-Transfer-Encoding: 7Bit Content-Type: text/x-patch; charset="UTF-8"; name="0001-mbrtowc-mbrtoc32-tests-Refactor.patch" >From 1ca5866371acd6b4bdcb1913d18cc14b7a8528c1 Mon Sep 17 00:00:00 2001 From: Bruno Haible Date: Mon, 16 Jan 2023 14:30:06 +0100 Subject: [PATCH 1/2] mbrtowc, mbrtoc32 tests: Refactor. 
* tests/test-mbrtowc.c (main): Straighten convoluted code. * tests/test-mbrtoc32.c (main): Likewise. --- ChangeLog | 6 +++++ tests/test-mbrtoc32.c | 54 ++++++++++++++++++++++++++++++------------- tests/test-mbrtowc.c | 54 ++++++++++++++++++++++++++++++------------- 3 files changed, 82 insertions(+), 32 deletions(-) diff --git a/ChangeLog b/ChangeLog index 9bc953423f..045e1c6247 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2023-01-16 Bruno Haible + + mbrtowc, mbrtoc32 tests: Refactor. + * tests/test-mbrtowc.c (main): Straighten convoluted code. + * tests/test-mbrtoc32.c (main): Likewise. + 2023-01-16 Paul Eggert sigpipe tests: Modernize use of 'head'. diff --git a/tests/test-mbrtoc32.c b/tests/test-mbrtoc32.c index c8f735d520..36b520f7b8 100644 --- a/tests/test-mbrtoc32.c +++ b/tests/test-mbrtoc32.c @@ -72,10 +72,6 @@ main (int argc, char *argv[]) for (c = 0; c < 0x100; c++) switch (c) { - default: - if (! (c && 1 < argc && argv[1][0] == '5')) - break; - FALLTHROUGH; case '\t': case '\v': case '\f': case ' ': case '!': case '"': case '#': case '%': case '&': case '\'': case '(': case ')': case '*': @@ -97,25 +93,23 @@ main (int argc, char *argv[]) case 'p': case 'q': case 'r': case 's': case 't': case 'u': case 'v': case 'w': case 'x': case 'y': case 'z': case '{': case '|': case '}': case '~': - /* c is in the ISO C "basic character set", or argv[1] starts - with '5' so we are testing all nonnull bytes. */ + /* c is in the ISO C "basic character set". */ + ASSERT (c < 0x80); + /* c is an ASCII character. */ buf[0] = c; + wc = (char32_t) 0xBADFACE; ret = mbrtoc32 (&wc, buf, 1, &state); ASSERT (ret == 1); - if (c < 0x80) - /* c is an ASCII character. */ - ASSERT (wc == c); - else - /* argv[1] starts with '5', that is, we are testing the C or POSIX - locale. - On most platforms, the bytes 0x80..0xFF map to U+0080..U+00FF. - But on musl libc, the bytes 0x80..0xFF map to U+DF80..U+DFFF. */ - ASSERT (wc == (btowc (c) == 0xDF00 + c ? 
btowc (c) : c)); + ASSERT (wc == c); ASSERT (mbsinit (&state)); + ret = mbrtoc32 (NULL, buf, 1, &state); ASSERT (ret == 1); ASSERT (mbsinit (&state)); + + break; + default: break; } } @@ -368,7 +362,35 @@ main (int argc, char *argv[]) return 0; case '5': - /* C locale; tested above. */ + /* C or POSIX locale. */ + { + int c; + char buf[1]; + + memset (&state, '\0', sizeof (mbstate_t)); + for (c = 0; c < 0x100; c++) + if (c != 0) + { + /* We are testing all nonnull bytes. */ + buf[0] = c; + + wc = (char32_t) 0xBADFACE; + ret = mbrtoc32 (&wc, buf, 1, &state); + ASSERT (ret == 1); + if (c < 0x80) + /* c is an ASCII character. */ + ASSERT (wc == c); + else + /* On most platforms, the bytes 0x80..0xFF map to U+0080..U+00FF. + But on musl libc, the bytes 0x80..0xFF map to U+DF80..U+DFFF. */ + ASSERT (wc == (btowc (c) == 0xDF00 + c ? btowc (c) : c)); + ASSERT (mbsinit (&state)); + + ret = mbrtoc32 (NULL, buf, 1, &state); + ASSERT (ret == 1); + ASSERT (mbsinit (&state)); + } + } return 0; } diff --git a/tests/test-mbrtowc.c b/tests/test-mbrtowc.c index 9019ea0e71..b358d8d583 100644 --- a/tests/test-mbrtowc.c +++ b/tests/test-mbrtowc.c @@ -72,10 +72,6 @@ main (int argc, char *argv[]) for (c = 0; c < 0x100; c++) switch (c) { - default: - if (! (c && 1 < argc && argv[1][0] == '5')) - break; - FALLTHROUGH; case '\t': case '\v': case '\f': case ' ': case '!': case '"': case '#': case '%': case '&': case '\'': case '(': case ')': case '*': @@ -97,25 +93,23 @@ main (int argc, char *argv[]) case 'p': case 'q': case 'r': case 's': case 't': case 'u': case 'v': case 'w': case 'x': case 'y': case 'z': case '{': case '|': case '}': case '~': - /* c is in the ISO C "basic character set", or argv[1] starts - with '5' so we are testing all nonnull bytes. */ + /* c is in the ISO C "basic character set". */ + ASSERT (c < 0x80); + /* c is an ASCII character. 
*/ buf[0] = c; + wc = (wchar_t) 0xBADFACE; ret = mbrtowc (&wc, buf, 1, &state); ASSERT (ret == 1); - if (c < 0x80) - /* c is an ASCII character. */ - ASSERT (wc == c); - else - /* argv[1] starts with '5', that is, we are testing the C or POSIX - locale. - On most platforms, the bytes 0x80..0xFF map to U+0080..U+00FF. - But on musl libc, the bytes 0x80..0xFF map to U+DF80..U+DFFF. */ - ASSERT (wc == (btowc (c) == 0xDF00 + c ? btowc (c) : c)); + ASSERT (wc == c); ASSERT (mbsinit (&state)); + ret = mbrtowc (NULL, buf, 1, &state); ASSERT (ret == 1); ASSERT (mbsinit (&state)); + + break; + default: break; } } @@ -349,7 +343,35 @@ main (int argc, char *argv[]) return 0; case '5': - /* C locale; tested above. */ + /* C or POSIX locale. */ + { + int c; + char buf[1]; + + memset (&state, '\0', sizeof (mbstate_t)); + for (c = 0; c < 0x100; c++) + if (c != 0) + { + /* We are testing all nonnull bytes. */ + buf[0] = c; + + wc = (wchar_t) 0xBADFACE; + ret = mbrtowc (&wc, buf, 1, &state); + ASSERT (ret == 1); + if (c < 0x80) + /* c is an ASCII character. */ + ASSERT (wc == c); + else + /* On most platforms, the bytes 0x80..0xFF map to U+0080..U+00FF. + But on musl libc, the bytes 0x80..0xFF map to U+DF80..U+DFFF. */ + ASSERT (wc == (btowc (c) == 0xDF00 + c ? 
btowc (c) : c)); + ASSERT (mbsinit (&state)); + + ret = mbrtowc (NULL, buf, 1, &state); + ASSERT (ret == 1); + ASSERT (mbsinit (&state)); + } + } return 0; } -- 2.34.1 --nextPart3204088.cIm3N6CGri Content-Disposition: attachment; filename="0002-mbrtowc-mbrtoc32-tests-Avoid-test-failure-on-Android.patch" Content-Transfer-Encoding: quoted-printable Content-Type: text/x-patch; charset="UTF-8"; name="0002-mbrtowc-mbrtoc32-tests-Avoid-test-failure-on-Android.patch" =46rom 653bc7d23e08ab61ee2382f8773f0a95d93ab871 Mon Sep 17 00:00:00 2001 =46rom: Bruno Haible Date: Mon, 16 Jan 2023 14:34:56 +0100 Subject: [PATCH 2/2] =3D?UTF-8?q?mbrtowc,=3D20mbrtoc32=3D20tests:=3D20Avoid= =3D20test?=3D =3D?UTF-8?q?=3D20failure=3D20on=3D20Android=3D20=3DE2=3D89=3DA5=3D205.0.?= =3D MIME-Version: 1.0 Content-Type: text/plain; charset=3DUTF-8 Content-Transfer-Encoding: 8bit * tests/test-mbrtowc.c (main): On Android 5.0 or newer, when testing the "C" locale, verify that the encoding is UTF-8. * tests/test-mbrtoc32.c (main): Likewise. * doc/posix-functions/setlocale.texi: Mention the Android problems. =2D-- ChangeLog | 6 ++++++ doc/posix-functions/setlocale.texi | 8 +++++++- tests/test-mbrtoc32.c | 10 ++++++++++ tests/test-mbrtowc.c | 10 ++++++++++ 4 files changed, 33 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index 045e1c6247..0051e3237f 100644 =2D-- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,11 @@ 2023-01-16 Bruno Haible =20 + mbrtowc, mbrtoc32 tests: Avoid test failure on Android =E2=89=A5 5.0. + * tests/test-mbrtowc.c (main): On Android 5.0 or newer, when testing + the "C" locale, verify that the encoding is UTF-8. + * tests/test-mbrtoc32.c (main): Likewise. + * doc/posix-functions/setlocale.texi: Mention the Android problems. + mbrtowc, mbrtoc32 tests: Refactor. * tests/test-mbrtowc.c (main): Straighten convoluted code. * tests/test-mbrtoc32.c (main): Likewise. 
diff --git a/doc/posix-functions/setlocale.texi b/doc/posix-functions/setlo= cale.texi index 11364d3901..6e232200f8 100644 =2D-- a/doc/posix-functions/setlocale.texi +++ b/doc/posix-functions/setlocale.texi @@ -21,7 +21,7 @@ On Windows platforms (excluding Cygwin), @code{setlocale}= understands different locale names, that are not based on ISO 639 language names and ISO 3166 co= untry names. @item =2DOn Android 4.3, which which doesn't have locales, the @code{setlocale} f= unction +On Android < 5.0, which doesn't have locales, the @code{setlocale} function always fails. The replacement, however, supports only the locale names @code{"C"} and @code{"POSIX"}. @end itemize @@ -52,4 +52,10 @@ In addition any value is accepted for @code{LC_CTYPE}, a= nd so NULL is never returned to indicate a failure to set locale. To verify category values, each category must be set individually with @code{setlocale(LC_COLLATE,"")} etc. +@item +On Android 5.0 and newer, the default locale (i.e.@: the locale in use when +@code{setlocale} was not called) is the @code{"C.UTF-8"} locale, not the +@code{"C"} locale. Additionally, a @code{setlocale} call that is meant to= set +the @code{"C"} or @code{"POSIX"} locale actually sets an equivalent of the +@code{"C.UTF-8"} locale. @end itemize diff --git a/tests/test-mbrtoc32.c b/tests/test-mbrtoc32.c index 36b520f7b8..0d75c3db14 100644 =2D-- a/tests/test-mbrtoc32.c +++ b/tests/test-mbrtoc32.c @@ -26,6 +26,7 @@ SIGNATURE_CHECK (mbrtoc32, size_t, =20 #include #include +#include #include =20 #include "macros.h" @@ -124,6 +125,15 @@ main (int argc, char *argv[]) ASSERT (mbsinit (&state)); } =20 +#ifdef __ANDROID__ + /* On Android =E2=89=A5 5.0, the default locale is the "C.UTF-8" locale,= not the + "C" locale. Furthermore, when you attempt to set the "C" or "POSIX" + locale via setlocale(), what you get is a "C" locale with UTF-8 encod= ing, + that is, effectively the "C.UTF-8" locale. 
*/ + if (argc > 1 && strcmp (argv[1], "5") =3D=3D 0 && MB_CUR_MAX > 1) + argv[1] =3D "2"; +#endif + if (argc > 1) switch (argv[1][0]) { diff --git a/tests/test-mbrtowc.c b/tests/test-mbrtowc.c index b358d8d583..1fdf039c42 100644 =2D-- a/tests/test-mbrtowc.c +++ b/tests/test-mbrtowc.c @@ -26,6 +26,7 @@ SIGNATURE_CHECK (mbrtowc, size_t, (wchar_t *, char const = *, size_t, =20 #include #include +#include #include =20 #include "macros.h" @@ -124,6 +125,15 @@ main (int argc, char *argv[]) ASSERT (mbsinit (&state)); } =20 +#ifdef __ANDROID__ + /* On Android =E2=89=A5 5.0, the default locale is the "C.UTF-8" locale,= not the + "C" locale. Furthermore, when you attempt to set the "C" or "POSIX" + locale via setlocale(), what you get is a "C" locale with UTF-8 encod= ing, + that is, effectively the "C.UTF-8" locale. */ + if (argc > 1 && strcmp (argv[1], "5") =3D=3D 0 && MB_CUR_MAX > 1) + argv[1] =3D "2"; +#endif + if (argc > 1) switch (argv[1][0]) { =2D-=20 2.34.1 --nextPart3204088.cIm3N6CGri--