From: Carlos O'Donell via Libc-alpha <libc-alpha@sourceware.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v9 2/2] Add generic C.UTF-8 locale (Bug 17318)
Date: Sun, 5 Sep 2021 23:41:17 -0400
Message-ID: <837d13d5-fccd-0dfe-759f-910cf9a01f5d@redhat.com>
In-Reply-To: <87mtov81g2.fsf@oldenburg.str.redhat.com>
On 9/2/21 11:03 AM, Florian Weimer wrote:
> * Carlos O'Donell:
>
>> diff --git a/NEWS b/NEWS
>> index 79c895e382..807105a596 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -9,7 +9,15 @@ Version 2.35
>>
>> Major new features:
>>
>> - [Add new features here]
>> +* Support for the C.UTF-8 locale has been added to glibc. The locale
>> + supports full code-point sorting for all valid Unicode code points.
>> + A limitation in the framework for fnmatch, regexec, and regcomp requires
>> + a compromise to save space and only ASCII-based range expressions are
>> + supported for now (see bug 28255). The full size of the locale is only
>> + ~400KiB, with 346KiB coming from LC_CTYPE information for Unicode. This
>> + locale harmonizes downstream C.UTF-8 already shipping in Gentoo, Debian,
>> + Ubuntu, Fedora, CentOS Stream, and RHEL. The locale is not built into
>> + glibc, and must be installed.
>
> I would say “various downstream distributions”. You left out SUSE's
> distributions, and they have C.UTF-8 as well:
>
> <https://build.opensuse.org/package/view_file/openSUSE:Factory/glibc/glibc-c-utf8-locale.patch?expand=1>
I double-checked that implementation and it's a copy of Mike Fabian's original
that we put into Fedora/RHEL, so we are already harmonized with it, which
is good.
I've adjusted the text following your recommendation, though; it's clearer.
>> --- /dev/null
>> +++ b/iconv/tst-iconv9.c
>
>> + /* From ISO-8859-1 to ASCII. */
>
>> + /* From UTF-8 to ASCII. */
>
> Missing spaces after “.”.
Fixed.
>> diff --git a/posix/transbug.c b/posix/transbug.c
>> index d0983b4d44..71632b7976 100644
>> --- a/posix/transbug.c
>> +++ b/posix/transbug.c
>> @@ -116,14 +116,30 @@ do_test (void)
>> static const char lower[] = "[[:lower:]]+";
>> static const char upper[] = "[[:upper:]]+";
>> struct re_registers regs[4];
>> + int result;
>>
>> +#define CHECK(exp) \
>> + if (exp) { puts (#exp); result = 1; }
>> +
>> + printf ("INFO: Checking C.\n");
>> setlocale (LC_ALL, "C");
>>
>> (void) re_set_syntax (RE_SYNTAX_GNU_AWK);
>>
>> - int result;
>> -#define CHECK(exp) \
>> - if (exp) { puts (#exp); result = 1; }
>> + result = run_test (lower, regs);
>> + result |= run_test (upper, &regs[2]);
>> + if (! result)
>> + {
>> + CHECK (regs[0].start[0] != regs[2].start[0]);
>> + CHECK (regs[0].end[0] != regs[2].end[0]);
>> + CHECK (regs[1].start[0] != regs[3].start[0]);
>> + CHECK (regs[1].end[0] != regs[3].end[0]);
>> + }
>> +
>> + printf ("INFO: Checking C.UTF-8.\n");
>> + setlocale (LC_ALL, "C.UTF-8");
>> +
>> + (void) re_set_syntax (RE_SYNTAX_GNU_AWK);
>>
>> result = run_test (lower, regs);
>> result |= run_test (upper, &regs[2]);
>
> The second-to-last line overwrites the previous test results.
>
> I think this can go in if you address those nits.
Fixed. I'll use |= for all of them and init to zero.
I'll post a v10. Only 2/2 needs a Reviewed-by.
Thanks for your review.
--
Cheers,
Carlos.