unofficial mirror of libc-alpha@sourceware.org
From: Carlos O'Donell via Libc-alpha <libc-alpha@sourceware.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v9 2/2] Add generic C.UTF-8 locale (Bug 17318)
Date: Sun, 5 Sep 2021 23:41:17 -0400
Message-ID: <837d13d5-fccd-0dfe-759f-910cf9a01f5d@redhat.com>
In-Reply-To: <87mtov81g2.fsf@oldenburg.str.redhat.com>

On 9/2/21 11:03 AM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> diff --git a/NEWS b/NEWS
>> index 79c895e382..807105a596 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -9,7 +9,15 @@ Version 2.35
>>  
>>  Major new features:
>>  
>> -  [Add new features here]
>> +* Support for the C.UTF-8 locale has been added to glibc.  The locale
>> +  supports full code-point sorting for all valid Unicode code points.
>> +  A limitation in the framework for fnmatch, regexec, and regcomp requires
>> +  a compromise to save space and only ASCII-based range expressions are
>> +  supported for now (see bug 28255).  The full size of the locale is only
>> +  ~400KiB, with 346KiB coming from LC_CTYPE information for Unicode. This
>> +  locale harmonizes downstream C.UTF-8 already shipping in Gentoo, Debian,
>> +  Ubuntu, Fedora, CentOS Stream, and RHEL.  The locale is not built into
>> +  glibc, and must be installed.
> 
> I would say “various downstream distributions”.  You left out SUSE's
> distributions, and they have C.UTF-8 as well:
> 
> <https://build.opensuse.org/package/view_file/openSUSE:Factory/glibc/glibc-c-utf8-locale.patch?expand=1>

I double-checked that implementation and it's a copy of Mike Fabian's
original that we put into Fedora/RHEL, so we are already harmonized with
it, which is good.

I've adjusted the text following your recommendation, though; it's clearer.

>> --- /dev/null
>> +++ b/iconv/tst-iconv9.c
> 
>> +  /* From ISO-8859-1 to ASCII. */
> 
>> +  /* From UTF-8 to ASCII. */
> 
> Missing spaces after “.”.

Fixed.

>> diff --git a/posix/transbug.c b/posix/transbug.c
>> index d0983b4d44..71632b7976 100644
>> --- a/posix/transbug.c
>> +++ b/posix/transbug.c
>> @@ -116,14 +116,30 @@ do_test (void)
>>    static const char lower[] = "[[:lower:]]+";
>>    static const char upper[] = "[[:upper:]]+";
>>    struct re_registers regs[4];
>> +  int result;
>>  
>> +#define CHECK(exp) \
>> +  if (exp) { puts (#exp); result = 1; }
>> +
>> +  printf ("INFO: Checking C.\n");
>>    setlocale (LC_ALL, "C");
>>  
>>    (void) re_set_syntax (RE_SYNTAX_GNU_AWK);
>>  
>> -  int result;
>> -#define CHECK(exp) \
>> -  if (exp) { puts (#exp); result = 1; }
>> +  result = run_test (lower, regs);
>> +  result |= run_test (upper, &regs[2]);
>> +  if (! result)
>> +    {
>> +      CHECK (regs[0].start[0] != regs[2].start[0]);
>> +      CHECK (regs[0].end[0] != regs[2].end[0]);
>> +      CHECK (regs[1].start[0] != regs[3].start[0]);
>> +      CHECK (regs[1].end[0] != regs[3].end[0]);
>> +    }
>> +
>> +  printf ("INFO: Checking C.UTF-8.\n");
>> +  setlocale (LC_ALL, "C.UTF-8");
>> +
>> +  (void) re_set_syntax (RE_SYNTAX_GNU_AWK);
>>  
>>    result = run_test (lower, regs);
>>    result |= run_test (upper, &regs[2]);
> 
> The second-to-last line overwrites the previous test results.
> 
> I think this can go in if you address those nits.

Fixed. I'll use |= for all of the run_test results and initialize result
to zero.
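
For illustration only, here is a minimal sketch of the difference between
overwriting and accumulating the result. It assumes a hypothetical
run_phase helper that stands in for one locale's worth of checks; it is
not the run_test code from posix/transbug.c, and the function names are
made up for this example.

```c
#include <stdio.h>

/* Hypothetical stand-in for the checks run under one locale.  Returns 0
   on success and 1 on failure.  This is not run_test from transbug.c.  */
static int
run_phase (const char *name, int fail)
{
  printf ("INFO: Checking %s.\n", name);
  return fail;
}

/* Overwriting pattern: the second plain assignment discards whatever the
   first phase reported, so a failure in the first phase is lost.  */
static int
check_overwriting (int first_fails, int second_fails)
{
  int result;
  result = run_phase ("C", first_fails);
  result = run_phase ("C.UTF-8", second_fails);
  return result;
}

/* Accumulating pattern: start from zero and OR in every phase, so a
   failure in any phase survives to the final return value.  */
static int
check_accumulating (int first_fails, int second_fails)
{
  int result = 0;
  result |= run_phase ("C", first_fails);
  result |= run_phase ("C.UTF-8", second_fails);
  return result;
}
```

With a failing first phase and a passing second phase, the overwriting
variant reports success while the accumulating variant reports failure.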

I'll post a v10. Only 2/2 needs a Reviewed-by.

Thanks for your review.

-- 
Cheers,
Carlos.



Thread overview: 5+ messages
2021-09-02  2:05 [PATCH v9 0/2] C.UTF-8 Carlos O'Donell via Libc-alpha
2021-09-02  2:05 ` [PATCH v9 1/2] Add 'codepoint_collation' support for LC_COLLATE Carlos O'Donell via Libc-alpha
2021-09-02  2:05 ` [PATCH v9 2/2] Add generic C.UTF-8 locale (Bug 17318) Carlos O'Donell via Libc-alpha
2021-09-02 15:03   ` Florian Weimer via Libc-alpha
2021-09-06  3:41     ` Carlos O'Donell via Libc-alpha [this message]