Re: [Patch v3 6/14] [BZ #14095] update collation data from Unicode / ISO 14651

unofficial mirror of libc-alpha@sourceware.org

From: Carlos O'Donell <carlos@redhat.com>
To: Mike FABIAN <mfabian@redhat.com>, libc-alpha@sourceware.org
Cc: "Dmitry V. Levin" <ldv@altlinux.org>
Subject: Re: [Patch v3 6/14] [BZ #14095] update collation data from Unicode / ISO 14651
Date: Fri, 23 Feb 2018 21:59:44 -0800
Message-ID: <ba27055d-d58a-6f29-d9ed-00818f7b9541@redhat.com>
In-Reply-To: <s9dvaeoc8gt.fsf@taka.site>

On 02/23/2018 02:21 AM, Mike FABIAN wrote:
> From 759aedd5ec485d9f792022e2432262ebaf4f74d8 Mon Sep 17 00:00:00 2001
> From: Mike FABIAN <mfabian@redhat.com>
> Date: Wed, 31 Jan 2018 06:18:47 +0100
> Subject: [PATCH 06/14] iso14651_t1_common: make the fourth level the codepoint
>  for characters which are ignorable on all 4 levels
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Entries for characters which have “IGNORE” on all 4 levels like:
> 
>  <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
> 
> are changed into:
> 
>  <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)
> 
> i.e. putting the code point of the character into the fourth level
> instead of “IGNORE”. Without that change, all such characters
> would compare equal which would make a wcscoll test case fail.
> It is better to have a clearly defined sort order even for characters
> like this so it is good to use the code point as a tie-break.
> 
> 	* localedata/locales/iso14651_t1_common: Use the code point of a character
> 	in the fourth collation level instead of IGNORE for all entries which
> 	have IGNORE on all 4 levels.
> ---
>  localedata/locales/iso14651_t1_common | 914 +++++++++++++++++-----------------
>  1 file changed, 457 insertions(+), 457 deletions(-)

LGTM.

I agree completely: the code point should be a tie-break, and I'm working the
same thing into the C.UTF-8 locale. I'll get back to that after this work, and
hopefully you can review that work for me :-)
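
To illustrate why the tie-break matters, here is a minimal Python sketch. It
is a toy model of four-level collation weights, not glibc's collation code:
the names IGNORE, sort_key and compare are hypothetical, and using ord() as
the fourth-level weight is an assumption standing in for the <Uxxxx> symbol
used in iso14651_t1_common.

```python
# Toy model only: each character has one weight per collation level.
IGNORE = None  # hypothetical stand-in for the IGNORE keyword in the locale source


def sort_key(weights):
    """Build a comparable key from per-level weights, skipping IGNORE levels."""
    # An ignored level contributes an empty tuple, so it never decides a comparison.
    return tuple(() if w is IGNORE else (w,) for w in weights)


def compare(table, a, b):
    """Return -1, 0 or 1, in the style of wcscoll, for two single characters."""
    ka, kb = sort_key(table[a]), sort_key(table[b])
    return (ka > kb) - (ka < kb)


# Before the patch: both characters are IGNORE on all four levels.
before = {
    "\u0001": (IGNORE, IGNORE, IGNORE, IGNORE),
    "\u0002": (IGNORE, IGNORE, IGNORE, IGNORE),
}

# After the patch: the fourth level carries the code point (assumed: ord()).
after = {ch: (IGNORE, IGNORE, IGNORE, ord(ch)) for ch in before}

print(compare(before, "\u0001", "\u0002"))  # 0: the two characters tie
print(compare(after, "\u0001", "\u0002"))   # -1: the code point breaks the tie
```

In this model the first three levels stay ignored, so the change only affects
strings that would otherwise compare equal.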

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

-- 
Cheers,
Carlos.
