From: Carlos O'Donell
Newsgroups: gmane.comp.lib.glibc.alpha
Subject: Re: [Patch v4 6/14] [BZ #14095] update collation data from Unicode / ISO 14651
Date: Mon, 26 Feb 2018 10:14:15 -0800
To: Mike FABIAN, libc-alpha@sourceware.org
Cc: "Dmitry V. Levin"

On 02/26/2018 07:08 AM, Mike FABIAN wrote:
> From b517dae2da9fa61acd31053d3bf150141f20611e Mon Sep 17 00:00:00 2001
> From: Mike FABIAN
> Date: Wed, 31 Jan 2018 06:18:47 +0100
> Subject: [PATCH 06/14] iso14651_t1_common: make the fourth level the codepoint
>  for characters which are ignorable on all 4 levels
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Entries for characters which have “IGNORE” on all 4 levels like:
>
> <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
>
> are changed into:
>
> <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)
>
> i.e. putting the code point of the character into the fourth level
> instead of “IGNORE”. Without that change, all such characters
> would compare equal which would make a wcscoll test case fail.
> It is better to have a clearly defined sort order even for characters
> like this so it is good to use the code point as a tie-break.
>
> * localedata/locales/iso14651_t1_common: Use the code point of a character
> in the fourth collation level instead of IGNORE for all entries which
> have IGNORE on all 4 levels.

LGTM.

Reviewed-by: Carlos O'Donell

-- 
Cheers,
Carlos.
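
For readers following the thread, the effect of the fourth-level tie-break described in the quoted patch can be illustrated with a toy model. This is only a sketch under stated assumptions: the weight table, the `make_table`, `sort_key`, and `compare` helpers, and the level-by-level key construction are all hypothetical and are not glibc's collation implementation or data.

```python
# Toy model of a four-level collation table.
# Assumption: this is NOT glibc's algorithm or data, only an illustration.
IGNORE = None


def make_table(tie_break):
    """Build weights for two letters and two control characters.

    With tie_break=False the control characters are IGNORE on all four
    levels; with tie_break=True their fourth level holds the code point.
    """
    table = {
        "a": (1, 1, 1, IGNORE),
        "b": (2, 1, 1, IGNORE),
    }
    for ch in ("\x01", "\x02"):  # START OF HEADING, START OF TEXT
        fourth = ord(ch) if tie_break else IGNORE
        table[ch] = (IGNORE, IGNORE, IGNORE, fourth)
    return table


def sort_key(text, table):
    """Collect the non-ignored weights of each level, level by level."""
    return tuple(
        tuple(table[ch][level] for ch in text if table[ch][level] is not IGNORE)
        for level in range(4)
    )


def compare(a, b, table):
    """Return -1, 0, or 1, in the style of a strcoll-like comparison."""
    ka, kb = sort_key(a, table), sort_key(b, table)
    return (ka > kb) - (ka < kb)
```

In this model, `compare("\x01", "\x02", make_table(False))` reports the two control characters as equal, while the same call with `make_table(True)` orders them by code point; the ordering of "a" and "b" is unaffected either way.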