From: Carlos O'Donell
Newsgroups: gmane.comp.lib.glibc.alpha
Subject: Re: [Patch v3 6/14] [BZ #14095] update collation data from Unicode / ISO 14651
Date: Fri, 23 Feb 2018 21:59:44 -0800
To: Mike FABIAN, libc-alpha@sourceware.org
Cc: "Dmitry V. Levin"

On 02/23/2018 02:21 AM, Mike FABIAN wrote:
> From 759aedd5ec485d9f792022e2432262ebaf4f74d8 Mon Sep 17 00:00:00 2001
> From: Mike FABIAN
> Date: Wed, 31 Jan 2018 06:18:47 +0100
> Subject: [PATCH 06/14] iso14651_t1_common: make the fourth level the codepoint
>  for characters which are ignorable on all 4 levels
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Entries for characters which have “IGNORE” on all 4 levels like:
>
> <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
>
> are changed into:
>
> <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)
>
> i.e. putting the code point of the character into the fourth level
> instead of “IGNORE”. Without that change, all such characters
> would compare equal which would make a wcscoll test case fail.
> It is better to have a clearly defined sort order even for characters
> like this so it is good to use the code point as a tie-break.
>
>     * localedata/locales/iso14651_t1_common: Use the code point of a character
>     in the fourth collation level instead of IGNORE for all entries which
>     have IGNORE on all 4 levels.
> ---
>  localedata/locales/iso14651_t1_common | 914 +++++++++++++++++-----------------
>  1 file changed, 457 insertions(+), 457 deletions(-)

LGTM.

I agree completely, the code point should be a tie-break, and I'm working
the same thing into the C.UTF-8 locale. I'll get back to that after this
work and hopefully you can review that work for me :-)

Reviewed-by: Carlos O'Donell

-- 
Cheers,
Carlos.
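
For illustration only (not part of the patch or of glibc): a minimal Python
sketch of why the fourth-level code point acts as a tie-break. It assumes a
simplified model in which each character has one weight per level, IGNORE
means "contributes nothing at that level", and keys are compared level by
level. The names IGNORE, sort_key, compare, BEFORE and AFTER are hypothetical
and exist only for this example; glibc's real collation code is considerably
more involved.

```python
# Simplified model of 4-level collation (an assumption, not glibc's algorithm).
IGNORE = None  # hypothetical marker for an ignorable weight


def sort_key(text, table):
    """Collect the non-ignored weights of each level, level 1 first."""
    key = []
    for level in range(4):
        weights = [table[ch][level] for ch in text
                   if table[ch][level] is not IGNORE]
        key.append(tuple(weights))
    return tuple(key)


def compare(a, b, table):
    """Return -1, 0 or 1, in the spirit of wcscoll's sign convention."""
    ka, kb = sort_key(a, table), sort_key(b, table)
    return (ka > kb) - (ka < kb)


# Before the change: both control characters are IGNORE on all 4 levels.
BEFORE = {
    "\u0001": (IGNORE, IGNORE, IGNORE, IGNORE),  # START OF HEADING
    "\u0002": (IGNORE, IGNORE, IGNORE, IGNORE),  # START OF TEXT
}

# After the change: the fourth level carries the character's own code point.
AFTER = {ch: (IGNORE, IGNORE, IGNORE, ord(ch)) for ch in BEFORE}

print(compare("\u0001", "\u0002", BEFORE))  # keys are identical
print(compare("\u0001", "\u0002", AFTER))   # ordered by code point
```

In this model the two characters are indistinguishable under BEFORE, while
under AFTER they differ only at the last level, so the code point decides the
order without affecting any comparison that is already settled at levels 1-3.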