References: <20210906154336.610973-1-carlos@redhat.com> <20210906154336.610973-2-carlos@redhat.com>
To: "Carlos O'Donell"
Subject: Re: [PATCH v12 1/2] Add 'codepoint_collation' support for LC_COLLATE.
In-reply-to: <20210906154336.610973-2-carlos@redhat.com>
Date: Mon, 06 Sep 2021 14:20:57 -0300
Message-ID: <878s097h8m.fsf@linux.ibm.com>
List-Id: Libc-alpha mailing list
From: Matheus Castanho via Libc-alpha
Reply-To: Matheus Castanho
Cc: Florian Weimer, libc-alpha@sourceware.org

Carlos O'Donell via Libc-alpha writes:

> Support a new directive 'codepoint_collation' in the LC_COLLATE
> section of a locale source file. This new directive causes all
> collation rules to be dropped and instead STRCMP (strcmp or
> wcscmp) is used for collation of the input character set. This
> is required to allow for a C.UTF-8 that contains zero collation
> rules (minimal size) and sorts using code point sorting.
> [...]
> diff --git a/locale/C-collate-seq.c b/locale/C-collate-seq.c
> new file mode 100644
> index 0000000000..4fb82cb835
> --- /dev/null
> +++ b/locale/C-collate-seq.c
> @@ -0,0 +1,100 @@
> +/* Copyright (C) 1995-2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include
> +
> +static const char collseqmb[] =
> +{
> +  '\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07',
> +  '\x08', '\x09', '\x0a', '\x0b', '\x0c', '\x0d', '\x0e', '\x0f',
> +  '\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17',
> +  '\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f',
> +  '\x20', '\x21', '\x22', '\x23', '\x24', '\x25', '\x26', '\x27',
> +  '\x28', '\x29', '\x2a', '\x2b', '\x2c', '\x2d', '\x2e', '\x2f',
> +  '\x30', '\x31', '\x32', '\x33', '\x34', '\x35', '\x36', '\x37',
> +  '\x38', '\x39', '\x3a', '\x3b', '\x3c', '\x3d', '\x3e', '\x3f',
> +  '\x40', '\x41', '\x42', '\x43', '\x44', '\x45', '\x46', '\x47',
> +  '\x48', '\x49', '\x4a', '\x4b', '\x4c', '\x4d', '\x4e', '\x4f',
> +  '\x50', '\x51', '\x52', '\x53', '\x54', '\x55', '\x56', '\x57',
> +  '\x58', '\x59', '\x5a', '\x5b', '\x5c', '\x5d', '\x5e', '\x5f',
> +  '\x60', '\x61', '\x62', '\x63', '\x64', '\x65', '\x66', '\x67',
> +  '\x68', '\x69', '\x6a', '\x6b', '\x6c', '\x6d', '\x6e', '\x6f',
> +  '\x70', '\x71', '\x72', '\x73', '\x74', '\x75', '\x76', '\x77',
> +  '\x78', '\x79', '\x7a', '\x7b', '\x7c', '\x7d', '\x7e', '\x7f',
> +  '\x80', '\x81', '\x82', '\x83', '\x84', '\x85', '\x86', '\x87',
> +  '\x88', '\x89', '\x8a', '\x8b', '\x8c', '\x8d', '\x8e', '\x8f',
> +  '\x90', '\x91', '\x92', '\x93', '\x94', '\x95', '\x96', '\x97',
> +  '\x98', '\x99', '\x9a', '\x9b', '\x9c', '\x9d', '\x9e', '\x9f',
> +  '\xa0', '\xa1', '\xa2', '\xa3', '\xa4', '\xa5', '\xa6', '\xa7',
> +  '\xa8', '\xa9', '\xaa', '\xab', '\xac', '\xad', '\xae', '\xaf',
> +  '\xb0', '\xb1', '\xb2', '\xb3', '\xb4', '\xb5', '\xb6', '\xb7',
> +  '\xb8', '\xb9', '\xba', '\xbb', '\xbc', '\xbd', '\xbe', '\xbf',
> +  '\xc0', '\xc1', '\xc2', '\xc3', '\xc4', '\xc5', '\xc6', '\xc7',
> +  '\xc8', '\xc9', '\xca', '\xcb', '\xcc', '\xcd', '\xce', '\xcf',
> +  '\xd0', '\xd1', '\xd2', '\xd3', '\xd4', '\xd5', '\xd6', '\xd7',
> +  '\xd8', '\xd9', '\xda', '\xdb', '\xdc', '\xdd', '\xde', '\xdf',
> +  '\xe0', '\xe1', '\xe2', '\xe3', '\xe4', '\xe5', '\xe6', '\xe7',
> +  '\xe8', '\xe9', '\xea', '\xeb', '\xec', '\xed', '\xee', '\xef',
> +  '\xf0', '\xf1', '\xf2', '\xf3', '\xf4', '\xf5', '\xf6', '\xf7',
> +  '\xf8', '\xf9', '\xfa', '\xfb', '\xfc', '\xfd', '\xfe', '\xff'
> +};
> +
> +/* This table must be 256 bytes in size.  We index bytes into the
> +   table to find the collation sequence.  */
> +_Static_assert (sizeof (collseqmb) == 256);

Hi Carlos,

glibc doesn't build after this patch went in. It looks like the
assert message is missing:

In file included from C-collate.c:22:
C-collate-seq.c:58:42: error: expected ‘,’ before ‘)’ token
 _Static_assert (sizeof (collseqmb) == 256);
                                          ^

-- 
Matheus Castanho
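
For reference, a minimal sketch of the two-argument form that a C11 compiler expects. In C11, _Static_assert requires a string literal as its second argument; the single-argument form is only standard from C23 onward, although some newer compilers accept it earlier as an extension. The names demo_collseq and demo_collseq_size, the 4-entry table, and the message text are all hypothetical stand-ins for illustration and are not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical 4-entry stand-in for the 256-entry collseqmb table.  */
static const char demo_collseq[] = { '\x00', '\x01', '\x02', '\x03' };

/* C11 form: condition plus a mandatory message string.  Dropping the
   message is what triggers "expected ',' before ')' token" on
   compilers that only implement the C11 syntax.  */
_Static_assert (sizeof (demo_collseq) == 4,
                "demo_collseq must have exactly 4 entries");

/* Hypothetical helper so the table size can also be checked at run time.  */
size_t
demo_collseq_size (void)
{
  return sizeof (demo_collseq);
}
```

Applied to the patch, this would presumably mean adding a message string after the `== 256` condition; the exact wording is up to the patch author.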