From: Florian Weimer via Libc-alpha
Reply-To: Florian Weimer
To: Carlos O'Donell
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v9 2/2] Add generic C.UTF-8 locale (Bug 17318)
Date: Thu, 02 Sep 2021 17:03:09 +0200
Message-ID: <87mtov81g2.fsf@oldenburg.str.redhat.com>
References: <20210902020546.90935-1-carlos@redhat.com> <20210902020546.90935-3-carlos@redhat.com>

* Carlos O'Donell:

> diff --git a/NEWS b/NEWS
> index 79c895e382..807105a596 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,7 +9,15 @@ Version 2.35
>
>  Major new features:
>
> -  [Add new features here]
> +* Support for the C.UTF-8 locale has been added to glibc. The locale
> +  supports full code-point sorting for all valid Unicode code points.
> +  A limitation in the framework for fnmatch, regexec, and regcomp requires
> +  a compromise to save space and only ASCII-based range expressions are
> +  supported for now (see bug 28255). The full size of the locale is only
> +  ~400KiB, with 346KiB coming from LC_CTYPE information for Unicode. This
> +  locale harmonizes downstream C.UTF-8 already shipping in Gentoo, Debian,
> +  Ubuntu, Fedora, CentOS Stream, and RHEL. The locale is not built into
> +  glibc, and must be installed.

I would say “various downstream distributions”. You left out SUSE's
distributions, and they have C.UTF-8 as well:

> --- /dev/null
> +++ b/iconv/tst-iconv9.c

> +  /* From ISO-8859-1 to ASCII. */

> +  /* From UTF-8 to ASCII. */

Missing spaces after “.”.

> diff --git a/posix/transbug.c b/posix/transbug.c
> index d0983b4d44..71632b7976 100644
> --- a/posix/transbug.c
> +++ b/posix/transbug.c
> @@ -116,14 +116,30 @@ do_test (void)
>    static const char lower[] = "[[:lower:]]+";
>    static const char upper[] = "[[:upper:]]+";
>    struct re_registers regs[4];
> +  int result;
>
> +#define CHECK(exp) \
> +  if (exp) { puts (#exp); result = 1; }
> +
> +  printf ("INFO: Checking C.\n");
>    setlocale (LC_ALL, "C");
>
>    (void) re_set_syntax (RE_SYNTAX_GNU_AWK);
>
> -  int result;
> -#define CHECK(exp) \
> -  if (exp) { puts (#exp); result = 1; }
> +  result = run_test (lower, regs);
> +  result |= run_test (upper, &regs[2]);
> +  if (! result)
> +    {
> +      CHECK (regs[0].start[0] != regs[2].start[0]);
> +      CHECK (regs[0].end[0] != regs[2].end[0]);
> +      CHECK (regs[1].start[0] != regs[3].start[0]);
> +      CHECK (regs[1].end[0] != regs[3].end[0]);
> +    }
> +
> +  printf ("INFO: Checking C.UTF-8.\n");
> +  setlocale (LC_ALL, "C.UTF-8");
> +
> +  (void) re_set_syntax (RE_SYNTAX_GNU_AWK);
>
>    result = run_test (lower, regs);
>    result |= run_test (upper, &regs[2]);

The second-to-last line overwrites the previous test results.

I think this can go in if you address those nits.

Thanks,
Florian
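
For illustration only, here is a minimal sketch of the overwritten-result nit in the posix/transbug.c hunk. It is not taken from the patch. fake_run_test, check_overwriting, and check_accumulating are hypothetical names made up for this example. The stand-in simply returns the failure flag it is given, where the real run_test would report a regex failure.

```c
#include <assert.h>

/* Hypothetical stand-in for run_test (): returns nonzero on failure.
   The argument must be a 0/1 flag.  */
static int
fake_run_test (int fail)
{
  assert (fail == 0 || fail == 1);
  return fail;
}

/* Pattern from the quoted hunk: the second phase starts with a plain
   assignment, so a failure recorded by the first phase is lost.  */
static int
check_overwriting (int c_fails, int utf8_fails)
{
  int result;

  result = fake_run_test (c_fails);     /* "C" phase.  */
  result = fake_run_test (utf8_fails);  /* "C.UTF-8" phase overwrites it.  */
  return result;
}

/* One possible fix: start from zero and OR every phase in, so that
   any earlier failure survives to the end of the test.  */
static int
check_accumulating (int c_fails, int utf8_fails)
{
  int result = 0;

  result |= fake_run_test (c_fails);
  result |= fake_run_test (utf8_fails);
  return result;
}
```

With a failure only in the first phase, check_overwriting (1, 0) returns 0 and the failure goes unreported, while check_accumulating (1, 0) returns 1. Whether the actual patch should use |= or some other way of preserving the first result is for the patch author to decide.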