From: Carlos O'Donell via Libc-alpha
To: libc-alpha@sourceware.org
Subject: [PATCH v4 0/3] C.UTF-8
Date: Thu, 29 Jul 2021 02:35:12 -0400
Message-Id: <20210729063515.1541388-1-carlos@redhat.com>

The following changes implement a minimally sized C.UTF-8.

First we implement the 'strcmp_collation' directive. Then we implement
C.UTF-8 with an LC_COLLATE that uses the 'strcmp_collation' directive
to support using strcmp for collation, i.e. code point sorting.

The final C.UTF-8 is only ~396KiB, of which the largest part, ~346KiB,
is LC_CTYPE covering all of Unicode.

This v4 fixes the regressions detected in Fedora Rawhide here:
https://bugzilla.redhat.com/show_bug.cgi?id=1986421

Additional testing coverage is provided for fnmatch, regcomp, and
regexec (which would have caught the regression).

Carlos O'Donell (3):
  Add support for locales with zero collation rules.
  Add 'strcmp_collation' support for LC_COLLATE.
  Add generic C.UTF-8 locale (Bug 17318)

 iconv/Makefile                   |  22 +-
 iconv/tst-iconv9.c               |  87 +++++
 locale/programs/ld-collate.c     |  24 +-
 locale/programs/locfile-kw.gperf |   1 +
 locale/programs/locfile-kw.h     | 306 ++++++++---------
 locale/programs/locfile-token.h  |   1 +
 localedata/C.UTF-8.in            | 157 +++++++++
 localedata/Makefile              |   2 +
 localedata/SUPPORTED             |   1 +
 localedata/locales/C             | 194 +++++++++++
 posix/bug-regex1.c               |  20 ++
 posix/bug-regex19.c              |  22 +-
 posix/bug-regex4.c               |  25 ++
 posix/bug-regex6.c               |   2 +-
 posix/fnmatch_loop.c             |  95 ++++--
 posix/regcomp.c                  |  12 +-
 posix/regexec.c                  |  85 +++--
 posix/transbug.c                 |  22 +-
 posix/tst-fnmatch.input          | 549 ++++++++++++++++++++++++++++++-
 posix/tst-regcomp-truncated.c    |   1 +
 posix/tst-regex.c                |  25 +-
 21 files changed, 1385 insertions(+), 268 deletions(-)
 create mode 100644 iconv/tst-iconv9.c
 create mode 100644 localedata/C.UTF-8.in
 create mode 100644 localedata/locales/C

-- 
2.31.1
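
Illustration only, not code from this series: the reason strcmp can
stand in for collation here is that UTF-8 preserves code point order
under unsigned byte comparison, and strcmp compares bytes as unsigned
char. A minimal sketch of that property follows. The helper names
compare_bytes and sort_code_point_order are hypothetical, and the
sketch does not touch the locale machinery or strcoll at all.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: compare two strings with plain strcmp, which
   compares bytes as unsigned char.  */
static int
compare_bytes (const void *a, const void *b)
{
  const char *const *lhs = a;
  const char *const *rhs = b;
  return strcmp (*lhs, *rhs);
}

/* Hypothetical helper (an assumption for this sketch, not part of the
   patch series): sort an array of valid UTF-8 strings in place using
   strcmp only.  For valid UTF-8 the result is code point order.  */
void
sort_code_point_order (const char **words, size_t count)
{
  assert (words != NULL);
  qsort (words, count, sizeof words[0], compare_bytes);
}
```

Called on the UTF-8 encodings of U+00E9, U+007A, U+20AC, U+0041 and
U+1F600, this leaves the array ordered as U+0041, U+007A, U+00E9,
U+20AC, U+1F600, since their lead bytes are 0x41, 0x7a, 0xc3, 0xe2
and 0xf0 respectively.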