From: Bruno Haible
To: Paul Eggert
Cc: bug-gnulib@gnu.org
Subject: Re: LC_COLLATE in the C locale
Date: Wed, 18 Dec 2019 11:29:46 +0100
Message-ID: <8726723.BRRUbPPWXg@omega>
In-Reply-To: <09a43701-a998-5c26-ea9e-51c8c3446084@cs.ucla.edu>
References: <175192568.e2XXTFFdkW@omega> <09a43701-a998-5c26-ea9e-51c8c3446084@cs.ucla.edu>

Hi Paul,

> I do have a qualm in that coreutils (and I assume others) interpret !hard_locale
> (LC_COLLATE) as meaning that the locale is unibyte and uses native byte
> comparison.

Isn't this warranted by section "LC_COLLATE Category in the POSIX Locale" in
the POSIX specification?

> As I recall on some platforms (macOS maybe?), the C locale uses
> UTF-8 so this interpretation isn't correct.

UTF-8 has the nice property that byte-per-byte comparison and
codepoint-per-codepoint comparison are equivalent. If the encoding was not
UTF-8, but e.g. GB18030, I would agree that there is a problem.
But there is no C locale with GB18030 encoding on any platform.

Bruno
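
The UTF-8 ordering property mentioned above can be checked with a small
sketch. It is only an illustration, not code from gnulib or coreutils: the
helper names next_codepoint, cmp_codepoints and cmp_bytes are hypothetical,
and the decoder assumes well-formed UTF-8 input with no error handling. The
property holds because larger code points get larger lead bytes, and the
remaining bits are stored most-significant first in the continuation bytes.

```c
#include <stdint.h>
#include <string.h>

/* Decode one code point from well-formed UTF-8 at *p and advance *p.
   Assumption: the input is valid UTF-8; no error checking is done. */
static uint32_t next_codepoint(const unsigned char **p)
{
    const unsigned char *s = *p;
    uint32_t cp;
    int extra; /* number of continuation bytes that follow the lead byte */

    if (s[0] < 0x80)      { cp = s[0];        extra = 0; }
    else if (s[0] < 0xE0) { cp = s[0] & 0x1F; extra = 1; }
    else if (s[0] < 0xF0) { cp = s[0] & 0x0F; extra = 2; }
    else                  { cp = s[0] & 0x07; extra = 3; }

    for (int i = 1; i <= extra; i++)
        cp = (cp << 6) | (s[i] & 0x3F);

    *p = s + extra + 1;
    return cp;
}

/* Compare two valid UTF-8 strings code point by code point.
   Returns -1, 0 or 1. */
static int cmp_codepoints(const char *a, const char *b)
{
    const unsigned char *pa = (const unsigned char *) a;
    const unsigned char *pb = (const unsigned char *) b;

    while (*pa != '\0' && *pb != '\0') {
        uint32_t ca = next_codepoint(&pa);
        uint32_t cb = next_codepoint(&pb);
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    /* The string that ends first sorts first. */
    return (*pa != '\0') - (*pb != '\0');
}

/* Byte-wise comparison (strcmp compares as unsigned char),
   normalized to -1, 0 or 1. */
static int cmp_bytes(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

For any two valid UTF-8 strings a and b, cmp_bytes(a, b) and
cmp_codepoints(a, b) should return the same value, for example for
"z" (U+007A) versus "\xC3\xA9" (U+00E9), or "\xEF\xBF\xBD" (U+FFFD) versus
"\xF0\x90\x80\x80" (U+10000).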