bug-gnulib@gnu.org mirror (unofficial)
From: Florian Weimer <fweimer@redhat.com>
To: Bruno Haible <bruno@clisp.org>
Cc: libc-alpha@sourceware.org, Paul Eggert <eggert@cs.ucla.edu>,
	bug-gnulib@gnu.org,
	Adhemerval Zanella <adhemerval.zanella@linaro.org>
Subject: Re: [PATCH 1/2] posix: User scratch_buffer on fnmatch
Date: Thu, 14 Jan 2021 11:00:35 +0100
Message-ID: <87bldrsurg.fsf@oldenburg2.str.redhat.com>
In-Reply-To: <6269852.OAlbKWrXbI@omega> (Bruno Haible's message of "Thu, 14 Jan 2021 00:36:31 +0100")

* Bruno Haible:

> Paul Eggert asked:
>> > By the way, how important is it to support awful encodings like
>> > shift-JIS that contain bytes that look like '\'? If we don't have to 
>> > support these encodings any more, things get a bit easier.
>
> Here we are talking about locale encodings, and Shift_JIS (as well as
> SHIFT_JISX0213) are not usable as a locale encoding in glibc. See e.g.
> [1], [2].
>
> That's the reason why no Shift_JIS locale is listed in
> glibc/localedata/SUPPORTED. [3]
>
> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=3140
> [2] https://sourceware.org/legacy-ml/libc-alpha/2000-10/msg00311.html
> [3] https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/SUPPORTED
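
To make the concern in Paul Eggert's question concrete, here is a minimal sketch of how a Shift-JIS trail byte can look like '\' to code that scans byte by byte. It is only an illustration, not glibc's or gnulib's fnmatch logic: the helper names are hypothetical, and it relies on Python's "shift_jis" codec rather than on a glibc locale.

```python
# Illustration only: why byte-wise scanning for '\' is unsafe in Shift-JIS.
# The helper names are hypothetical; nothing here comes from glibc or gnulib.

BACKSLASH = 0x5C  # byte value of ASCII '\'


def naive_backslash_offsets(data: bytes) -> list:
    """Return the offset of every 0x5C byte, ignoring character boundaries."""
    return [i for i, b in enumerate(data) if b == BACKSLASH]


def decoded_backslash_count(data: bytes, encoding: str) -> int:
    """Count real backslash characters after decoding with the given codec."""
    return data.decode(encoding).count("\\")


text = "ソ"  # KATAKANA LETTER SO (U+30BD); it contains no backslash
sjis = text.encode("shift_jis")  # two bytes, the second of which is 0x5C
utf8 = text.encode("utf-8")      # three bytes, none of which is 0x5C

print(sjis.hex(), naive_backslash_offsets(sjis))
print(decoded_backslash_count(sjis, "shift_jis"))
print(utf8.hex(), naive_backslash_offsets(utf8))
```

A byte-oriented matcher would treat the second byte of the Shift-JIS form as an escape character, while a matcher that first converts to characters would not. The UTF-8 form has no such ambiguity, because its multibyte sequences never contain bytes in the ASCII range.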

We used to have a fully supported product based on the original
Shift-JIS.  It did not require glibc changes (we package both localedef
and the locale sources, so it's easy to build custom locales), but other
GNU components had to be patched.

> Florian Weimer wrote:
>> There is a Shift-JIS variant which is ASCII-transparent (Windows-31J,
>> it's also specified by WhatWG/HTML5), so from a glibc point of view, it
>> would be just an ordinary charset like any other.
>> 
>> But feedback we have received is that the users who want Shift-JIS
>> really want the original thing.
>> 
>> We do not presently support either variant downstream, but one potential
>> way forward would be to turn Windows-31J into a fully supported glibc
>> charset with a corresponding ja_JP locale (which would imply downstream
>> support as well), and just hope that it displaces the original Shift-JIS
>> in the future.
>
> I don't think there's a real need for that. In the years 1995 ... 2005
> there was a lot of resistance against Unicode in Japan, because
> Unicode maps several slightly differently looking glyph images to the
> same glyph/character (even for Western encodings, for example the
> Polish accents look a bit different than the French ones), and - at
> the time - Unicode did not have means to disambiguate these, thus
> people complained about "characters are rendered incorrectly if you
> use Unicode". This has been resolved for more than 10 years already.

We saw commercial demand for Shift-JIS much later than that.  I think an
official Windows-31J-based ja_JP would still be welcomed at this point.

A Windows-31J locale could be added to localedata/SUPPORTED.  We have
not done that yet because someone wanted to look into alignment between
Windows, HTML/WhatWG and what we currently have in the source tree, but
that hasn't happened yet, unfortunately.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Thread overview: 15+ messages
2021-01-04 20:25 [PATCH 1/2] posix: User scratch_buffer on fnmatch Adhemerval Zanella
2021-01-04 20:25 ` [PATCH 2/2] posix: Remove alloca usage for internal fnmatch implementation Adhemerval Zanella
2021-03-08 12:59   ` Florian Weimer
2021-10-20 15:12     ` Adhemerval Zanella
2021-10-21  9:54       ` Florian Weimer
2021-01-04 20:35 ` [PATCH 1/2] posix: User scratch_buffer on fnmatch Florian Weimer
2021-01-05 13:07   ` Adhemerval Zanella
2021-01-13 19:25     ` Paul Eggert
2021-01-13 19:39       ` Florian Weimer
2021-01-13 23:36         ` Bruno Haible
2021-01-14 10:00           ` Florian Weimer [this message]
2021-03-06 17:18             ` Paul Eggert
2021-03-06 20:17               ` dealing with non-ASCII-safe encodings Bruno Haible
2021-01-14 11:44       ` [PATCH 1/2] posix: User scratch_buffer on fnmatch Adhemerval Zanella
2021-01-15  6:56         ` Paul Eggert
