From: Egor Kobylkin <egor@kobylkin.com>
To: libc-alpha@sourceware.org, libc-locales@sourceware.org,
"Dmitry V. Levin" <ldv@altlinux.org>,
Marko Myllynen <myllynen@redhat.com>,
mfabian@redhat.com
Subject: Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
Date: Thu, 20 Dec 2018 00:02:21 +0100
Message-ID: <e20793e0-2ac1-3b20-3b2e-4d4a9e3ad4ab@kobylkin.com>
In-Reply-To: <749726562.674232.1545259279320@poczta.nazwa.pl>
On 19.12.18 23:41, Rafal Luzynski wrote:
> 8.12.2018 22:51 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> Rafal, Dmitry, Marko, Mike
>>
>> On 08.12.18 00:35, Rafal Luzynski wrote:
>>> 19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> Changelog v10: * Removed ISO 9.1995 GOST 7.79-2000 System A
>>>> (transliteration to Latin with diacritics) as conflicting with
>>>> System B within glibc mechanics and not solving BZ #2872
>>>
>>> I'm in favor of implementing System A and dropping System B instead.
>>
>> The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
>> fails". The ISO 9 System A does not map to ASCII so it is not a solution
>> to BZ #2872 at all.
>
> I did not mean implementing System A and nothing more. I meant implementing
> System A and a fallback for ASCII which can be similar to System B but
> we wouldn't be able to call it "System B" because it would differ in
> few cases.
Just for the record, I have no objection to that (using System A as
the basis for ASCII as well).
But I am no longer sure that inserting a translit table into every
locale is the right solution to the ASCII problem, especially since
distributions may not include any locale other than C.
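To make the System A versus ASCII distinction concrete, here is a minimal Python sketch. The two tables are a small illustrative subset that I typed in from ISO 9 / GOST 7.79-2000, not the table from the patch, and the diacritic-stripping fallback is only one conceivable way to derive ASCII from System A (an assumption of this sketch, not something proposed in this thread).

```python
import unicodedata

# Illustrative subset only; NOT the table from the patch.
SYSTEM_A = {"ж": "ž", "ч": "č", "ш": "š", "щ": "ŝ", "ю": "û", "я": "â"}
SYSTEM_B = {"ж": "zh", "ч": "ch", "ш": "sh", "щ": "shh", "ю": "yu", "я": "ya"}


def translit(text, table):
    """Replace each character found in the table; leave the rest untouched."""
    return "".join(table.get(ch, ch) for ch in text)


def strip_diacritics(text):
    """Hypothetical ASCII fallback: decompose (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for letter in SYSTEM_A:
    a = translit(letter, SYSTEM_A)
    b = translit(letter, SYSTEM_B)
    print(letter, repr(a), a.isascii(), repr(strip_diacritics(a)), repr(b))
```

With this kind of fallback the System A result for "ж" collapses to "z" while System B gives "zh", which is the sort of divergence from System B mentioned in the quoted text.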
>
>> I was scratching my head as to how can we avoid the explosion of the
>> scope for this patch. And then it appeared to me that it was wrong to
>> target all the present locales for the ASCII translit. This seems to be
>> the root cause for this prolonged A vs. B discussions. The proper target
>> for my table is actually the C locale translit file
>> (locale/C-translit.h.in). I will submit a proper patch shortly.
>
> I saw your patch v11 and now I must say I'm sorry for making noise because
> it was me who said that I didn't mind adding Cyrillic -> ASCII
> transliteration to C locale. I said so before taking a look at the current
> contents of transliteration in C locale. When I looked at this I realized
> that it does not support any national characters, even from modified Latin
> alphabets (like used in most of western European languages). It only
> contains mathematical, physical, commercial, diacritical etc. characters.
> So I'm no longer sure it should support Cyrillic -> ASCII. But maybe again
> I'm wrong, maybe it should support but just nobody implemented it yet.
Actually there are quite a few letters already transliterated in
locale/C-translit.h.in. (Note the CAPCAP transliteration style for the
capitals, i.e. LATIN CAPITAL LETTER AE is mapped to AE, not to Ae.)
"\x00c6" "AE" /* <U00C6> LATIN CAPITAL LETTER AE */
"\x00d7" "x" /* <U00D7> MULTIPLICATION SIGN */
"\x00df" "ss" /* <U00DF> LATIN SMALL LETTER SHARP S */
"\x00e6" "ae" /* <U00E6> LATIN SMALL LETTER AE */
"\x0132" "IJ" /* <U0132> LATIN CAPITAL LIGATURE IJ */
"\x0133" "ij" /* <U0133> LATIN SMALL LIGATURE IJ */
"\x0149" "'n" /* <U0149> LATIN SMALL LETTER N PRECEDED BY APOSTROPHE */
"\x0152" "OE" /* <U0152> LATIN CAPITAL LIGATURE OE */
"\x0153" "oe" /* <U0153> LATIN SMALL LIGATURE OE */
"\x017f" "s" /* <U017F> LATIN SMALL LETTER LONG S */
"\x01c7" "LJ" /* <U01C7> LATIN CAPITAL LETTER LJ */
"\x01c8" "Lj" /* <U01C8> LATIN CAPITAL LETTER L WITH SMALL LETTER J */
"\x01c9" "lj" /* <U01C9> LATIN SMALL LETTER LJ */
"\x01ca" "NJ" /* <U01CA> LATIN CAPITAL LETTER NJ */
"\x01cb" "Nj" /* <U01CB> LATIN CAPITAL LETTER N WITH SMALL LETTER J */
"\x01cc" "nj" /* <U01CC> LATIN SMALL LETTER NJ */
"\x01f1" "DZ" /* <U01F1> LATIN CAPITAL LETTER DZ */
"\x01f2" "Dz" /* <U01F2> LATIN CAPITAL LETTER D WITH SMALL LETTER Z */
"\x01f3" "dz" /* <U01F3> LATIN SMALL LETTER DZ */
>> My focus is super sharp on helping with Cyrillic -> ASCII translit
>> availability for a default installation with glibc.
>
> I understand your aim and I agree to support ASCII. Our disagreements are:
>
> * whether to support conversion Cyrillic -> extended Latin as well,
no contest on my side
> * which standard to implement,
no contest on my side
> * what to do if the standard is ambiguous or if some details cannot be
> implemented for technical reasons.
no contest on my side either
I just think we can sidestep all of those decisions by starting with a
smaller, pure-ASCII patch (which is also more useful if it covers the C
locale).
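As a rough illustration of what a pure-ASCII table in the C locale amounts to, the C-translit.h.in entries listed earlier in this message can be read as a plain lookup from one code point to an ASCII string. The Python sketch below copies a few of those entries; the two Cyrillic entries, the `to_ascii` helper and the "?" placeholder for unmapped characters are hypothetical and only show the CAPCAP style, they are not taken from any version of the patch.

```python
# Entries copied from the C-translit.h.in excerpt earlier in this message.
C_TRANSLIT = {
    "\u00c6": "AE",  # LATIN CAPITAL LETTER AE
    "\u00df": "ss",  # LATIN SMALL LETTER SHARP S
    "\u00e6": "ae",  # LATIN SMALL LETTER AE
    "\u0132": "IJ",  # LATIN CAPITAL LIGATURE IJ
    "\u01c7": "LJ",  # LATIN CAPITAL LETTER LJ
    "\u01c8": "Lj",  # LATIN CAPITAL LETTER L WITH SMALL LETTER J
    "\u01c9": "lj",  # LATIN SMALL LETTER LJ
}

# Hypothetical Cyrillic entries in CAPCAP style (capital maps to "ZH", not "Zh").
HYPOTHETICAL_CYRILLIC = {
    "\u0416": "ZH",  # CYRILLIC CAPITAL LETTER ZHE
    "\u0436": "zh",  # CYRILLIC SMALL LETTER ZHE
}


def to_ascii(text, table, missing="?"):
    """Pass ASCII through, look everything else up, use a placeholder if absent."""
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)
        else:
            out.append(table.get(ch, missing))
    return "".join(out)


combined = {**C_TRANSLIT, **HYPOTHETICAL_CYRILLIC}
print(to_ascii("\u00c6sop", C_TRANSLIT))
print(to_ascii("\u0416\u0436", C_TRANSLIT))
print(to_ascii("\u0416\u0436", combined))
```

The point of the sketch is only that every right-hand side is ASCII, so a table of this shape does not depend on which locale is installed.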