unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Egor Kobylkin <egor@kobylkin.com>
To: libc-alpha@sourceware.org, libc-locales@sourceware.org,
	Carlos O'Donell <carlos@redhat.com>,
	Siddhesh Poyarekar <siddhesh@gotplt.org>,
	Rafal Luzynski <digitalfreak@lingonborough.com>
Cc: Marko Myllynen <myllynen@redhat.com>, mfabian@redhat.com
Subject: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
Date: Wed, 2 Jan 2019 19:38:37 +0100	[thread overview]
Message-ID: <a1db6ae3-2847-1482-b849-dd383e8c85aa@kobylkin.com> (raw)
In-Reply-To: <20180412224352.GB2911@altlinux.org>

[-- Attachment #1: Type: text/plain, Size: 5979 bytes --]

Changelog v12:
* Adjusted to the new comment style suddenly appearing in the target 
file locale/C-translit.h.in (the original file changed on the master 
branch from /* style to # style since v11)
* Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to 
"sh`" instead of erroneous "SH`" in v11

Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also the file translit_cyrillic is not longer needed and is omitted.
* Edited below email, commit message.

Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics

Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch

Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.

Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
   to sequences of all uppercase Latin letters in all languages (whenever
   a Cyrillic letter is transliterated to more than one Latin letter),
   for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration rows to locale/C-translit.h.in.

The patch is attached.


Current bug effect:

The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:

iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- it produces a string of question marks and spaces.

This is what it should produce and it does so after the patch applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here.


COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on Cyrillic alphabet to a ASCII//TRANSLIT text.

Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription.

While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.

The patch content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].

The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes converting Cyrillic to
ASCII and thus the System B is chosen [11]. To be super clear the System
A has nothing to do with this bug regardless it being a transliteration.

Those interested in implementing System A for transliteration of
Cyrillic to Latin with Diacritic as a new feature are welcome to use the
spreadsheet in [6] as a starting point.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3

Best regards,
Egor Kobylkin




[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 10494 bytes --]

From 46e0d0e3d07805ec853fdd72dc3793995cb5593c Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 2 Jan 2019 05:50:13 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 169 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 169 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index d5f00df0f3..758171c394 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -56,6 +56,175 @@
 "\x02cd"	"_"	# <U02CD> MODIFIER LETTER LOW MACRON
 "\x02d0"	":"	# <U02D0> MODIFIER LETTER TRIANGULAR COLON
 "\x02dc"	"~"	# <U02DC> SMALL TILDE
+"\x0401"	"YO"	# <U0401> CYRILLIC CAPITAL LETTER IO
+"\x0402"	"DJ"	# <U0402> CYRILLIC CAPITAL LETTER DJE
+"\x0403"	"G`"	# <U0403> CYRILLIC CAPITAL LETTER GJE
+"\x0404"	"YE"	# <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE
+"\x0405"	"Z`"	# <U0405> CYRILLIC CAPITAL LETTER DZE
+"\x0406"	"I"	# <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0407"	"YI"	# <U0407> CYRILLIC CAPITAL LETTER YI
+"\x0408"	"J"	# <U0408> CYRILLIC CAPITAL LETTER JE
+"\x0409"	"L`"	# <U0409> CYRILLIC CAPITAL LETTER LJE
+"\x040a"	"N`"	# <U040A> CYRILLIC CAPITAL LETTER NJE
+"\x040b"	"TSH"	# <U040B> CYRILLIC CAPITAL LETTER TSHE
+"\x040c"	"K`"	# <U040C> CYRILLIC CAPITAL LETTER KJE
+"\x040e"	"U`"	# <U040E> CYRILLIC CAPITAL LETTER SHORT U
+"\x040f"	"DH"	# <U040F> CYRILLIC CAPITAL LETTER DZHE
+"\x0410"	"A"	# <U0410> CYRILLIC CAPITAL LETTER A
+"\x0411"	"B"	# <U0411> CYRILLIC CAPITAL LETTER BE
+"\x0412"	"V"	# <U0412> CYRILLIC CAPITAL LETTER VE
+"\x0413"	"G"	# <U0413> CYRILLIC CAPITAL LETTER GHE
+"\x0414"	"D"	# <U0414> CYRILLIC CAPITAL LETTER DE
+"\x0415"	"E"	# <U0415> CYRILLIC CAPITAL LETTER IE
+"\x0416"	"ZH"	# <U0416> CYRILLIC CAPITAL LETTER ZHE
+"\x0417"	"Z"	# <U0417> CYRILLIC CAPITAL LETTER ZE
+"\x0418"	"I"	# <U0418> CYRILLIC CAPITAL LETTER I
+"\x0419"	"J"	# <U0419> CYRILLIC CAPITAL LETTER SHORT I
+"\x041a"	"K"	# <U041A> CYRILLIC CAPITAL LETTER KA
+"\x041b"	"L"	# <U041B> CYRILLIC CAPITAL LETTER EL
+"\x041c"	"M"	# <U041C> CYRILLIC CAPITAL LETTER EM
+"\x041d"	"N"	# <U041D> CYRILLIC CAPITAL LETTER EN
+"\x041e"	"O"	# <U041E> CYRILLIC CAPITAL LETTER O
+"\x041f"	"P"	# <U041F> CYRILLIC CAPITAL LETTER PE
+"\x0420"	"R"	# <U0420> CYRILLIC CAPITAL LETTER ER
+"\x0421"	"S"	# <U0421> CYRILLIC CAPITAL LETTER ES
+"\x0422"	"T"	# <U0422> CYRILLIC CAPITAL LETTER TE
+"\x0423"	"U"	# <U0423> CYRILLIC CAPITAL LETTER U
+"\x0424"	"F"	# <U0424> CYRILLIC CAPITAL LETTER EF
+"\x0425"	"X"	# <U0425> CYRILLIC CAPITAL LETTER HA
+"\x0426"	"CZ"	# <U0426> CYRILLIC CAPITAL LETTER TSE
+"\x0427"	"CH"	# <U0427> CYRILLIC CAPITAL LETTER CHE
+"\x0428"	"SH"	# <U0428> CYRILLIC CAPITAL LETTER SHA
+"\x0429"	"SHH"	# <U0429> CYRILLIC CAPITAL LETTER SHCHA
+"\x042a"	"A`"	# <U042A> CYRILLIC CAPITAL LETTER HARD SIGN
+"\x042b"	"Y`"	# <U042B> CYRILLIC CAPITAL LETTER YERU
+"\x042c"	"`"	# <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN
+"\x042d"	"E`"	# <U042D> CYRILLIC CAPITAL LETTER E
+"\x042e"	"YU"	# <U042E> CYRILLIC CAPITAL LETTER YU
+"\x042f"	"YA"	# <U042F> CYRILLIC CAPITAL LETTER YA
+"\x0430"	"a"	# <U0430> CYRILLIC SMALL LETTER A
+"\x0431"	"b"	# <U0431> CYRILLIC SMALL LETTER BE
+"\x0432"	"v"	# <U0432> CYRILLIC SMALL LETTER VE
+"\x0433"	"g"	# <U0433> CYRILLIC SMALL LETTER GHE
+"\x0434"	"d"	# <U0434> CYRILLIC SMALL LETTER DE
+"\x0435"	"e"	# <U0435> CYRILLIC SMALL LETTER IE
+"\x0436"	"zh"	# <U0436> CYRILLIC SMALL LETTER ZHE
+"\x0437"	"z"	# <U0437> CYRILLIC SMALL LETTER ZE
+"\x0438"	"i"	# <U0438> CYRILLIC SMALL LETTER I
+"\x0439"	"j"	# <U0439> CYRILLIC SMALL LETTER SHORT I
+"\x043a"	"k"	# <U043A> CYRILLIC SMALL LETTER KA
+"\x043b"	"l"	# <U043B> CYRILLIC SMALL LETTER EL
+"\x043c"	"m"	# <U043C> CYRILLIC SMALL LETTER EM
+"\x043d"	"n"	# <U043D> CYRILLIC SMALL LETTER EN
+"\x043e"	"o"	# <U043E> CYRILLIC SMALL LETTER O
+"\x043f"	"p"	# <U043F> CYRILLIC SMALL LETTER PE
+"\x0440"	"r"	# <U0440> CYRILLIC SMALL LETTER ER
+"\x0441"	"s"	# <U0441> CYRILLIC SMALL LETTER ES
+"\x0442"	"t"	# <U0442> CYRILLIC SMALL LETTER TE
+"\x0443"	"u"	# <U0443> CYRILLIC SMALL LETTER U
+"\x0444"	"f"	# <U0444> CYRILLIC SMALL LETTER EF
+"\x0445"	"x"	# <U0445> CYRILLIC SMALL LETTER HA
+"\x0446"	"cz"	# <U0446> CYRILLIC SMALL LETTER TSE
+"\x0447"	"ch"	# <U0447> CYRILLIC SMALL LETTER CHE
+"\x0448"	"sh"	# <U0448> CYRILLIC SMALL LETTER SHA
+"\x0449"	"shh"	# <U0449> CYRILLIC SMALL LETTER SHCHA
+"\x044a"	"``"	# <U044A> CYRILLIC SMALL LETTER HARD SIGN
+"\x044b"	"y`"	# <U044B> CYRILLIC SMALL LETTER YERU
+"\x044c"	"`"	# <U044C> CYRILLIC SMALL LETTER SOFT SIGN
+"\x044d"	"e`"	# <U044D> CYRILLIC SMALL LETTER E
+"\x044e"	"yu"	# <U044E> CYRILLIC SMALL LETTER YU
+"\x044f"	"ya"	# <U044F> CYRILLIC SMALL LETTER YA
+"\x0451"	"yo"	# <U0451> CYRILLIC SMALL LETTER IO
+"\x0452"	"dj"	# <U0452> CYRILLIC SMALL LETTER DJE
+"\x0453"	"g`"	# <U0453> CYRILLIC SMALL LETTER GJE
+"\x0454"	"ye"	# <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE
+"\x0455"	"z`"	# <U0455> CYRILLIC SMALL LETTER DZE
+"\x0456"	"i"	# <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0457"	"yi"	# <U0457> CYRILLIC SMALL LETTER YI
+"\x0458"	"j"	# <U0458> CYRILLIC SMALL LETTER JE
+"\x0459"	"l`"	# <U0459> CYRILLIC SMALL LETTER LJE
+"\x045a"	"n`"	# <U045A> CYRILLIC SMALL LETTER NJE
+"\x045b"	"tsh"	# <U045B> CYRILLIC SMALL LETTER TSHE
+"\x045c"	"k`"	# <U045C> CYRILLIC SMALL LETTER KJE
+"\x045e"	"u`"	# <U045E> CYRILLIC SMALL LETTER SHORT U
+"\x045f"	"dh"	# <U045F> CYRILLIC SMALL LETTER DZHE
+"\x046a"	"O`"	# <U046A> CYRILLIC CAPITAL LETTER BIG YUS
+"\x046b"	"o`"	# <U046B> CYRILLIC SMALL LETTER BIG YUS
+"\x0472"	"FH"	# <U0472> CYRILLIC CAPITAL LETTER FITA
+"\x0473"	"fh"	# <U0473> CYRILLIC SMALL LETTER FITA
+"\x0474"	"YH"	# <U0474> CYRILLIC CAPITAL LETTER IZHITSA
+"\x0475"	"yh"	# <U0475> CYRILLIC SMALL LETTER IZHITSA
+"\x048c"	"E`"	# <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+"\x048d"	"e`"	# <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN
+"\x0490"	"G`"	# <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+"\x0491"	"g`"	# <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN
+"\x0492"	"GH"	# <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE
+"\x0493"	"gh"	# <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE
+"\x0494"	"GH"	# <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+"\x0495"	"gh"	# <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+"\x0496"	"ZH`"	# <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+"\x0497"	"zh`"	# <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+"\x049a"	"K`"	# <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+"\x049b"	"k`"	# <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER
+"\x049e"	"K`"	# <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE
+"\x049f"	"k`"	# <U049F> CYRILLIC SMALL LETTER KA WITH STROKE
+"\x04a2"	"N`"	# <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+"\x04a3"	"n`"	# <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER
+"\x04a4"	"NG"	# <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE
+"\x04a5"	"ng"	# <U04A5> CYRILLIC SMALL LIGATURE EN GHE
+"\x04a6"	"P`"	# <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+"\x04a7"	"p`"	# <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+"\x04a8"	"O`"	# <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA
+"\x04a9"	"o`"	# <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA
+"\x04aa"	"C`"	# <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+"\x04ab"	"C`"	# <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER
+"\x04ac"	"T`"	# <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+"\x04ad"	"t`"	# <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER
+"\x04ae"	"U"	# <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U
+"\x04af"	"u"	# <U04AF> CYRILLIC SMALL LETTER STRAIGHT U
+"\x04b2"	"H`"	# <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+"\x04b3"	"h`"	# <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER
+"\x04b4"	"TCZ"	# <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE
+"\x04b5"	"tcz"	# <U04B5> CYRILLIC SMALL LIGATURE TE TSE
+"\x04ba"	"SH`"	# <U04BA> CYRILLIC CAPITAL LETTER SHHA
+"\x04bb"	"sh`"	# <U04BB> CYRILLIC SMALL LETTER SHHA
+"\x04bc"	"CH`"	# <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+"\x04bd"	"ch`"	# <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE
+"\x04be"	"CH`"	# <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04bf"	"ch`"	# <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04c0"	"i"	# <U04C0> CYRILLIC LETTER PALOCHKA
+"\x04c1"	"ZH`"	# <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+"\x04c2"	"zh`"	# <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE
+"\x04cb"	"CH`"	# <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+"\x04cc"	"ch`"	# <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE
+"\x04d0"	"A`"	# <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE
+"\x04d1"	"a`"	# <U04D1> CYRILLIC SMALL LETTER A WITH BREVE
+"\x04d2"	"A`"	# <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+"\x04d3"	"a`"	# <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS
+"\x04d6"	"E`"	# <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE
+"\x04d7"	"e`"	# <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE
+"\x04d8"	"A`"	# <U04D8> CYRILLIC CAPITAL LETTER SCHWA
+"\x04d9"	"a`"	# <U04D9> CYRILLIC SMALL LETTER SCHWA
+"\x04dc"	"ZH`"	# <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+"\x04dd"	"zh`"	# <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+"\x04de"	"Z`"	# <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+"\x04df"	"z`"	# <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+"\x04e0"	"Z`"	# <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+"\x04e1"	"z`"	# <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE
+"\x04e4"	"I`"	# <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+"\x04e5"	"i`"	# <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS
+"\x04e6"	"O`"	# <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+"\x04e7"	"o`"	# <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS
+"\x04e8"	"O`"	# <U04E8> CYRILLIC CAPITAL LETTER BARRED O
+"\x04e9"	"o`"	# <U04E9> CYRILLIC SMALL LETTER BARRED O
+"\x04f0"	"U`"	# <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+"\x04f1"	"u`"	# <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS
+"\x04f2"	"U`"	# <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+"\x04f3"	"u`"	# <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+"\x04f4"	"CH`"	# <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+"\x04f5"	"ch`"	# <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+"\x04f8"	"Y`"	# <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+"\x04f9"	"y`"	# <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS
 "\x2002"	" "	# <U2002> EN SPACE
 "\x2003"	" "	# <U2003> EM SPACE
 "\x2004"	" "	# <U2004> THREE-PER-EM SPACE
-- 
2.17.1


  parent reply	other threads:[~2019-01-02 18:39 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <41532e13-a63d-5df1-ab37-05eb4d6c8d0a@kobylkin.com>
     [not found] ` <20180412224352.GB2911@altlinux.org>
2018-07-17 19:34   ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
2018-07-17 19:40     ` Carlos O'Donell
2018-07-17 19:50       ` Egor Kobylkin
2018-07-17 19:59         ` Carlos O'Donell
2018-08-06 19:00   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29 Egor Kobylkin
2018-10-03  8:26     ` Egor Kobylkin
2018-10-03  9:19       ` Keld Simonsen
2018-10-03  9:32         ` Egor Kobylkin
2018-10-05  8:43           ` Marko Myllynen
2018-10-05  9:20           ` Rafal Luzynski
2018-10-05 10:36             ` Egor Kobylkin
2018-10-08 22:04               ` Rafal Luzynski
2018-10-08 22:52                 ` Egor Kobylkin
2018-10-09 21:43                   ` Rafal Luzynski
2018-10-08 23:20                 ` Zack Weinberg
2018-10-09 15:26                   ` Carlos O'Donell
2018-10-09 21:51                     ` Rafal Luzynski
2018-10-09 16:10                 ` Marko Myllynen
2018-10-09 16:22                   ` Egor Kobylkin
2018-10-09 16:49                     ` Marko Myllynen
2018-10-09 22:08                   ` Rafal Luzynski
2018-10-10 11:21                     ` Marko Myllynen
2018-10-11 10:10                   ` Marko Myllynen
     [not found]             ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
2018-10-05 11:54               ` Marko Myllynen
2018-10-05 12:00                 ` Egor Kobylkin
2018-10-05 12:21                   ` Marko Myllynen
2018-10-05 20:47                     ` Egor Kobylkin
2018-10-08 12:40                       ` Marko Myllynen
2018-10-08 22:23                         ` Rafal Luzynski
2018-10-08 23:35                           ` Egor Kobylkin
2018-10-09 13:18                             ` Egor Kobylkin
2018-10-09 18:34                               ` Egor Kobylkin
2018-10-09 22:17                                 ` Rafal Luzynski
2018-10-09 22:40                                   ` Egor Kobylkin
2018-10-09 22:42                                     ` Egor Kobylkin
2018-10-10 11:22                                       ` Marko Myllynen
2018-10-10 12:19                                         ` Egor Kobylkin
2018-10-10 12:34                                           ` Marko Myllynen
2018-10-10 22:29   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
2018-10-11  9:59     ` Marko Myllynen
2018-10-11 11:04     ` Rafal Luzynski
2018-10-11 13:10       ` Marko Myllynen
2018-10-11 13:50       ` Volodymyr Lisivka
2018-10-11 14:59       ` Egor Kobylkin
2018-10-11 21:30         ` Egor Kobylkin
2018-10-11 15:05       ` Egor Kobylkin
2018-10-11 15:44   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3 Egor Kobylkin
2018-10-11 21:33   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4 Egor Kobylkin
2018-10-12 14:05   ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
2018-10-13  0:59     ` Rafal Luzynski
2018-10-13 16:58       ` Egor Kobylkin
2018-10-15 11:04         ` Marko Myllynen
2018-10-15 11:54           ` Egor Kobylkin
2018-10-23 23:08         ` Rafal Luzynski
2018-10-17 14:16   ` [PATCH v6] " Egor Kobylkin
2018-11-01 22:51   ` [PATCH v7] " Egor Kobylkin
2018-11-02  0:00   ` [PATCH v8] " Egor Kobylkin
2018-11-02 22:22     ` Rafal Luzynski
2018-11-02 23:27       ` Egor Kobylkin
2018-11-14 21:25   ` [PATCH v9] " Egor Kobylkin
2018-11-16 22:17     ` Rafal Luzynski
2018-11-17 18:34       ` Egor Kobylkin
2018-11-19  7:13         ` Marko Myllynen
2018-11-19  9:21           ` Egor Kobylkin
2018-11-19 19:35             ` Marko Myllynen
2018-12-01 22:07           ` Rafal Luzynski
2018-12-01 22:53             ` Egor Kobylkin
2018-12-03 22:19             ` Egor Kobylkin
2018-12-08  1:15               ` Rafal Luzynski
2018-12-10 21:20                 ` Marko Myllynen
2018-12-19 22:25                   ` Rafal Luzynski
2018-12-19 22:48                     ` Egor Kobylkin
2018-12-19 23:50                       ` Rafal Luzynski
2018-11-19 11:10   ` [PATCH v10] " Egor Kobylkin
2018-12-07 23:35     ` Rafal Luzynski
2018-12-08 21:51       ` Egor Kobylkin
2018-12-19 22:41         ` Rafal Luzynski
2018-12-19 23:02           ` Egor Kobylkin
2018-12-20  0:05             ` Rafal Luzynski
2018-12-08 22:28   ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
2018-12-19 23:16     ` Egor Kobylkin
2018-12-26 10:07       ` Siddhesh Poyarekar
2018-12-26 12:13         ` Egor Kobylkin
2018-12-27  1:30           ` Siddhesh Poyarekar
2018-12-27 11:28             ` Rafal Luzynski
2019-01-02 18:38   ` Egor Kobylkin [this message]
2019-01-05 14:35     ` [PATCH v12] " Rafal Luzynski
2019-01-05 21:12       ` Egor Kobylkin
2019-01-07 20:37         ` Marko Myllynen
2019-01-09  0:46           ` Egor Kobylkin
2019-01-09 20:03             ` Marko Myllynen
2019-02-04  7:14               ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30 Egor Kobylkin
2019-02-14 16:48                 ` Marko Myllynen
2019-03-04 22:11                   ` Egor Kobylkin
2019-03-11 13:59                     ` PING " Egor Kobylkin
2019-03-14 19:48                       ` Egor Kobylkin
2019-04-19 22:24                   ` Rafal Luzynski
     [not found]                     ` <5ELixS9SQ0DW4mlvswp96ASpLobBabU9KQ6zOTH-Udrb34mABhcqiPERpBZfPWZ9F77s8XNmiLIAq9UWu0AjLFFdjOz_FZVU5_xF-SiQkrw=@kobylkin.com>
2019-04-27  2:51                       ` Siddhesh Poyarekar
2019-04-27  7:34                         ` Diego (Egor) Kobylkin
2019-04-09  1:04     ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] Carlos O'Donell
2019-03-19 10:39   ` ping " Egor Kobylkin
2019-03-28 16:20     ` [PING^4][PATCH " Marko Myllynen
2019-04-04 19:44     ` [PING^5][PATCH " Egor Kobylkin
2019-04-06  1:36       ` Siddhesh Poyarekar
2019-04-16  7:15     ` [PING^6][PATCH " Marko Myllynen
2019-04-16 13:17       ` Carlos O'Donell
2019-04-16 17:06         ` Egor Kobylkin
2019-04-16 17:58           ` Carlos O'Donell
2019-04-16 18:41             ` Egor Kobylkin
2019-04-16 19:06               ` Carlos O'Donell
2019-05-10 12:19                 ` Marko Myllynen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/libc/involved.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a1db6ae3-2847-1482-b849-dd383e8c85aa@kobylkin.com \
    --to=egor@kobylkin.com \
    --cc=carlos@redhat.com \
    --cc=digitalfreak@lingonborough.com \
    --cc=libc-alpha@sourceware.org \
    --cc=libc-locales@sourceware.org \
    --cc=mfabian@redhat.com \
    --cc=myllynen@redhat.com \
    --cc=siddhesh@gotplt.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).