* SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
From: Egor Kobylkin @ 2018-07-17 19:34 UTC
  To: libc-alpha, libc-locales; +Cc: Dmitry V. Levin, Volodymyr Lisivka

Dear locale maintainers,

Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails":

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]

to localedata/locales/ and including it in all your locales going forward.

Patch included inline below.


I have excluded from this patch the files that already mention Cyrillic
or already have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to decide explicitly whether, and if so
how, to include this patch.



Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently this fails on Cyrillic text in most locales, including ru_RU
[1] [8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

That is, it produces nothing but question marks, spaces and punctuation.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
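
For illustration only, here is a minimal Python sketch of the kind of
per-character mapping that yields the expected line above. The dict is a
hypothetical subset inferred from that expected output; it is not the
translit_cyrillic table itself, covers only the letters of the test
sentence, and the names SAMPLE_MAP and transliterate are made up for this
example. The real conversion is done by iconv using the compiled locale.

```python
import unicodedata

# Hypothetical subset inferred from the expected output quoted above.
# It is NOT the translit_cyrillic table and covers only the letters that
# occur in the test sentence.
SAMPLE_MAP = {
    "С": "S", "а": "a", "б": "b", "в": "v", "г": "g", "д": "d",
    "е": "e", "ё": "yo", "ж": "zh", "з": "z", "и": "i", "й": "j",
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x",
    "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh", "ъ": "``",
    "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}


def transliterate(text: str) -> str:
    """Replace each mapped Cyrillic letter; pass everything else through."""
    # Normalize first so that letters such as "ё" and "й" are single code points.
    text = unicodedata.normalize("NFC", text)
    return "".join(SAMPLE_MAP.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(transliterate("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

Characters outside the subset (spaces, punctuation, any other letter) are
left unchanged, unlike iconv, which substitutes "?" for anything it cannot
transliterate.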


Root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. In addition, the table has to be included in the active
locale at compilation time for iconv to use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from
UTF-8 encoded text written in the Cyrillic alphabet to ASCII//TRANSLIT text.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
transliteration contains only ASCII characters yet can still be read by a
native speaker. Among other things, it is useful for processing Cyrillic
texts and filenames with programs or on systems that are not specifically
prepared to work with Cyrillic, lack the corresponding fonts, or cannot
handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on GOST 7.79-2000 official source
(Federal Agency on Technical Regulating and Metrology Of Russian
Federation [2]). Technically an independent but identical source [3] was
used and prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end section and generated a patch for them.
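
As a rough sketch of that step (an assumption for illustration, not the
script actually used to generate this patch), the Python function below
inserts the include line immediately before the first translit_end line of
a locale source, which is where the hunks in this patch place it. The names
INCLUDE_LINE and add_cyrillic_include are hypothetical.

```python
# Hypothetical helper; not the tool that produced the patch.
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text: str) -> str:
    """Insert INCLUDE_LINE right before the first translit_end line.

    Text without a translit_end line, or text that already contains the
    include, is returned unchanged.
    """
    lines = locale_text.splitlines(keepends=True)
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return locale_text

    out = []
    inserted = False
    for line in lines:
        if not inserted and line.strip() == "translit_end":
            out.append(INCLUDE_LINE + "\n")
            inserted = True
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "translit_start\n"
        'include "translit_combining";""\n'
        "translit_end\n"
        "END LC_CTYPE\n"
    )
    print(add_cyrillic_include(sample), end="")
```

The function is idempotent, so running it twice over the same file does not
add a second include. Locales on the exclusion list would simply not be
passed to it.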

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
However it would not be the standard Cyrillic transliteration as
described above.
I am excluding these locales from this proposed patch. I have written
directly to the locale maintainer email addresses listed in the files but
have received no reply so far, except from Volodymyr Lisivka
<vlisivka@gmail.com> (uk_UA), who has confirmed the exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618

Best regards,
Egor Kobylkin

---
2018-07-17  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
	table from Cyrillic to Latin.
	* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
	section.
	* locales/aa_DJ: likewise
	* locales/af_ZA: likewise
	* locales/ak_GH: likewise
	* locales/am_ET: likewise
	* locales/ar_EG: likewise
	* locales/be_BY: likewise
	* locales/bem_ZM: likewise
	* locales/ber_DZ: likewise
	* locales/ber_MA: likewise
	* locales/bg_BG: likewise
	* locales/bi_VU: likewise
	* locales/bn_BD: likewise
	* locales/bo_CN: likewise
	* locales/ca_ES: likewise
	* locales/ce_RU: likewise
	* locales/cs_CZ: likewise
	* locales/cv_RU: likewise
	* locales/cy_GB: likewise
	* locales/da_DK: likewise
	* locales/de_DE: likewise
	* locales/dv_MV: likewise
	* locales/dz_BT: likewise
	* locales/el_GR: likewise
	* locales/en_GB: likewise
	* locales/en_NG: likewise
	* locales/en_ZM: likewise
	* locales/es_CU: likewise
	* locales/es_ES: likewise
	* locales/et_EE: likewise
	* locales/fa_IR: likewise
	* locales/ff_SN: likewise
	* locales/fi_FI: likewise
	* locales/fr_FR: likewise
	* locales/ga_IE: likewise
	* locales/gd_GB: likewise
	* locales/gu_IN: likewise
	* locales/gv_GB: likewise
	* locales/he_IL: likewise
	* locales/hi_IN: likewise
	* locales/hif_FJ: likewise
	* locales/hr_HR: likewise
	* locales/ht_HT: likewise
	* locales/hu_HU: likewise
	* locales/hy_AM: likewise
	* locales/id_ID: likewise
	* locales/is_IS: likewise
	* locales/it_IT: likewise
	* locales/ja_JP: likewise
	* locales/kk_KZ: likewise
	* locales/km_KH: likewise
	* locales/kn_IN: likewise
	* locales/ko_KR: likewise
	* locales/ks_IN: likewise
	* locales/kw_GB: likewise
	* locales/lb_LU: likewise
	* locales/lg_UG: likewise
	* locales/lij_IT: likewise
	* locales/ln_CD: likewise
	* locales/lo_LA: likewise
	* locales/lt_LT: likewise
	* locales/lv_LV: likewise
	* locales/mg_MG: likewise
	* locales/mhr_RU: likewise
	* locales/mk_MK: likewise
	* locales/ml_IN: likewise
	* locales/ms_MY: likewise
	* locales/mt_MT: likewise
	* locales/nan_TW@latin: likewise
	* locales/nb_NO: likewise
	* locales/ne_NP: likewise
	* locales/nhn_MX: likewise
	* locales/niu_NU: likewise
	* locales/niu_NZ: likewise
	* locales/nl_NL: likewise
	* locales/nr_ZA: likewise
	* locales/oc_FR: likewise
	* locales/om_KE: likewise
	* locales/or_IN: likewise
	* locales/os_RU: likewise
	* locales/pa_IN: likewise
	* locales/pa_PK: likewise
	* locales/pl_PL: likewise
	* locales/pt_PT: likewise
	* locales/quz_PE: likewise
	* locales/ro_RO: likewise
	* locales/ru_RU: likewise
	* locales/rw_RW: likewise
	* locales/sa_IN: likewise
	* locales/sd_IN: likewise
	* locales/sd_IN@devanagari: likewise
	* locales/sd_PK: likewise
	* locales/se_NO: likewise
	* locales/sgs_LT: likewise
	* locales/si_LK: likewise
	* locales/sk_SK: likewise
	* locales/sl_SI: likewise
	* locales/sm_WS: likewise
	* locales/so_SO: likewise
	* locales/sq_AL: likewise
	* locales/ss_ZA: likewise
	* locales/st_ZA: likewise
	* locales/sv_SE: likewise
	* locales/sw_KE: likewise
	* locales/ta_IN: likewise
	* locales/te_IN: likewise
	* locales/th_TH: likewise
	* locales/ti_ET: likewise
	* locales/tn_ZA: likewise
	* locales/to_TO: likewise
	* locales/tpi_PG: likewise
	* locales/tr_TR: likewise
	* locales/ts_ZA: likewise
	* locales/unm_US: likewise
	* locales/ur_IN: likewise
	* locales/ur_PK: likewise
	* locales/ve_ZA: likewise
	* locales/vi_VN: likewise
	* locales/wa_BE: likewise
	* locales/wo_SN: likewise
	* locales/xh_ZA: likewise
	* locales/yi_US: likewise
	* locales/zh_CN: likewise
	* locales/zu_ZA: likewise


diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C	2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET	2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY	2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK	2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE	2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
 include "translit_combining";""


+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR	2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU	2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES	2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID	2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS	2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH	2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "<U0065><U005E>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-07-17 17:55:50.000000000 +0000
@@ -53,6 +53,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE	2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN	2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU	2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO	2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK	2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO	2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN	2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH	2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>

 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO	2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@

 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US	2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
 <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
 <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US	2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
 <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
 <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
 <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE





* Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-07-17 19:34   ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-07-17 19:40     ` Carlos O'Donell
  2018-07-17 19:50       ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2018-07-17 19:40 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales
  Cc: Dmitry V. Levin, Volodymyr Lisivka

On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
> Dear locale maintainers,
> 
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

We are currently preparing for the 2.28 release and it may take
a while to review this change and the structure of the changes,
and the data itself.

Is it OK if this material is reviewed for 2.29 inclusion (after
August 1st)?

Cheers,
Carlos.


* Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-07-17 19:40     ` Carlos O'Donell
@ 2018-07-17 19:50       ` Egor Kobylkin
  2018-07-17 19:59         ` Carlos O'Donell
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-07-17 19:50 UTC (permalink / raw)
  To: Carlos O'Donell, libc-alpha, libc-locales
  Cc: Dmitry V. Levin, Volodymyr Lisivka

On 17.07.2018 21:40, Carlos O'Donell wrote:
> On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
>> Dear locale maintainers,
>>
>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> 
> We are currently preparing for the 2.28 release and it may take
> a while to review this change and the structure of the changes,
> and the data itself.
> 
> Is it OK if this material is reviewed for 2.29 inclusion (after
> August 1st)?

It's fine with me to postpone it for 2.29 inclusion (after August 1st).
Should I send a reminder in August?

Bests,
Egor


* Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-07-17 19:50       ` Egor Kobylkin
@ 2018-07-17 19:59         ` Carlos O'Donell
  0 siblings, 0 replies; 111+ messages in thread
From: Carlos O'Donell @ 2018-07-17 19:59 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales
  Cc: Dmitry V. Levin, Volodymyr Lisivka

On 07/17/2018 03:50 PM, Egor Kobylkin wrote:
> On 17.07.2018 21:40, Carlos O'Donell wrote:
>> On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>
>> We are currently preparing for the 2.28 release and it may take
>> a while to review this change and the structure of the changes,
>> and the data itself.
>>
>> Is it OK if this material is reviewed for 2.29 inclusion (after
>> August 1st)?
> 
> It's fine with me to postpone it for 2.29 inclusion (after August 1st).
> Should I send a reminder in August?

Yes please, ping the original patches again in August and we can
review. In the meantime others may feel free to review, but we won't
consider them for inclusion yet, so that they don't block the release.

-- 
Cheers,
Carlos.


* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
       [not found] ` <20180412224352.GB2911@altlinux.org>
  2018-07-17 19:34   ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-08-06 19:00   ` Egor Kobylkin
  2018-10-03  8:26     ` Egor Kobylkin
  2018-10-10 22:29   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
                     ` (11 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-08-06 19:00 UTC (permalink / raw)
  To: libc-alpha, libc-locales
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Carlos O'Donell,
	Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 53923 bytes --]

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]

to localedata/locales/ and include it in all your locales going forward.

Patch included inline below.

This is a re-submission for consideration for 2.29, at the request of
Carlos O'Donell: https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html

I have excluded from this patch the locales that already mention Cyrillic
or have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.



Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
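
The expected output above can be checked against the table independently
of glibc. The following is a minimal Python sketch and not part of the
patch: FIRST_CHOICE and transliterate are hypothetical names, the
dictionary copies only the first-choice replacements from
translit_cyrillic, and the input sentence is the well-known pangram that
is assumed here to be the CYRILLIC line of translit-test-input.txt [9].
Unlike iconv, the sketch passes unmapped characters through unchanged
instead of replacing them with "?".

```python
# First-choice replacements copied from the lowercase half of translit_cyrillic.
# Fallback alternatives (the part after ';' in the table) are ignored here.
FIRST_CHOICE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}

# The capital-letter rules in the table are the upper-cased lowercase rules
# (for example ZH, SHH, Y'), so they can be derived instead of retyped.
FIRST_CHOICE.update({k.upper(): v.upper() for k, v in list(FIRST_CHOICE.items())})


def transliterate(text: str) -> str:
    """Replace every mapped Cyrillic letter; leave all other characters as-is."""
    return "".join(FIRST_CHOICE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # Assumed test sentence (illustrative; not quoted from the attachment).
    print(transliterate("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

Run on the assumed pangram, the sketch yields the same ASCII string as the
expected line above, which suggests that the expected output follows
directly from the first-choice column of the table.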


Root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. Furthermore, the table has to be included in the
active locale at compilation time for iconv to use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in the Cyrillic alphabet to ASCII//TRANSLIT text.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
transliteration contains only ASCII codes yet can still be read by a native
speaker. Among other things, this is useful for processing Cyrillic
text and filenames with programs or on systems that are not specifically
prepared to work with Cyrillic, do not have the corresponding fonts
installed, or cannot handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its mapping is based on the official GOST 7.79-2000 source
(Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]). In practice, an independent but identical source [3] was
used and prepared in a spreadsheet [6].
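
To make the format of the table entries concrete, here is a minimal
Python sketch that decodes one rule line of translit_cyrillic into the
source character and its list of replacements. It is illustrative only:
decode and parse_rule are hypothetical helpers, and the reading that
alternatives separated by ";" are tried in order until one fits the
target charset is my understanding of the locale format, not something
stated in this patch.

```python
import re

# Matches symbolic code points such as <U0416> or <U005A>.
CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(symbols: str) -> str:
    """Turn a run of <Uxxxx> symbols (quotes are ignored) into a string."""
    return "".join(chr(int(hexval, 16)) for hexval in CODEPOINT.findall(symbols))


def parse_rule(line: str):
    """Split one rule into (source character, [replacement candidates])."""
    source, _, targets = line.strip().partition(" ")
    candidates = [decode(part) for part in targets.split(";")]
    return decode(source), candidates


if __name__ == "__main__":
    # CYRILLIC CAPITAL LETTER ZHE: first choice "ZH", fallback "Z".
    print(parse_rule('<U0416> "<U005A><U0048>";<U005A>'))
```

For an ASCII target both candidates are representable, so under the
assumption above the first, longer form is the one that appears in the
output.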

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
Cyrillic transliteration of, e.g., Russian text may already have worked
to some extent for the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales, which
have their transliteration tables included inline.
However, the result would not be the standard Russian Cyrillic
transliteration described above.
I am excluding these locales from this proposed patch. I have written
directly to the locale maintainers' emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org>  (sr_YU, sr_CS) have confirmed the
exclusion.
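
The selection and patching rule described above can be sketched in a few
lines of Python. This is a hedged illustration, not the script that
actually generated the patch: add_cyrillic_include and INCLUDE_LINE are
hypothetical names, and "already mentions cyrillic" is approximated by a
case-insensitive substring test.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text: str) -> str:
    """Insert the include directly before the first translit_end line.

    The text is returned unchanged when it already mentions Cyrillic
    (those locales are left to their maintainers) or when it has no
    translit_end line at all.
    """
    if "cyrillic" in locale_text.lower():
        return locale_text
    lines = locale_text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(index, INCLUDE_LINE + "\n")
            return "".join(lines)
    return locale_text


if __name__ == "__main__":
    sample = (
        "translit_start\n"
        'include  "translit_combining";""\n'
        "translit_end\n"
        "END LC_CTYPE\n"
    )
    print(add_cyrillic_include(sample))
```

Placing the new line immediately before translit_end matches the position
of the added line in the hunks of this patch, and the early return makes
the function safe to apply twice to the same file.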

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618

Best regards,
Egor Kobylkin

---
2018-07-17  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
	table from Cyrillic to Latin.
	* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
	section.
	* locales/aa_DJ: likewise
	* locales/af_ZA: likewise
	* locales/ak_GH: likewise
	* locales/am_ET: likewise
	* locales/ar_EG: likewise
	* locales/be_BY: likewise
	* locales/bem_ZM: likewise
	* locales/ber_DZ: likewise
	* locales/ber_MA: likewise
	* locales/bg_BG: likewise
	* locales/bi_VU: likewise
	* locales/bn_BD: likewise
	* locales/bo_CN: likewise
	* locales/ca_ES: likewise
	* locales/ce_RU: likewise
	* locales/cs_CZ: likewise
	* locales/cv_RU: likewise
	* locales/cy_GB: likewise
	* locales/da_DK: likewise
	* locales/de_DE: likewise
	* locales/dv_MV: likewise
	* locales/dz_BT: likewise
	* locales/el_GR: likewise
	* locales/en_GB: likewise
	* locales/en_NG: likewise
	* locales/en_ZM: likewise
	* locales/es_CU: likewise
	* locales/es_ES: likewise
	* locales/et_EE: likewise
	* locales/fa_IR: likewise
	* locales/ff_SN: likewise
	* locales/fi_FI: likewise
	* locales/fr_FR: likewise
	* locales/ga_IE: likewise
	* locales/gd_GB: likewise
	* locales/gu_IN: likewise
	* locales/gv_GB: likewise
	* locales/he_IL: likewise
	* locales/hi_IN: likewise
	* locales/hif_FJ: likewise
	* locales/hr_HR: likewise
	* locales/ht_HT: likewise
	* locales/hu_HU: likewise
	* locales/hy_AM: likewise
	* locales/id_ID: likewise
	* locales/is_IS: likewise
	* locales/it_IT: likewise
	* locales/ja_JP: likewise
	* locales/kk_KZ: likewise
	* locales/km_KH: likewise
	* locales/kn_IN: likewise
	* locales/ko_KR: likewise
	* locales/ks_IN: likewise
	* locales/kw_GB: likewise
	* locales/lb_LU: likewise
	* locales/lg_UG: likewise
	* locales/lij_IT: likewise
	* locales/ln_CD: likewise
	* locales/lo_LA: likewise
	* locales/lt_LT: likewise
	* locales/lv_LV: likewise
	* locales/mg_MG: likewise
	* locales/mhr_RU: likewise
	* locales/mk_MK: likewise
	* locales/ml_IN: likewise
	* locales/ms_MY: likewise
	* locales/mt_MT: likewise
	* locales/nan_TW@latin: likewise
	* locales/nb_NO: likewise
	* locales/ne_NP: likewise
	* locales/nhn_MX: likewise
	* locales/niu_NU: likewise
	* locales/niu_NZ: likewise
	* locales/nl_NL: likewise
	* locales/nr_ZA: likewise
	* locales/oc_FR: likewise
	* locales/om_KE: likewise
	* locales/or_IN: likewise
	* locales/os_RU: likewise
	* locales/pa_IN: likewise
	* locales/pa_PK: likewise
	* locales/pl_PL: likewise
	* locales/pt_PT: likewise
	* locales/quz_PE: likewise
	* locales/ro_RO: likewise
	* locales/ru_RU: likewise
	* locales/rw_RW: likewise
	* locales/sa_IN: likewise
	* locales/sd_IN: likewise
	* locales/sd_IN@devanagari: likewise
	* locales/sd_PK: likewise
	* locales/se_NO: likewise
	* locales/sgs_LT: likewise
	* locales/si_LK: likewise
	* locales/sk_SK: likewise
	* locales/sl_SI: likewise
	* locales/sm_WS: likewise
	* locales/so_SO: likewise
	* locales/sq_AL: likewise
	* locales/ss_ZA: likewise
	* locales/st_ZA: likewise
	* locales/sv_SE: likewise
	* locales/sw_KE: likewise
	* locales/ta_IN: likewise
	* locales/te_IN: likewise
	* locales/th_TH: likewise
	* locales/ti_ET: likewise
	* locales/tn_ZA: likewise
	* locales/to_TO: likewise
	* locales/tpi_PG: likewise
	* locales/tr_TR: likewise
	* locales/ts_ZA: likewise
	* locales/unm_US: likewise
	* locales/ur_IN: likewise
	* locales/ur_PK: likewise
	* locales/ve_ZA: likewise
	* locales/vi_VN: likewise
	* locales/wa_BE: likewise
	* locales/wo_SN: likewise
	* locales/xh_ZA: likewise
	* locales/yi_US: likewise
	* locales/zh_CN: likewise
	* locales/zu_ZA: likewise
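
Note for reviewers: every hunk below makes the same one-line change,
placing the new include directly before translit_end in the LC_CTYPE
translit section. A minimal shell sketch of that edit follows, assuming
a POSIX shell and awk. The function name add_cyrillic_include is
hypothetical, and this is not the script that generated the patch. It
leaves files without a translit_end line unchanged.

```shell
# Hypothetical helper (not part of glibc): add the translit_cyrillic
# include to a single locale source file.
add_cyrillic_include() {
    file=$1
    # Skip files that already mention the table, so reruns are harmless.
    if grep -q 'translit_cyrillic' "$file"; then
        return 0
    fi
    # Emit the include line immediately before the first translit_end,
    # then copy every input line through unchanged.
    awk '
        /^translit_end/ && !inserted {
            print "include \"translit_cyrillic\";\"\""
            inserted = 1
        }
        { print }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Example use from the top of a glibc checkout:
add_cyrillic_include localedata/locales/ru_RU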


diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C	2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET	2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY	2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK	2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE	2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
 include "translit_combining";""


+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR	2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU	2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES	2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID	2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS	2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT	2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH	2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "<U0065><U005E>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-07-17 17:55:50.000000000 +0000
@@ -53,6 +53,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE	2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN	2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU	2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO	2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK	2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO	2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN	2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH	2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>

 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO	2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@

 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US	2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
 <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
 <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US	2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
 <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
 <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
 <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE


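For readers unfamiliar with the table syntax used in translit_cyrillic
above: each entry is a source code point followed by one or more
alternatives separated by ";", where an alternative is either a single
<Uxxxx> or a quoted run of them. The following is a minimal Python sketch
of how such a line could be decoded for inspection. The function names
are hypothetical and this is not how localedef itself parses the file.

```python
import re

# Matches one <Uxxxx> code point reference as used in glibc locale sources.
CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn a run such as '"<U005A><U0048>"' into the string 'ZH'."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT.findall(token))


def parse_translit_line(line):
    """Return (source_char, [alternatives]) or None for non-entry lines.

    Comments, include directives and keywords such as translit_start
    do not begin with "<U" and are skipped.
    """
    line = line.strip()
    if not line.startswith("<U"):
        return None
    source, _, rest = line.partition(" ")
    alternatives = [decode(alt) for alt in rest.strip().split(";")]
    return decode(source), alternatives


if __name__ == "__main__":
    # CYRILLIC CAPITAL LETTER ZHE: preferred "ZH", fallback "Z".
    print(parse_translit_line('<U0416> "<U005A><U0048>";<U005A>'))
```

The first alternative is the preferred transliteration; later ones are
fallbacks for when an earlier one cannot be represented in the target
character set.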


[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 5149 bytes --]

From: Carlos O'Donell <carlos@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, libc-alpha@sourceware.org, libc-locales@sourceware.org
Cc: "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>
Subject: Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
Date: Tue, 17 Jul 2018 15:59:27 -0400
Message-ID: <e2d97c8e-ba84-2244-aec6-fc0d5f560570@redhat.com>

On 07/17/2018 03:50 PM, Egor Kobylkin wrote:
> On 17.07.2018 21:40, Carlos O'Donell wrote:
>> On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>
>> We are currently preparing for the 2.28 release and it may take
>> a while to review this change and the structure of the changes,
>> and the data itself.
>>
>> Is it OK if this material is reviewed for 2.29 inclusion (after
>> August 1st)?
> 
> It's fine with me to postpone it for 2.29 inclusion (after August 1st).
> Should I send a reminder in August?

Yes please, ping the original patches again in August and we can
review them. In the meantime others may feel free to review, but we won't
consider the patches for inclusion yet, so that they don't block the release.

-- 
Cheers,
Carlos.


* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-08-06 19:00   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29 Egor Kobylkin
@ 2018-10-03  8:26     ` Egor Kobylkin
  2018-10-03  9:19       ` Keld Simonsen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-03  8:26 UTC (permalink / raw)
  To: libc-alpha, libc-locales
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Carlos O'Donell,
	Max Kutny, danilo

Ping.

In the absence of feedback, I am wondering whether anything is missing
from this patch from the maintainers' standpoint. More than two months
have passed since the original submission.

If I can be of assistance, please do not hesitate to contact me,
Egor Kobylkin

On 06.08.2018 21:00, Egor Kobylkin wrote:
> Dear locale maintainers,
> 
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> 
> add Cyrillic transliteration table translit_cyrillic file
> 
> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
> 
> to localedata/locales/ and include it in all your locales going forward.
> 
> Patch included inline below.
> 
> This is a re-submission for the consideration for 2.29 on a request from
> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
> 
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
> 
> Their maintainers are requested to make an explicit decision on how and
> whether at all to include this patch.
> 
> 
> 
> Current bug effect:
> 
> The glibc wiki explicitly lists this use case as the test example
> 
> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
> 
> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
> 
> currently it fails on Cyrillic texts in most locales including ru_RU [1]
> [8] [9]:
> 
> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
> 
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> 
>  - It produces a string of question marks and spaces.
> 
> This is what it should produce and it does so after the patch applied:
> 
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> chayu.
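
To make the expected output concrete, here is a small Python sketch that
applies the first-choice mappings from translit_cyrillic to the sample
sentence. It is only an illustration of the table, not of how iconv or
glibc work internally. The Cyrillic input sentence is an assumption
reconstructed from the expected output quoted above (the actual
translit-test-input.txt is attachment [9]), and the table below covers
only the letters that sentence needs.

```python
import unicodedata

# Illustrative subset of translit_cyrillic (first alternative only).
TRANSLIT = {
    "\u0421": "S",                       # capital ES, the only capital used
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "\u0439": "j", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "cz", "ч": "ch",
    "ш": "sh", "щ": "shh", "ъ": "``", "ы": "y'", "ь": "`",
    "э": "e`", "ю": "yu", "я": "ya", "\u0451": "yo",
}


def transliterate(text, table=TRANSLIT, unknown="?"):
    """Map each non-ASCII character through the table.

    ASCII passes through unchanged; characters missing from the table
    become "?", mimicking the question marks seen before the patch.
    """
    text = unicodedata.normalize("NFC", text)  # keep io/short-i precomposed
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(table.get(ch, unknown))
    return "".join(out)


# Assumed test sentence (a common Russian pangram).
SAMPLE = "Съешь ещё этих мягких французских булок, да выпей же чаю."

if __name__ == "__main__":
    print(transliterate(SAMPLE))
```

With an empty table every Cyrillic letter falls through to "?", which is
the pre-patch behaviour shown earlier.
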
> 
> 
> Root problem and the fix:
> 
> The root problem is the missing transliteration table that I am
> supplying here. Furthermore, the table has to be included in the
> active locale at compilation time for iconv to use it.
> 
> 
> 
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) from
> UTF-8 encoded text in the Cyrillic alphabet to ASCII//TRANSLIT text.
> 
> While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
> a transliteration contains only ASCII codes, yet can still be read by a
> native speaker. Among other things, it is useful for processing Cyrillic
> texts and filenames by programs or on systems that are not specifically
> prepared to work with Cyrillic, don't have corresponding fonts installed,
> or can't handle UTF-8.
> 
> The transliteration table itself is attached as a file translit_cyrillic
> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
> (Federal Agency on Technical Regulating and Metrology Of Russian
> Federation [2]). Technically an independent but identical source [3] was
> used and prepared in a spreadsheet [6].
> 
> The documentation suggests that a transliteration table is included by
> adding the line *include "translit_cyrillic";""* to the translit_start
> section of LC_CTYPE:
> http://man7.org/linux/man-pages/man5/locale.5.html [5]
> In practice, I searched for all locales that have a
> translit_start/translit_end stanza and generated a patch for them.
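
The procedure described here could be sketched roughly as follows in
Python. This is not the script actually used to generate the patch (that
script is not part of this thread); the function name is hypothetical,
and the "already mentions cyrillic" check is a simplified reading of the
exclusion rule stated above.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Insert the include directive just before translit_end.

    The text is returned unchanged when it already mentions Cyrillic
    (such locales are left to their maintainers) or when it has no
    translit_end line to anchor on.
    """
    if "cyrillic" in locale_text.lower():
        return locale_text
    lines = locale_text.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(i, INCLUDE_LINE)
            return "\n".join(lines)
    return locale_text


if __name__ == "__main__":
    sample = (
        "translit_start\n"
        'include "translit_combining";""\n'
        "translit_end\n"
        "END LC_CTYPE\n"
    )
    print(add_cyrillic_include(sample))
```

Placing the directive directly before translit_end matches the hunks in
the patch, including locales such as ro_RO or tr_TR whose stanza ends
with inline rules rather than another include.
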
> 
> The Cyrillic transliteration of e.g. Russian text may have already
> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
> have their transliteration tables included inline.
> However it would not be the standard Russian Cyrillic transliteration as
> described above.
> I am excluding these locales from this proposed patch. I have written
> directly to locale maintainer emails listed in the files. Volodymyr
> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> Данило Шеган <danilo@gnome.org>  (sr_YU, sr_CS) have confirmed the
> exclusion.
> 
> Links:
> 
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?id=8590
> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=8618
> 
> Best regards,
> Egor Kobylkin
> 
> ---
> 2018-07-17  Egor Kobylkin  <egor@kobylkin.com>
> 
> 	[BZ #2872]
> 	* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
> table from Cyrillic to Latin.
> 	* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
> section.
> 	* locales/aa_DJ: likewise
> 	* locales/af_ZA: likewise
> 	* locales/ak_GH: likewise
> 	* locales/am_ET: likewise
> 	* locales/ar_EG: likewise
> 	* locales/be_BY: likewise
> 	* locales/bem_ZM: likewise
> 	* locales/ber_DZ: likewise
> 	* locales/ber_MA: likewise
> 	* locales/bg_BG: likewise
> 	* locales/bi_VU: likewise
> 	* locales/bn_BD: likewise
> 	* locales/bo_CN: likewise
> 	* locales/ca_ES: likewise
> 	* locales/ce_RU: likewise
> 	* locales/cs_CZ: likewise
> 	* locales/cv_RU: likewise
> 	* locales/cy_GB: likewise
> 	* locales/da_DK: likewise
> 	* locales/de_DE: likewise
> 	* locales/dv_MV: likewise
> 	* locales/dz_BT: likewise
> 	* locales/el_GR: likewise
> 	* locales/en_GB: likewise
> 	* locales/en_NG: likewise
> 	* locales/en_ZM: likewise
> 	* locales/es_CU: likewise
> 	* locales/es_ES: likewise
> 	* locales/et_EE: likewise
> 	* locales/fa_IR: likewise
> 	* locales/ff_SN: likewise
> 	* locales/fi_FI: likewise
> 	* locales/fr_FR: likewise
> 	* locales/ga_IE: likewise
> 	* locales/gd_GB: likewise
> 	* locales/gu_IN: likewise
> 	* locales/gv_GB: likewise
> 	* locales/he_IL: likewise
> 	* locales/hi_IN: likewise
> 	* locales/hif_FJ: likewise
> 	* locales/hr_HR: likewise
> 	* locales/ht_HT: likewise
> 	* locales/hu_HU: likewise
> 	* locales/hy_AM: likewise
> 	* locales/id_ID: likewise
> 	* locales/is_IS: likewise
> 	* locales/it_IT: likewise
> 	* locales/ja_JP: likewise
> 	* locales/kk_KZ: likewise
> 	* locales/km_KH: likewise
> 	* locales/kn_IN: likewise
> 	* locales/ko_KR: likewise
> 	* locales/ks_IN: likewise
> 	* locales/kw_GB: likewise
> 	* locales/lb_LU: likewise
> 	* locales/lg_UG: likewise
> 	* locales/lij_IT: likewise
> 	* locales/ln_CD: likewise
> 	* locales/lo_LA: likewise
> 	* locales/lt_LT: likewise
> 	* locales/lv_LV: likewise
> 	* locales/mg_MG: likewise
> 	* locales/mhr_RU: likewise
> 	* locales/mk_MK: likewise
> 	* locales/ml_IN: likewise
> 	* locales/ms_MY: likewise
> 	* locales/mt_MT: likewise
> 	* locales/nan_TW@latin: likewise
> 	* locales/nb_NO: likewise
> 	* locales/ne_NP: likewise
> 	* locales/nhn_MX: likewise
> 	* locales/niu_NU: likewise
> 	* locales/niu_NZ: likewise
> 	* locales/nl_NL: likewise
> 	* locales/nr_ZA: likewise
> 	* locales/oc_FR: likewise
> 	* locales/om_KE: likewise
> 	* locales/or_IN: likewise
> 	* locales/os_RU: likewise
> 	* locales/pa_IN: likewise
> 	* locales/pa_PK: likewise
> 	* locales/pl_PL: likewise
> 	* locales/pt_PT: likewise
> 	* locales/quz_PE: likewise
> 	* locales/ro_RO: likewise
> 	* locales/ru_RU: likewise
> 	* locales/rw_RW: likewise
> 	* locales/sa_IN: likewise
> 	* locales/sd_IN: likewise
> 	* locales/sd_IN@devanagari: likewise
> 	* locales/sd_PK: likewise
> 	* locales/se_NO: likewise
> 	* locales/sgs_LT: likewise
> 	* locales/si_LK: likewise
> 	* locales/sk_SK: likewise
> 	* locales/sl_SI: likewise
> 	* locales/sm_WS: likewise
> 	* locales/so_SO: likewise
> 	* locales/sq_AL: likewise
> 	* locales/ss_ZA: likewise
> 	* locales/st_ZA: likewise
> 	* locales/sv_SE: likewise
> 	* locales/sw_KE: likewise
> 	* locales/ta_IN: likewise
> 	* locales/te_IN: likewise
> 	* locales/th_TH: likewise
> 	* locales/ti_ET: likewise
> 	* locales/tn_ZA: likewise
> 	* locales/to_TO: likewise
> 	* locales/tpi_PG: likewise
> 	* locales/tr_TR: likewise
> 	* locales/ts_ZA: likewise
> 	* locales/unm_US: likewise
> 	* locales/ur_IN: likewise
> 	* locales/ur_PK: likewise
> 	* locales/ve_ZA: likewise
> 	* locales/vi_VN: likewise
> 	* locales/wa_BE: likewise
> 	* locales/wo_SN: likewise
> 	* locales/xh_ZA: likewise
> 	* locales/yi_US: likewise
> 	* locales/zh_CN: likewise
> 	* locales/zu_ZA: likewise
> 
> 
> diff -uNr a/localedata/locales/C b/localedata/locales/C
> --- a/localedata/locales/C	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/C	2018-07-17 17:55:47.000000000 +0000
> @@ -2292,6 +2292,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
> --- a/localedata/locales/aa_DJ	2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/aa_DJ	2018-07-17 17:55:47.000000000 +0000
> @@ -70,6 +70,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
> --- a/localedata/locales/af_ZA	2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/af_ZA	2018-07-17 17:55:47.000000000 +0000
> @@ -72,6 +72,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
> --- a/localedata/locales/ak_GH	2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/ak_GH	2018-07-17 17:55:47.000000000 +0000
> @@ -56,6 +56,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
> --- a/localedata/locales/am_ET	2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/am_ET	2018-07-17 17:55:47.000000000 +0000
> @@ -1396,6 +1396,7 @@
>  <U137A>    <U0060><U0039><U0030>
>  <U137B>    <U0060><U0031><U0030><U0030>
>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
> +include "translit_cyrillic";""
>  translit_end
>  %
>  END LC_CTYPE
> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
> --- a/localedata/locales/ar_EG	2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/ar_EG	2018-07-17 17:55:48.000000000 +0000
> @@ -44,6 +44,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
> --- a/localedata/locales/be_BY	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/be_BY	2018-07-17 17:55:48.000000000 +0000
> @@ -69,6 +69,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
> --- a/localedata/locales/bem_ZM	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bem_ZM	2018-07-17 17:55:48.000000000 +0000
> @@ -42,6 +42,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
> --- a/localedata/locales/ber_DZ	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ber_DZ	2018-07-17 17:55:48.000000000 +0000
> @@ -166,6 +166,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
> --- a/localedata/locales/ber_MA	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ber_MA	2018-07-17 17:55:48.000000000 +0000
> @@ -86,6 +86,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
> --- a/localedata/locales/bg_BG	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bg_BG	2018-07-17 17:55:48.000000000 +0000
> @@ -49,6 +49,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
> --- a/localedata/locales/bi_VU	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bi_VU	2018-07-17 17:55:48.000000000 +0000
> @@ -39,6 +39,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
> --- a/localedata/locales/bn_BD	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bn_BD	2018-07-17 17:55:48.000000000 +0000
> @@ -63,6 +63,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
> --- a/localedata/locales/bo_CN	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bo_CN	2018-07-17 17:55:48.000000000 +0000
> @@ -43,6 +43,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
> --- a/localedata/locales/ca_ES	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ca_ES	2018-07-17 17:55:48.000000000 +0000
> @@ -72,6 +72,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
> --- a/localedata/locales/ce_RU	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ce_RU	2018-07-17 17:55:48.000000000 +0000
> @@ -39,6 +39,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
> --- a/localedata/locales/cs_CZ	2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/cs_CZ	2018-07-17 17:55:48.000000000 +0000
> @@ -2311,6 +2311,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
> --- a/localedata/locales/cv_RU	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/cv_RU	2018-07-17 17:55:48.000000000 +0000
> @@ -109,6 +109,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
> --- a/localedata/locales/cy_GB	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/cy_GB	2018-07-17 17:55:48.000000000 +0000
> @@ -69,6 +69,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
> --- a/localedata/locales/da_DK	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/da_DK	2018-07-17 17:55:48.000000000 +0000
> @@ -167,6 +167,7 @@
>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
> --- a/localedata/locales/de_DE	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/de_DE	2018-07-17 17:55:48.000000000 +0000
> @@ -78,6 +78,7 @@
>  % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
>  <U201F> <U00AB>;<U0022>
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
> --- a/localedata/locales/dv_MV	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/dv_MV	2018-07-17 17:55:48.000000000 +0000
> @@ -52,6 +52,7 @@
>  include "translit_combining";""
> 
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
> --- a/localedata/locales/dz_BT	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/dz_BT	2018-07-17 17:55:48.000000000 +0000
> @@ -60,6 +60,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
> --- a/localedata/locales/el_GR	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/el_GR	2018-07-17 17:55:48.000000000 +0000
> @@ -59,6 +59,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
> --- a/localedata/locales/en_GB	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/en_GB	2018-07-17 17:55:48.000000000 +0000
> @@ -55,6 +55,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
> --- a/localedata/locales/en_NG	2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/en_NG	2018-07-17 17:55:48.000000000 +0000
> @@ -50,6 +50,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
> --- a/localedata/locales/en_ZM	2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/en_ZM	2018-07-17 17:55:48.000000000 +0000
> @@ -42,6 +42,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
> --- a/localedata/locales/es_CU	2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/es_CU	2018-07-17 17:55:48.000000000 +0000
> @@ -59,6 +59,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
> --- a/localedata/locales/es_ES	2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/es_ES	2018-07-17 17:55:49.000000000 +0000
> @@ -73,6 +73,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
> --- a/localedata/locales/et_EE	2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/et_EE	2018-07-17 17:55:49.000000000 +0000
> @@ -109,6 +109,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
> --- a/localedata/locales/fa_IR	2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/fa_IR	2018-07-17 17:55:49.000000000 +0000
> @@ -79,6 +79,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
> --- a/localedata/locales/ff_SN	2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/ff_SN	2018-07-17 17:55:49.000000000 +0000
> @@ -42,6 +42,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
> --- a/localedata/locales/fi_FI	2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/fi_FI	2018-07-17 17:55:49.000000000 +0000
> @@ -137,6 +137,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
> --- a/localedata/locales/fr_FR	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/fr_FR	2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
>  % In France, accents are simply omitted if they cannot be represented.
>  include "translit_combining";""
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
> --- a/localedata/locales/ga_IE	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/ga_IE	2018-07-17 17:55:49.000000000 +0000
> @@ -54,6 +54,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
> --- a/localedata/locales/gd_GB	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/gd_GB	2018-07-17 17:55:49.000000000 +0000
> @@ -47,6 +47,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
> --- a/localedata/locales/gu_IN	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/gu_IN	2018-07-17 17:55:49.000000000 +0000
> @@ -62,6 +62,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
> --- a/localedata/locales/gv_GB	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/gv_GB	2018-07-17 17:55:49.000000000 +0000
> @@ -57,6 +57,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
> --- a/localedata/locales/he_IL	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/he_IL	2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
> --- a/localedata/locales/hi_IN	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hi_IN	2018-07-17 17:55:49.000000000 +0000
> @@ -61,6 +61,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
> --- a/localedata/locales/hif_FJ	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hif_FJ	2018-07-17 17:55:49.000000000 +0000
> @@ -37,6 +37,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
> --- a/localedata/locales/hr_HR	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hr_HR	2018-07-17 17:55:49.000000000 +0000
> @@ -153,6 +153,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
> --- a/localedata/locales/ht_HT	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/ht_HT	2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
> --- a/localedata/locales/hu_HU	2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hu_HU	2018-07-17 17:55:49.000000000 +0000
> @@ -478,6 +478,7 @@
>  <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
>  <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
> --- a/localedata/locales/hy_AM	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/hy_AM	2018-07-17 17:55:49.000000000 +0000
> @@ -77,6 +77,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
> --- a/localedata/locales/id_ID	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/id_ID	2018-07-17 17:55:49.000000000 +0000
> @@ -55,6 +55,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
> --- a/localedata/locales/is_IS	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/is_IS	2018-07-17 17:55:49.000000000 +0000
> @@ -2161,6 +2161,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
> --- a/localedata/locales/it_IT	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/it_IT	2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
> --- a/localedata/locales/ja_JP	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/ja_JP	2018-07-17 17:55:49.000000000 +0000
> @@ -1682,6 +1682,7 @@
>  include "translit_combining";""
>  include "translit_cjk_variants";""
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
> --- a/localedata/locales/kk_KZ	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/kk_KZ	2018-07-17 17:55:50.000000000 +0000
> @@ -158,6 +158,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
> --- a/localedata/locales/km_KH	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/km_KH	2018-07-17 17:55:50.000000000 +0000
> @@ -873,6 +873,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
> --- a/localedata/locales/kn_IN	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/kn_IN	2018-07-17 17:55:50.000000000 +0000
> @@ -63,6 +63,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
> --- a/localedata/locales/ko_KR	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/ko_KR	2018-07-17 17:55:50.000000000 +0000
> @@ -6099,6 +6099,7 @@
>  include "translit_combining";""
>  include "translit_hangul";""
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
> --- a/localedata/locales/ks_IN	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/ks_IN	2018-07-17 17:55:50.000000000 +0000
> @@ -46,6 +46,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
> --- a/localedata/locales/kw_GB	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/kw_GB	2018-07-17 17:55:50.000000000 +0000
> @@ -58,6 +58,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
> --- a/localedata/locales/lb_LU	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/lb_LU	2018-07-17 17:55:50.000000000 +0000
> @@ -78,6 +78,7 @@
>  % LATIN SMALL LETTER E WITH CIRCUMFLEX
>  <U00EA> "<U0065><U005E>"
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
> --- a/localedata/locales/lg_UG	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/lg_UG	2018-07-17 17:55:50.000000000 +0000
> @@ -57,6 +57,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
> --- a/localedata/locales/lij_IT	2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/lij_IT	2018-07-17 17:55:50.000000000 +0000
> @@ -47,6 +47,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
> --- a/localedata/locales/ln_CD	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ln_CD	2018-07-17 17:55:50.000000000 +0000
> @@ -39,6 +39,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
> --- a/localedata/locales/lo_LA	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/lo_LA	2018-07-17 17:55:50.000000000 +0000
> @@ -51,6 +51,7 @@
>  copy "i18n"
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
> --- a/localedata/locales/lt_LT	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/lt_LT	2018-07-17 17:55:50.000000000 +0000
> @@ -77,6 +77,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
> --- a/localedata/locales/lv_LV	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/lv_LV	2018-07-17 17:55:50.000000000 +0000
> @@ -2122,6 +2122,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
> --- a/localedata/locales/mg_MG	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mg_MG	2018-07-17 17:55:50.000000000 +0000
> @@ -55,6 +55,7 @@
>  % Accents are simply omitted if they cannot be represented.
>  include "translit_combining";""
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
> --- a/localedata/locales/mhr_RU	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mhr_RU	2018-07-17 17:55:50.000000000 +0000
> @@ -59,6 +59,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
> --- a/localedata/locales/mk_MK	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mk_MK	2018-07-17 17:55:50.000000000 +0000
> @@ -49,6 +49,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
> --- a/localedata/locales/ml_IN	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ml_IN	2018-07-17 17:55:50.000000000 +0000
> @@ -60,6 +60,7 @@
> 
>  translit_start
>  include     "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
>  %
> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
> --- a/localedata/locales/ms_MY	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ms_MY	2018-07-17 17:55:50.000000000 +0000
> @@ -45,6 +45,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
> --- a/localedata/locales/mt_MT	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mt_MT	2018-07-17 17:55:50.000000000 +0000
> @@ -47,6 +47,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
> --- a/localedata/locales/nan_TW@latin	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nan_TW@latin	2018-07-17 17:55:50.000000000 +0000
> @@ -53,6 +53,7 @@
>  % accents are simply omitted if they cannot be represented.
>  include "translit_combining";""
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
> --- a/localedata/locales/nb_NO	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nb_NO	2018-07-17 17:55:50.000000000 +0000
> @@ -154,6 +154,7 @@
>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
> --- a/localedata/locales/ne_NP	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ne_NP	2018-07-17 17:55:50.000000000 +0000
> @@ -43,6 +43,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
> --- a/localedata/locales/nhn_MX	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nhn_MX	2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
> --- a/localedata/locales/niu_NU	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/niu_NU	2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
> --- a/localedata/locales/niu_NZ	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/niu_NZ	2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
> --- a/localedata/locales/nl_NL	2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nl_NL	2018-07-17 17:55:51.000000000 +0000
> @@ -57,6 +57,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
> --- a/localedata/locales/nr_ZA	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/nr_ZA	2018-07-17 17:55:51.000000000 +0000
> @@ -66,6 +66,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
> --- a/localedata/locales/oc_FR	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/oc_FR	2018-07-17 17:55:51.000000000 +0000
> @@ -62,6 +62,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
> --- a/localedata/locales/om_KE	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/om_KE	2018-07-17 17:55:51.000000000 +0000
> @@ -140,6 +140,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
> --- a/localedata/locales/or_IN	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/or_IN	2018-07-17 17:55:51.000000000 +0000
> @@ -62,6 +62,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
> --- a/localedata/locales/os_RU	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/os_RU	2018-07-17 17:55:51.000000000 +0000
> @@ -70,6 +70,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
> --- a/localedata/locales/pa_IN	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pa_IN	2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
> 
>  translit_start
>  include     "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
> --- a/localedata/locales/pa_PK	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pa_PK	2018-07-17 17:55:51.000000000 +0000
> @@ -58,6 +58,7 @@
>  % Farsi yeh -> yeh
>  <U06CC> "<U064A>"
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
> --- a/localedata/locales/pl_PL	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pl_PL	2018-07-17 17:55:51.000000000 +0000
> @@ -142,6 +142,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
> --- a/localedata/locales/pt_PT	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pt_PT	2018-07-17 17:55:51.000000000 +0000
> @@ -59,6 +59,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
> --- a/localedata/locales/quz_PE	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/quz_PE	2018-07-17 17:55:51.000000000 +0000
> @@ -57,6 +57,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
> --- a/localedata/locales/ro_RO	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/ro_RO	2018-07-17 17:55:51.000000000 +0000
> @@ -144,6 +144,7 @@
>  <U0162> "<U021A>";"<U0054>"
>  <U0163> "<U021B>";"<U0074>"
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
> --- a/localedata/locales/ru_RU	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/ru_RU	2018-07-17 17:55:51.000000000 +0000
> @@ -74,6 +74,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
> --- a/localedata/locales/rw_RW	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/rw_RW	2018-07-17 17:55:51.000000000 +0000
> @@ -45,6 +45,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
> --- a/localedata/locales/sa_IN	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sa_IN	2018-07-17 17:55:51.000000000 +0000
> @@ -44,6 +44,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
> --- a/localedata/locales/sd_IN	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sd_IN	2018-07-17 17:55:51.000000000 +0000
> @@ -46,6 +46,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
> --- a/localedata/locales/sd_IN@devanagari	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sd_IN@devanagari	2018-07-17 17:55:51.000000000 +0000
> @@ -44,6 +44,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
> --- a/localedata/locales/sd_PK	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sd_PK	2018-07-17 17:55:51.000000000 +0000
> @@ -39,6 +39,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
> --- a/localedata/locales/se_NO	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/se_NO	2018-07-17 17:55:51.000000000 +0000
> @@ -205,6 +205,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
> --- a/localedata/locales/sgs_LT	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sgs_LT	2018-07-17 17:55:52.000000000 +0000
> @@ -59,6 +59,7 @@
>  copy "i18n"
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
> --- a/localedata/locales/si_LK	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/si_LK	2018-07-17 17:55:52.000000000 +0000
> @@ -45,6 +45,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
> --- a/localedata/locales/sk_SK	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sk_SK	2018-07-17 17:55:52.000000000 +0000
> @@ -68,6 +68,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
> --- a/localedata/locales/sl_SI	2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sl_SI	2018-07-17 17:55:52.000000000 +0000
> @@ -91,6 +91,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
> --- a/localedata/locales/sm_WS	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sm_WS	2018-07-17 17:55:52.000000000 +0000
> @@ -37,6 +37,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
> --- a/localedata/locales/so_SO	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/so_SO	2018-07-17 17:55:52.000000000 +0000
> @@ -70,6 +70,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
> --- a/localedata/locales/sq_AL	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sq_AL	2018-07-17 17:55:52.000000000 +0000
> @@ -45,6 +45,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
> --- a/localedata/locales/ss_ZA	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/ss_ZA	2018-07-17 17:55:52.000000000 +0000
> @@ -68,6 +68,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
> --- a/localedata/locales/st_ZA	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/st_ZA	2018-07-17 17:55:52.000000000 +0000
> @@ -64,6 +64,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
> --- a/localedata/locales/sv_SE	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sv_SE	2018-07-17 17:55:52.000000000 +0000
> @@ -139,6 +139,7 @@
>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
> --- a/localedata/locales/sw_KE	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sw_KE	2018-07-17 17:55:52.000000000 +0000
> @@ -44,6 +44,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
> --- a/localedata/locales/ta_IN	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/ta_IN	2018-07-17 17:55:52.000000000 +0000
> @@ -63,6 +63,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
> --- a/localedata/locales/te_IN	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/te_IN	2018-07-17 17:55:52.000000000 +0000
> @@ -63,6 +63,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
> --- a/localedata/locales/th_TH	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/th_TH	2018-07-17 17:55:52.000000000 +0000
> @@ -58,6 +58,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
> --- a/localedata/locales/ti_ET	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/ti_ET	2018-07-17 17:55:52.000000000 +0000
> @@ -866,6 +866,7 @@
>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
> 
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  %
>  END LC_CTYPE
> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
> --- a/localedata/locales/tn_ZA	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/tn_ZA	2018-07-17 17:55:52.000000000 +0000
> @@ -69,6 +69,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
> --- a/localedata/locales/to_TO	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/to_TO	2018-07-17 17:55:52.000000000 +0000
> @@ -36,6 +36,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
> --- a/localedata/locales/tpi_PG	2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/tpi_PG	2018-07-17 17:55:52.000000000 +0000
> @@ -37,6 +37,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
> --- a/localedata/locales/tr_TR	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/tr_TR	2018-07-17 17:55:52.000000000 +0000
> @@ -2430,6 +2430,7 @@
> 
>  % TURKISH LIRA SIGN
>  <U20BA> "<U0054><U004C>"
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
> --- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
> +++ b/localedata/locales/translit_cyrillic	2018-07-17 17:55:52.000000000 +0000
> @@ -0,0 +1,151 @@
> +escape_char /
> +comment_char %
> +
> +% Transliteration table that converts Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> +% Generated from UnicodeData.txt with
> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
> +% Up to three characters are required to do a reversible transliteration.
> +
> +LC_CTYPE
> +
> +translit_start
> +
> +
> +% CYRILLIC CAPITAL LETTER IO
> +<U0401> "<U0059><U004F>";<U0059>
> +% CYRILLIC CAPITAL LETTER A
> +<U0410> <U0041>
> +% CYRILLIC CAPITAL LETTER BE
> +<U0411> <U0042>
> +% CYRILLIC CAPITAL LETTER VE
> +<U0412> <U0056>
> +% CYRILLIC CAPITAL LETTER GHE
> +<U0413> <U0047>
> +% CYRILLIC CAPITAL LETTER DE
> +<U0414> <U0044>
> +% CYRILLIC CAPITAL LETTER IE
> +<U0415> <U0045>
> +% CYRILLIC CAPITAL LETTER ZHE
> +<U0416> "<U005A><U0048>";<U005A>
> +% CYRILLIC CAPITAL LETTER ZE
> +<U0417> <U005A>
> +% CYRILLIC CAPITAL LETTER I
> +<U0418> <U0049>
> +% CYRILLIC CAPITAL LETTER SHORT I
> +<U0419> <U004A>
> +% CYRILLIC CAPITAL LETTER KA
> +<U041A> <U004B>
> +% CYRILLIC CAPITAL LETTER EL
> +<U041B> <U004C>
> +% CYRILLIC CAPITAL LETTER EM
> +<U041C> <U004D>
> +% CYRILLIC CAPITAL LETTER EN
> +<U041D> <U004E>
> +% CYRILLIC CAPITAL LETTER O
> +<U041E> <U004F>
> +% CYRILLIC CAPITAL LETTER PE
> +<U041F> <U0050>
> +% CYRILLIC CAPITAL LETTER ER
> +<U0420> <U0052>
> +% CYRILLIC CAPITAL LETTER ES
> +<U0421> <U0053>
> +% CYRILLIC CAPITAL LETTER TE
> +<U0422> <U0054>
> +% CYRILLIC CAPITAL LETTER U
> +<U0423> <U0055>
> +% CYRILLIC CAPITAL LETTER EF
> +<U0424> <U0046>
> +% CYRILLIC CAPITAL LETTER HA
> +<U0425> <U0058>
> +% CYRILLIC CAPITAL LETTER TSE
> +<U0426> "<U0043><U005A>";<U0043>
> +% CYRILLIC CAPITAL LETTER CHE
> +<U0427> "<U0043><U0048>";<U0043>
> +% CYRILLIC CAPITAL LETTER SHA
> +<U0428> "<U0053><U0048>";<U0053>
> +% CYRILLIC CAPITAL LETTER SHCHA
> +<U0429> "<U0053><U0048><U0048>";<U0053>
> +% CYRILLIC CAPITAL LETTER HARD SIGN
> +<U042A> "<U0060><U0060>";<U0060>
> +% CYRILLIC CAPITAL LETTER YERU
> +<U042B> "<U0059><U0027>";<U0059>
> +% CYRILLIC CAPITAL LETTER SOFT SIGN
> +<U042C> <U0060>
> +% CYRILLIC CAPITAL LETTER E
> +<U042D> "<U0045><U0060>";<U0045>
> +% CYRILLIC CAPITAL LETTER YU
> +<U042E> "<U0059><U0055>";<U0059>
> +% CYRILLIC CAPITAL LETTER YA
> +<U042F> "<U0059><U0041>";<U0059>
> +% CYRILLIC SMALL LETTER A
> +<U0430> <U0061>
> +% CYRILLIC SMALL LETTER BE
> +<U0431> <U0062>
> +% CYRILLIC SMALL LETTER VE
> +<U0432> <U0076>
> +% CYRILLIC SMALL LETTER GHE
> +<U0433> <U0067>
> +% CYRILLIC SMALL LETTER DE
> +<U0434> <U0064>
> +% CYRILLIC SMALL LETTER IE
> +<U0435> <U0065>
> +% CYRILLIC SMALL LETTER ZHE
> +<U0436> "<U007A><U0068>";<U007A>
> +% CYRILLIC SMALL LETTER ZE
> +<U0437> <U007A>
> +% CYRILLIC SMALL LETTER I
> +<U0438> <U0069>
> +% CYRILLIC SMALL LETTER SHORT I
> +<U0439> <U006A>
> +% CYRILLIC SMALL LETTER KA
> +<U043A> <U006B>
> +% CYRILLIC SMALL LETTER EL
> +<U043B> <U006C>
> +% CYRILLIC SMALL LETTER EM
> +<U043C> <U006D>
> +% CYRILLIC SMALL LETTER EN
> +<U043D> <U006E>
> +% CYRILLIC SMALL LETTER O
> +<U043E> <U006F>
> +% CYRILLIC SMALL LETTER PE
> +<U043F> <U0070>
> +% CYRILLIC SMALL LETTER ER
> +<U0440> <U0072>
> +% CYRILLIC SMALL LETTER ES
> +<U0441> <U0073>
> +% CYRILLIC SMALL LETTER TE
> +<U0442> <U0074>
> +% CYRILLIC SMALL LETTER U
> +<U0443> <U0075>
> +% CYRILLIC SMALL LETTER EF
> +<U0444> <U0066>
> +% CYRILLIC SMALL LETTER HA
> +<U0445> <U0078>
> +% CYRILLIC SMALL LETTER TSE
> +<U0446> "<U0063><U007A>";<U0063>
> +% CYRILLIC SMALL LETTER CHE
> +<U0447> "<U0063><U0068>";<U0063>
> +% CYRILLIC SMALL LETTER SHA
> +<U0448> "<U0073><U0068>";<U0073>
> +% CYRILLIC SMALL LETTER SHCHA
> +<U0449> "<U0073><U0068><U0068>";<U0073>
> +% CYRILLIC SMALL LETTER HARD SIGN
> +<U044A> "<U0060><U0060>";<U0060>
> +% CYRILLIC SMALL LETTER YERU
> +<U044B> "<U0079><U0027>";<U0079>
> +% CYRILLIC SMALL LETTER SOFT SIGN
> +<U044C> <U0060>
> +% CYRILLIC SMALL LETTER E
> +<U044D> "<U0065><U0060>";<U0065>
> +% CYRILLIC SMALL LETTER YU
> +<U044E> "<U0079><U0075>";<U0079>
> +% CYRILLIC SMALL LETTER YA
> +<U044F> "<U0079><U0061>";<U0079>
> +% CYRILLIC SMALL LETTER IO
> +<U0451> "<U0079><U006F>";<U0079>
> +
> +
> +translit_end
> +
> +END LC_CTYPE
> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
> --- a/localedata/locales/ts_ZA	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ts_ZA	2018-07-17 17:55:52.000000000 +0000
> @@ -64,6 +64,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
> --- a/localedata/locales/unm_US	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/unm_US	2018-07-17 17:55:52.000000000 +0000
> @@ -48,6 +48,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
> --- a/localedata/locales/ur_IN	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ur_IN	2018-07-17 17:55:53.000000000 +0000
> @@ -46,6 +46,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
> --- a/localedata/locales/ur_PK	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ur_PK	2018-07-17 17:55:53.000000000 +0000
> @@ -58,6 +58,7 @@
>  % Farsi yeh -> yeh
>  <U06CC> "<U064A>"
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
> --- a/localedata/locales/ve_ZA	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ve_ZA	2018-07-17 17:55:53.000000000 +0000
> @@ -67,6 +67,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
> --- a/localedata/locales/vi_VN	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/vi_VN	2018-07-17 17:55:53.000000000 +0000
> @@ -58,6 +58,7 @@
>  % dong sign -> d// -> dd
>  <U20AB> "<U0111>";"<U0064><U0064>"
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
> --- a/localedata/locales/wa_BE	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/wa_BE	2018-07-17 17:55:53.000000000 +0000
> @@ -69,6 +69,7 @@
>  <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>  <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
> 
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
> --- a/localedata/locales/wo_SN	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/wo_SN	2018-07-17 17:55:53.000000000 +0000
> @@ -55,6 +55,7 @@
>  % Accents are simply omitted if they cannot be represented.
>  include "translit_combining";""
> 
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
> --- a/localedata/locales/xh_ZA	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/xh_ZA	2018-07-17 17:55:53.000000000 +0000
> @@ -66,6 +66,7 @@
> 
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
> --- a/localedata/locales/yi_US	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/yi_US	2018-07-17 17:55:53.000000000 +0000
> @@ -73,6 +73,7 @@
>  <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>  <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>  <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
> +include "translit_cyrillic";""
>  translit_end
> 
>  END LC_CTYPE
> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
> --- a/localedata/locales/zh_CN	2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/zh_CN	2018-07-17 17:55:53.000000000 +0000
> @@ -58,6 +58,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
> 
>  class	"hanzi"; /
> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
> --- a/localedata/locales/zu_ZA	2018-07-17 17:49:22.000000000 +0000
> +++ b/localedata/locales/zu_ZA	2018-07-17 17:55:53.000000000 +0000
> @@ -70,6 +70,7 @@
> 
>  translit_start
>  include  "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
> 
> 
> 
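
To make the table semantics concrete, here is a minimal sketch (not glibc's implementation) of how entries such as `<U0436> "<U007A><U0068>";<U007A>` can be read: the alternatives after the source character are separated by semicolons and, as I understand the locale source format, are tried in order until one fits the target charset. The names `TABLE_SOURCE`, `parse_table` and `transliterate` are hypothetical, the entries are a small subset copied from the patch, and the `?` fallback only mimics the behaviour reported in the bug.

```python
import re

# A few entries copied verbatim from the translit_cyrillic table in the patch.
TABLE_SOURCE = '''
% CYRILLIC CAPITAL LETTER ES
<U0421> <U0053>
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
% CYRILLIC SMALL LETTER ZHE
<U0436> "<U007A><U0068>";<U007A>
% CYRILLIC SMALL LETTER SHA
<U0448> "<U0073><U0068>";<U0073>
% CYRILLIC SMALL LETTER SHCHA
<U0449> "<U0073><U0068><U0068>";<U0073>
% CYRILLIC SMALL LETTER HARD SIGN
<U044A> "<U0060><U0060>";<U0060>
% CYRILLIC SMALL LETTER SOFT SIGN
<U044C> <U0060>
% CYRILLIC SMALL LETTER IO
<U0451> "<U0079><U006F>";<U0079>
'''

CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(symbols):
    """Turn a run of <Uxxxx> symbols into the string it denotes."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT.findall(symbols))


def parse_table(source):
    """Map each source character to its ordered list of replacements."""
    table = {}
    for line in source.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        key, _, rest = line.partition(" ")
        table[decode(key)] = [decode(alt) for alt in rest.split(";")]
    return table


def transliterate(text, table, encoding="ascii"):
    """Replace each character by the first candidate the target can encode."""
    out = []
    for ch in text:
        for cand in [ch] + table.get(ch, []):
            try:
                cand.encode(encoding)
            except UnicodeEncodeError:
                continue
            out.append(cand)
            break
        else:
            out.append("?")  # nothing fits: same symptom as in the bug report
    return "".join(out)


if __name__ == "__main__":
    table = parse_table(TABLE_SOURCE)
    # First two words of the test phrase, written with escapes.
    print(transliterate("\u0421\u044a\u0435\u0448\u044c \u0435\u0449\u0451", table))
```

With an ASCII target the first alternative always fits, so the shorter second alternatives in the table only matter for charsets that cannot represent the longer form.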



* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-03  8:26     ` Egor Kobylkin
@ 2018-10-03  9:19       ` Keld Simonsen
  2018-10-03  9:32         ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Keld Simonsen @ 2018-10-03  9:19 UTC (permalink / raw)
  To: Egor Kobylkin
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi

Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for German, English and Danish, for example,
and there is also an ISO standard for it.

But do go forward with fixing this bug.

Best regards
Keld

On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
> Ping.
> 
> Absent of feedback I am wondering if anything could be missing in this
> patch from the maintainers standpoint. More than two months have passed
> since the original submission.
> 
> If I can be of assistance, please do not hesitate to contact me,
> Egor Kobylkin
> 
> On 06.08.2018 21:00, Egor Kobylkin wrote:
> > Dear locale maintainers,
> > 
> > fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> > 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> > 
> > add Cyrillic transliteration table translit_cyrillic file
> > 
> > https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
> > 
> > to localedata/locales/ and include it in all your locales going forward.
> > 
> > Patch included inline below.
> > 
> > This is a re-submission for the consideration for 2.29 on a request from
> > Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
> > 
> > From this patch I have excluded locales that already mention cyrillic or
> > have a transliteration table for it:
> > az_AZ
> > iso14651_t1_common
> > ky_KG
> > mn_MN
> > sr_RS
> > tg_TJ
> > tk_TM
> > tt_RU
> > uk_UA
> > uz_UZ
> > uz_UZ@cyrillic
> > 
> > Their maintainers are requested to make an explicit decision on how and
> > whether at all to include this patch.
> > 
> > 
> > 
> > Current bug effect:
> > 
> > The glibc wiki explicitly lists this use case as the test example
> > 
> > https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
> > 
> > LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
> > translit-test-input.txt
> > 
> > currently it fails on Cyrillic texts in most locales including ru_RU [1]
> > [8] [9]:
> > 
> > LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
> > translit-test-input.txt |grep CYRILLIC
> > 
> > CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> > 
> >  - It produces a string of question marks and spaces.
> > 
> > This is what it should produce and it does so after the patch applied:
> > 
> > CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> > chayu.
> > 
> > 
> > Root problem and the fix:
> > 
> > The root problem is the missing transliteration table that I am
> > supplying here. Furthermore it has to be referenced/included into the
> > active locale at the compilation time to be used by iconv.
> > 
> > 
> > 
> > COMMIT MESSAGE:
> > This translit_cyrillic table enables conversion (e.g. with iconv) from a
> > UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
> > 
> > While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
> > a transliteration has only ASCII codes but still can be read by a native
> > speaker. Among other things it is useful for processing the Cyrillic
> > texts and filenames by programs or on systems that are not specifically
> > prepared to work with Cyrillic, don't have corresponding fonts installed
> > or can't handle UTF-8.
> > 
> > The transliteration table itself is attached as a file translit_cyrillic
> > [7]. Its content (mapping) is based on GOST 7.79-2000 official source
> > (Federal Agency on Technical Regulating and Metrology Of Russian
> > Federation [2]). Technically an independent but identical source [3] was
> > used and prepared in a spreadsheet [6].
> > 
> > The documentation suggests that the transliteration tables inclusion is
> > done by adding *include "translit_cyrillic";""* string into LC_CTYPE
> > translit_start section
> > http://man7.org/linux/man-pages/man5/locale.5.html [5]
> > Practically I have searched for all locales that have a
> > translit_start/end stanza and generated a patch for them.
> > 
> > The Cyrillic transliteration of e.g. Russian text may have already
> > worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
> > have their transliteration tables included inline.
> > However it would not be the standard Russian Cyrillic transliteration as
> > described above.
> > I am excluding these locales from this proposed patch. I have written
> > directly to locale maintainer emails listed in the files. Volodymyr
> > Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> > Данило Шеган <danilo@gnome.org>  (sr_YU, sr_CS) have confirmed the
> > exclusion.
> > 
> > Links:
> > 
> > [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> > [2] GOST 7.79-2000 official source
> > http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> > available in low quality gif format)
> > [3] http://transliteration.ru/gost-7-79-2000/ and
> > http://www.yfermer.ru/specifications/285821.html
> > [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> > https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> > [5] http://man7.org/linux/man-pages/man5/locale.5.html
> > [6] Spreadsheet for generating translit_cyrillic
> > https://sourceware.org/bugzilla/attachment.cgi?id=8590
> > [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
> > [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> > [9] translit-test-input.txt
> > https://sourceware.org/bugzilla/attachment.cgi?id=8618
> > 
> > Best regards,
> > Egor Kobylkin
> > 
> > ---
> > 2018-07-17  Egor Kobylkin  <egor@kobylkin.com>
> > 
> > 	[BZ #2872]
> > 	* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
> > table from Cyrillic to Latin.
> > 	* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
> > section.
> > 	* locales/aa_DJ: likewise
> > 	* locales/af_ZA: likewise
> > 	* locales/ak_GH: likewise
> > 	* locales/am_ET: likewise
> > 	* locales/ar_EG: likewise
> > 	* locales/be_BY: likewise
> > 	* locales/bem_ZM: likewise
> > 	* locales/ber_DZ: likewise
> > 	* locales/ber_MA: likewise
> > 	* locales/bg_BG: likewise
> > 	* locales/bi_VU: likewise
> > 	* locales/bn_BD: likewise
> > 	* locales/bo_CN: likewise
> > 	* locales/ca_ES: likewise
> > 	* locales/ce_RU: likewise
> > 	* locales/cs_CZ: likewise
> > 	* locales/cv_RU: likewise
> > 	* locales/cy_GB: likewise
> > 	* locales/da_DK: likewise
> > 	* locales/de_DE: likewise
> > 	* locales/dv_MV: likewise
> > 	* locales/dz_BT: likewise
> > 	* locales/el_GR: likewise
> > 	* locales/en_GB: likewise
> > 	* locales/en_NG: likewise
> > 	* locales/en_ZM: likewise
> > 	* locales/es_CU: likewise
> > 	* locales/es_ES: likewise
> > 	* locales/et_EE: likewise
> > 	* locales/fa_IR: likewise
> > 	* locales/ff_SN: likewise
> > 	* locales/fi_FI: likewise
> > 	* locales/fr_FR: likewise
> > 	* locales/ga_IE: likewise
> > 	* locales/gd_GB: likewise
> > 	* locales/gu_IN: likewise
> > 	* locales/gv_GB: likewise
> > 	* locales/he_IL: likewise
> > 	* locales/hi_IN: likewise
> > 	* locales/hif_FJ: likewise
> > 	* locales/hr_HR: likewise
> > 	* locales/ht_HT: likewise
> > 	* locales/hu_HU: likewise
> > 	* locales/hy_AM: likewise
> > 	* locales/id_ID: likewise
> > 	* locales/is_IS: likewise
> > 	* locales/it_IT: likewise
> > 	* locales/ja_JP: likewise
> > 	* locales/kk_KZ: likewise
> > 	* locales/km_KH: likewise
> > 	* locales/kn_IN: likewise
> > 	* locales/ko_KR: likewise
> > 	* locales/ks_IN: likewise
> > 	* locales/kw_GB: likewise
> > 	* locales/lb_LU: likewise
> > 	* locales/lg_UG: likewise
> > 	* locales/lij_IT: likewise
> > 	* locales/ln_CD: likewise
> > 	* locales/lo_LA: likewise
> > 	* locales/lt_LT: likewise
> > 	* locales/lv_LV: likewise
> > 	* locales/mg_MG: likewise
> > 	* locales/mhr_RU: likewise
> > 	* locales/mk_MK: likewise
> > 	* locales/ml_IN: likewise
> > 	* locales/ms_MY: likewise
> > 	* locales/mt_MT: likewise
> > 	* locales/nan_TW@latin: likewise
> > 	* locales/nb_NO: likewise
> > 	* locales/ne_NP: likewise
> > 	* locales/nhn_MX: likewise
> > 	* locales/niu_NU: likewise
> > 	* locales/niu_NZ: likewise
> > 	* locales/nl_NL: likewise
> > 	* locales/nr_ZA: likewise
> > 	* locales/oc_FR: likewise
> > 	* locales/om_KE: likewise
> > 	* locales/or_IN: likewise
> > 	* locales/os_RU: likewise
> > 	* locales/pa_IN: likewise
> > 	* locales/pa_PK: likewise
> > 	* locales/pl_PL: likewise
> > 	* locales/pt_PT: likewise
> > 	* locales/quz_PE: likewise
> > 	* locales/ro_RO: likewise
> > 	* locales/ru_RU: likewise
> > 	* locales/rw_RW: likewise
> > 	* locales/sa_IN: likewise
> > 	* locales/sd_IN: likewise
> > 	* locales/sd_IN@devanagari: likewise
> > 	* locales/sd_PK: likewise
> > 	* locales/se_NO: likewise
> > 	* locales/sgs_LT: likewise
> > 	* locales/si_LK: likewise
> > 	* locales/sk_SK: likewise
> > 	* locales/sl_SI: likewise
> > 	* locales/sm_WS: likewise
> > 	* locales/so_SO: likewise
> > 	* locales/sq_AL: likewise
> > 	* locales/ss_ZA: likewise
> > 	* locales/st_ZA: likewise
> > 	* locales/sv_SE: likewise
> > 	* locales/sw_KE: likewise
> > 	* locales/ta_IN: likewise
> > 	* locales/te_IN: likewise
> > 	* locales/th_TH: likewise
> > 	* locales/ti_ET: likewise
> > 	* locales/tn_ZA: likewise
> > 	* locales/to_TO: likewise
> > 	* locales/tpi_PG: likewise
> > 	* locales/tr_TR: likewise
> > 	* locales/ts_ZA: likewise
> > 	* locales/unm_US: likewise
> > 	* locales/ur_IN: likewise
> > 	* locales/ur_PK: likewise
> > 	* locales/ve_ZA: likewise
> > 	* locales/vi_VN: likewise
> > 	* locales/wa_BE: likewise
> > 	* locales/wo_SN: likewise
> > 	* locales/xh_ZA: likewise
> > 	* locales/yi_US: likewise
> > 	* locales/zh_CN: likewise
> > 	* locales/zu_ZA: likewise
> > 
> > 
> > diff -uNr a/localedata/locales/C b/localedata/locales/C
> > --- a/localedata/locales/C	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/C	2018-07-17 17:55:47.000000000 +0000
> > @@ -2292,6 +2292,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
> > --- a/localedata/locales/aa_DJ	2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/aa_DJ	2018-07-17 17:55:47.000000000 +0000
> > @@ -70,6 +70,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
> > --- a/localedata/locales/af_ZA	2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/af_ZA	2018-07-17 17:55:47.000000000 +0000
> > @@ -72,6 +72,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
> > --- a/localedata/locales/ak_GH	2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/ak_GH	2018-07-17 17:55:47.000000000 +0000
> > @@ -56,6 +56,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
> > --- a/localedata/locales/am_ET	2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/am_ET	2018-07-17 17:55:47.000000000 +0000
> > @@ -1396,6 +1396,7 @@
> >  <U137A>    <U0060><U0039><U0030>
> >  <U137B>    <U0060><U0031><U0030><U0030>
> >  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
> > +include "translit_cyrillic";""
> >  translit_end
> >  %
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
> > --- a/localedata/locales/ar_EG	2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/ar_EG	2018-07-17 17:55:48.000000000 +0000
> > @@ -44,6 +44,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
> > --- a/localedata/locales/be_BY	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/be_BY	2018-07-17 17:55:48.000000000 +0000
> > @@ -69,6 +69,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
> > --- a/localedata/locales/bem_ZM	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bem_ZM	2018-07-17 17:55:48.000000000 +0000
> > @@ -42,6 +42,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
> > --- a/localedata/locales/ber_DZ	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ber_DZ	2018-07-17 17:55:48.000000000 +0000
> > @@ -166,6 +166,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
> > --- a/localedata/locales/ber_MA	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ber_MA	2018-07-17 17:55:48.000000000 +0000
> > @@ -86,6 +86,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
> > --- a/localedata/locales/bg_BG	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bg_BG	2018-07-17 17:55:48.000000000 +0000
> > @@ -49,6 +49,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
> > --- a/localedata/locales/bi_VU	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bi_VU	2018-07-17 17:55:48.000000000 +0000
> > @@ -39,6 +39,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
> > --- a/localedata/locales/bn_BD	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bn_BD	2018-07-17 17:55:48.000000000 +0000
> > @@ -63,6 +63,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
> > --- a/localedata/locales/bo_CN	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bo_CN	2018-07-17 17:55:48.000000000 +0000
> > @@ -43,6 +43,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
> > --- a/localedata/locales/ca_ES	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ca_ES	2018-07-17 17:55:48.000000000 +0000
> > @@ -72,6 +72,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
> > --- a/localedata/locales/ce_RU	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ce_RU	2018-07-17 17:55:48.000000000 +0000
> > @@ -39,6 +39,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
> > --- a/localedata/locales/cs_CZ	2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/cs_CZ	2018-07-17 17:55:48.000000000 +0000
> > @@ -2311,6 +2311,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
> > --- a/localedata/locales/cv_RU	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/cv_RU	2018-07-17 17:55:48.000000000 +0000
> > @@ -109,6 +109,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
> > --- a/localedata/locales/cy_GB	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/cy_GB	2018-07-17 17:55:48.000000000 +0000
> > @@ -69,6 +69,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
> > --- a/localedata/locales/da_DK	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/da_DK	2018-07-17 17:55:48.000000000 +0000
> > @@ -167,6 +167,7 @@
> >  % LATIN SMALL LETTER O WITH STROKE -> "oe"
> >  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
> > --- a/localedata/locales/de_DE	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/de_DE	2018-07-17 17:55:48.000000000 +0000
> > @@ -78,6 +78,7 @@
> >  % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
> >  <U201F> <U00AB>;<U0022>
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
> > --- a/localedata/locales/dv_MV	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/dv_MV	2018-07-17 17:55:48.000000000 +0000
> > @@ -52,6 +52,7 @@
> >  include "translit_combining";""
> > 
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
> > --- a/localedata/locales/dz_BT	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/dz_BT	2018-07-17 17:55:48.000000000 +0000
> > @@ -60,6 +60,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
> > --- a/localedata/locales/el_GR	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/el_GR	2018-07-17 17:55:48.000000000 +0000
> > @@ -59,6 +59,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
> > --- a/localedata/locales/en_GB	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/en_GB	2018-07-17 17:55:48.000000000 +0000
> > @@ -55,6 +55,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
> > --- a/localedata/locales/en_NG	2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/en_NG	2018-07-17 17:55:48.000000000 +0000
> > @@ -50,6 +50,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
> > --- a/localedata/locales/en_ZM	2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/en_ZM	2018-07-17 17:55:48.000000000 +0000
> > @@ -42,6 +42,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
> > --- a/localedata/locales/es_CU	2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/es_CU	2018-07-17 17:55:48.000000000 +0000
> > @@ -59,6 +59,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
> > --- a/localedata/locales/es_ES	2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/es_ES	2018-07-17 17:55:49.000000000 +0000
> > @@ -73,6 +73,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
> > --- a/localedata/locales/et_EE	2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/et_EE	2018-07-17 17:55:49.000000000 +0000
> > @@ -109,6 +109,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
> > --- a/localedata/locales/fa_IR	2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/fa_IR	2018-07-17 17:55:49.000000000 +0000
> > @@ -79,6 +79,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
> > --- a/localedata/locales/ff_SN	2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/ff_SN	2018-07-17 17:55:49.000000000 +0000
> > @@ -42,6 +42,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
> > --- a/localedata/locales/fi_FI	2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/fi_FI	2018-07-17 17:55:49.000000000 +0000
> > @@ -137,6 +137,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
> > --- a/localedata/locales/fr_FR	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/fr_FR	2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> >  % In France, accents are simply omitted if they cannot be represented.
> >  include "translit_combining";""
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
> > --- a/localedata/locales/ga_IE	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/ga_IE	2018-07-17 17:55:49.000000000 +0000
> > @@ -54,6 +54,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
> > --- a/localedata/locales/gd_GB	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/gd_GB	2018-07-17 17:55:49.000000000 +0000
> > @@ -47,6 +47,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
> > --- a/localedata/locales/gu_IN	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/gu_IN	2018-07-17 17:55:49.000000000 +0000
> > @@ -62,6 +62,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
> > --- a/localedata/locales/gv_GB	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/gv_GB	2018-07-17 17:55:49.000000000 +0000
> > @@ -57,6 +57,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
> > --- a/localedata/locales/he_IL	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/he_IL	2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
> > --- a/localedata/locales/hi_IN	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hi_IN	2018-07-17 17:55:49.000000000 +0000
> > @@ -61,6 +61,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
> > --- a/localedata/locales/hif_FJ	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hif_FJ	2018-07-17 17:55:49.000000000 +0000
> > @@ -37,6 +37,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
> > --- a/localedata/locales/hr_HR	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hr_HR	2018-07-17 17:55:49.000000000 +0000
> > @@ -153,6 +153,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
> > --- a/localedata/locales/ht_HT	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/ht_HT	2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
> > --- a/localedata/locales/hu_HU	2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hu_HU	2018-07-17 17:55:49.000000000 +0000
> > @@ -478,6 +478,7 @@
> >  <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
> >  <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
> > --- a/localedata/locales/hy_AM	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/hy_AM	2018-07-17 17:55:49.000000000 +0000
> > @@ -77,6 +77,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
> > --- a/localedata/locales/id_ID	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/id_ID	2018-07-17 17:55:49.000000000 +0000
> > @@ -55,6 +55,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
> > --- a/localedata/locales/is_IS	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/is_IS	2018-07-17 17:55:49.000000000 +0000
> > @@ -2161,6 +2161,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
> > --- a/localedata/locales/it_IT	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/it_IT	2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
> > --- a/localedata/locales/ja_JP	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/ja_JP	2018-07-17 17:55:49.000000000 +0000
> > @@ -1682,6 +1682,7 @@
> >  include "translit_combining";""
> >  include "translit_cjk_variants";""
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
> > --- a/localedata/locales/kk_KZ	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/kk_KZ	2018-07-17 17:55:50.000000000 +0000
> > @@ -158,6 +158,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
> > --- a/localedata/locales/km_KH	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/km_KH	2018-07-17 17:55:50.000000000 +0000
> > @@ -873,6 +873,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
> > --- a/localedata/locales/kn_IN	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/kn_IN	2018-07-17 17:55:50.000000000 +0000
> > @@ -63,6 +63,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
> > --- a/localedata/locales/ko_KR	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/ko_KR	2018-07-17 17:55:50.000000000 +0000
> > @@ -6099,6 +6099,7 @@
> >  include "translit_combining";""
> >  include "translit_hangul";""
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
> > --- a/localedata/locales/ks_IN	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/ks_IN	2018-07-17 17:55:50.000000000 +0000
> > @@ -46,6 +46,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
> > --- a/localedata/locales/kw_GB	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/kw_GB	2018-07-17 17:55:50.000000000 +0000
> > @@ -58,6 +58,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
> > --- a/localedata/locales/lb_LU	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/lb_LU	2018-07-17 17:55:50.000000000 +0000
> > @@ -78,6 +78,7 @@
> >  % LATIN SMALL LETTER E WITH CIRCUMFLEX
> >  <U00EA> "<U0065><U005E>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
> > --- a/localedata/locales/lg_UG	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/lg_UG	2018-07-17 17:55:50.000000000 +0000
> > @@ -57,6 +57,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
> > --- a/localedata/locales/lij_IT	2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/lij_IT	2018-07-17 17:55:50.000000000 +0000
> > @@ -47,6 +47,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
> > --- a/localedata/locales/ln_CD	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ln_CD	2018-07-17 17:55:50.000000000 +0000
> > @@ -39,6 +39,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
> > --- a/localedata/locales/lo_LA	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/lo_LA	2018-07-17 17:55:50.000000000 +0000
> > @@ -51,6 +51,7 @@
> >  copy "i18n"
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
> > --- a/localedata/locales/lt_LT	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/lt_LT	2018-07-17 17:55:50.000000000 +0000
> > @@ -77,6 +77,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
> > --- a/localedata/locales/lv_LV	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/lv_LV	2018-07-17 17:55:50.000000000 +0000
> > @@ -2122,6 +2122,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
> > --- a/localedata/locales/mg_MG	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mg_MG	2018-07-17 17:55:50.000000000 +0000
> > @@ -55,6 +55,7 @@
> >  % Accents are simply omitted if they cannot be represented.
> >  include "translit_combining";""
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
> > --- a/localedata/locales/mhr_RU	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mhr_RU	2018-07-17 17:55:50.000000000 +0000
> > @@ -59,6 +59,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
> > --- a/localedata/locales/mk_MK	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mk_MK	2018-07-17 17:55:50.000000000 +0000
> > @@ -49,6 +49,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
> > --- a/localedata/locales/ml_IN	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ml_IN	2018-07-17 17:55:50.000000000 +0000
> > @@ -60,6 +60,7 @@
> > 
> >  translit_start
> >  include     "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> >  %
> > diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
> > --- a/localedata/locales/ms_MY	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ms_MY	2018-07-17 17:55:50.000000000 +0000
> > @@ -45,6 +45,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
> > --- a/localedata/locales/mt_MT	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mt_MT	2018-07-17 17:55:50.000000000 +0000
> > @@ -47,6 +47,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
> > --- a/localedata/locales/nan_TW@latin	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nan_TW@latin	2018-07-17 17:55:50.000000000 +0000
> > @@ -53,6 +53,7 @@
> >  % accents are simply omitted if they cannot be represented.
> >  include "translit_combining";""
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
> > --- a/localedata/locales/nb_NO	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nb_NO	2018-07-17 17:55:50.000000000 +0000
> > @@ -154,6 +154,7 @@
> >  % LATIN SMALL LETTER O WITH STROKE -> "oe"
> >  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
> > --- a/localedata/locales/ne_NP	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ne_NP	2018-07-17 17:55:50.000000000 +0000
> > @@ -43,6 +43,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
> > --- a/localedata/locales/nhn_MX	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nhn_MX	2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
> > --- a/localedata/locales/niu_NU	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/niu_NU	2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
> > --- a/localedata/locales/niu_NZ	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/niu_NZ	2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
> > --- a/localedata/locales/nl_NL	2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nl_NL	2018-07-17 17:55:51.000000000 +0000
> > @@ -57,6 +57,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
> > --- a/localedata/locales/nr_ZA	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/nr_ZA	2018-07-17 17:55:51.000000000 +0000
> > @@ -66,6 +66,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
> > --- a/localedata/locales/oc_FR	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/oc_FR	2018-07-17 17:55:51.000000000 +0000
> > @@ -62,6 +62,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
> > --- a/localedata/locales/om_KE	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/om_KE	2018-07-17 17:55:51.000000000 +0000
> > @@ -140,6 +140,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
> > --- a/localedata/locales/or_IN	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/or_IN	2018-07-17 17:55:51.000000000 +0000
> > @@ -62,6 +62,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
> > --- a/localedata/locales/os_RU	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/os_RU	2018-07-17 17:55:51.000000000 +0000
> > @@ -70,6 +70,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
> > --- a/localedata/locales/pa_IN	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pa_IN	2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> > 
> >  translit_start
> >  include     "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
> > --- a/localedata/locales/pa_PK	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pa_PK	2018-07-17 17:55:51.000000000 +0000
> > @@ -58,6 +58,7 @@
> >  % Farsi yeh -> yeh
> >  <U06CC> "<U064A>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
> > --- a/localedata/locales/pl_PL	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pl_PL	2018-07-17 17:55:51.000000000 +0000
> > @@ -142,6 +142,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
> > --- a/localedata/locales/pt_PT	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pt_PT	2018-07-17 17:55:51.000000000 +0000
> > @@ -59,6 +59,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
> > --- a/localedata/locales/quz_PE	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/quz_PE	2018-07-17 17:55:51.000000000 +0000
> > @@ -57,6 +57,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
> > --- a/localedata/locales/ro_RO	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/ro_RO	2018-07-17 17:55:51.000000000 +0000
> > @@ -144,6 +144,7 @@
> >  <U0162> "<U021A>";"<U0054>"
> >  <U0163> "<U021B>";"<U0074>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
> > --- a/localedata/locales/ru_RU	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/ru_RU	2018-07-17 17:55:51.000000000 +0000
> > @@ -74,6 +74,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
> > --- a/localedata/locales/rw_RW	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/rw_RW	2018-07-17 17:55:51.000000000 +0000
> > @@ -45,6 +45,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
> > --- a/localedata/locales/sa_IN	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sa_IN	2018-07-17 17:55:51.000000000 +0000
> > @@ -44,6 +44,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
> > --- a/localedata/locales/sd_IN	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sd_IN	2018-07-17 17:55:51.000000000 +0000
> > @@ -46,6 +46,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
> > --- a/localedata/locales/sd_IN@devanagari	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sd_IN@devanagari	2018-07-17 17:55:51.000000000 +0000
> > @@ -44,6 +44,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
> > --- a/localedata/locales/sd_PK	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sd_PK	2018-07-17 17:55:51.000000000 +0000
> > @@ -39,6 +39,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
> > --- a/localedata/locales/se_NO	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/se_NO	2018-07-17 17:55:51.000000000 +0000
> > @@ -205,6 +205,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
> > --- a/localedata/locales/sgs_LT	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sgs_LT	2018-07-17 17:55:52.000000000 +0000
> > @@ -59,6 +59,7 @@
> >  copy "i18n"
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
> > --- a/localedata/locales/si_LK	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/si_LK	2018-07-17 17:55:52.000000000 +0000
> > @@ -45,6 +45,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
> > --- a/localedata/locales/sk_SK	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sk_SK	2018-07-17 17:55:52.000000000 +0000
> > @@ -68,6 +68,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
> > --- a/localedata/locales/sl_SI	2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sl_SI	2018-07-17 17:55:52.000000000 +0000
> > @@ -91,6 +91,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
> > --- a/localedata/locales/sm_WS	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sm_WS	2018-07-17 17:55:52.000000000 +0000
> > @@ -37,6 +37,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
> > --- a/localedata/locales/so_SO	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/so_SO	2018-07-17 17:55:52.000000000 +0000
> > @@ -70,6 +70,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
> > --- a/localedata/locales/sq_AL	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sq_AL	2018-07-17 17:55:52.000000000 +0000
> > @@ -45,6 +45,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
> > --- a/localedata/locales/ss_ZA	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/ss_ZA	2018-07-17 17:55:52.000000000 +0000
> > @@ -68,6 +68,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
> > --- a/localedata/locales/st_ZA	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/st_ZA	2018-07-17 17:55:52.000000000 +0000
> > @@ -64,6 +64,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
> > --- a/localedata/locales/sv_SE	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sv_SE	2018-07-17 17:55:52.000000000 +0000
> > @@ -139,6 +139,7 @@
> >  % LATIN SMALL LETTER O WITH STROKE -> "oe"
> >  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
> > --- a/localedata/locales/sw_KE	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sw_KE	2018-07-17 17:55:52.000000000 +0000
> > @@ -44,6 +44,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
> > --- a/localedata/locales/ta_IN	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/ta_IN	2018-07-17 17:55:52.000000000 +0000
> > @@ -63,6 +63,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
> > --- a/localedata/locales/te_IN	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/te_IN	2018-07-17 17:55:52.000000000 +0000
> > @@ -63,6 +63,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
> > --- a/localedata/locales/th_TH	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/th_TH	2018-07-17 17:55:52.000000000 +0000
> > @@ -58,6 +58,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
> > --- a/localedata/locales/ti_ET	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/ti_ET	2018-07-17 17:55:52.000000000 +0000
> > @@ -866,6 +866,7 @@
> >  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
> > 
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  %
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
> > --- a/localedata/locales/tn_ZA	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/tn_ZA	2018-07-17 17:55:52.000000000 +0000
> > @@ -69,6 +69,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
> > --- a/localedata/locales/to_TO	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/to_TO	2018-07-17 17:55:52.000000000 +0000
> > @@ -36,6 +36,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
> > --- a/localedata/locales/tpi_PG	2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/tpi_PG	2018-07-17 17:55:52.000000000 +0000
> > @@ -37,6 +37,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
> > --- a/localedata/locales/tr_TR	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/tr_TR	2018-07-17 17:55:52.000000000 +0000
> > @@ -2430,6 +2430,7 @@
> > 
> >  % TURKISH LIRA SIGN
> >  <U20BA> "<U0054><U004C>"
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
> > --- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
> > +++ b/localedata/locales/translit_cyrillic	2018-07-17 17:55:52.000000000 +0000
> > @@ -0,0 +1,151 @@
> > +escape_char /
> > +comment_char %
> > +
> > +% Transliterations that convert Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
> > +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> > +% Generated from UnicodeData.txt with
> > +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
> > +% Up to three characters are required to do a reversible transliteration.
> > +
> > +LC_CTYPE
> > +
> > +translit_start
> > +
> > +
> > +% CYRILLIC CAPITAL LETTER IO
> > +<U0401> "<U0059><U004F>";<U0059>
> > +% CYRILLIC CAPITAL LETTER A
> > +<U0410> <U0041>
> > +% CYRILLIC CAPITAL LETTER BE
> > +<U0411> <U0042>
> > +% CYRILLIC CAPITAL LETTER VE
> > +<U0412> <U0056>
> > +% CYRILLIC CAPITAL LETTER GHE
> > +<U0413> <U0047>
> > +% CYRILLIC CAPITAL LETTER DE
> > +<U0414> <U0044>
> > +% CYRILLIC CAPITAL LETTER IE
> > +<U0415> <U0045>
> > +% CYRILLIC CAPITAL LETTER ZHE
> > +<U0416> "<U005A><U0048>";<U005A>
> > +% CYRILLIC CAPITAL LETTER ZE
> > +<U0417> <U005A>
> > +% CYRILLIC CAPITAL LETTER I
> > +<U0418> <U0049>
> > +% CYRILLIC CAPITAL LETTER SHORT I
> > +<U0419> <U004A>
> > +% CYRILLIC CAPITAL LETTER KA
> > +<U041A> <U004B>
> > +% CYRILLIC CAPITAL LETTER EL
> > +<U041B> <U004C>
> > +% CYRILLIC CAPITAL LETTER EM
> > +<U041C> <U004D>
> > +% CYRILLIC CAPITAL LETTER EN
> > +<U041D> <U004E>
> > +% CYRILLIC CAPITAL LETTER O
> > +<U041E> <U004F>
> > +% CYRILLIC CAPITAL LETTER PE
> > +<U041F> <U0050>
> > +% CYRILLIC CAPITAL LETTER ER
> > +<U0420> <U0052>
> > +% CYRILLIC CAPITAL LETTER ES
> > +<U0421> <U0053>
> > +% CYRILLIC CAPITAL LETTER TE
> > +<U0422> <U0054>
> > +% CYRILLIC CAPITAL LETTER U
> > +<U0423> <U0055>
> > +% CYRILLIC CAPITAL LETTER EF
> > +<U0424> <U0046>
> > +% CYRILLIC CAPITAL LETTER HA
> > +<U0425> <U0058>
> > +% CYRILLIC CAPITAL LETTER TSE
> > +<U0426> "<U0043><U005A>";<U0043>
> > +% CYRILLIC CAPITAL LETTER CHE
> > +<U0427> "<U0043><U0048>";<U0043>
> > +% CYRILLIC CAPITAL LETTER SHA
> > +<U0428> "<U0053><U0048>";<U0053>
> > +% CYRILLIC CAPITAL LETTER SHCHA
> > +<U0429> "<U0053><U0048><U0048>";<U0053>
> > +% CYRILLIC CAPITAL LETTER HARD SIGN
> > +<U042A> "<U0060><U0060>";<U0060>
> > +% CYRILLIC CAPITAL LETTER YERU
> > +<U042B> "<U0059><U0027>";<U0059>
> > +% CYRILLIC CAPITAL LETTER SOFT SIGN
> > +<U042C> <U0060>
> > +% CYRILLIC CAPITAL LETTER E
> > +<U042D> "<U0045><U0060>";<U0045>
> > +% CYRILLIC CAPITAL LETTER YU
> > +<U042E> "<U0059><U0055>";<U0059>
> > +% CYRILLIC CAPITAL LETTER YA
> > +<U042F> "<U0059><U0041>";<U0059>
> > +% CYRILLIC SMALL LETTER A
> > +<U0430> <U0061>
> > +% CYRILLIC SMALL LETTER BE
> > +<U0431> <U0062>
> > +% CYRILLIC SMALL LETTER VE
> > +<U0432> <U0076>
> > +% CYRILLIC SMALL LETTER GHE
> > +<U0433> <U0067>
> > +% CYRILLIC SMALL LETTER DE
> > +<U0434> <U0064>
> > +% CYRILLIC SMALL LETTER IE
> > +<U0435> <U0065>
> > +% CYRILLIC SMALL LETTER ZHE
> > +<U0436> "<U007A><U0068>";<U007A>
> > +% CYRILLIC SMALL LETTER ZE
> > +<U0437> <U007A>
> > +% CYRILLIC SMALL LETTER I
> > +<U0438> <U0069>
> > +% CYRILLIC SMALL LETTER SHORT I
> > +<U0439> <U006A>
> > +% CYRILLIC SMALL LETTER KA
> > +<U043A> <U006B>
> > +% CYRILLIC SMALL LETTER EL
> > +<U043B> <U006C>
> > +% CYRILLIC SMALL LETTER EM
> > +<U043C> <U006D>
> > +% CYRILLIC SMALL LETTER EN
> > +<U043D> <U006E>
> > +% CYRILLIC SMALL LETTER O
> > +<U043E> <U006F>
> > +% CYRILLIC SMALL LETTER PE
> > +<U043F> <U0070>
> > +% CYRILLIC SMALL LETTER ER
> > +<U0440> <U0072>
> > +% CYRILLIC SMALL LETTER ES
> > +<U0441> <U0073>
> > +% CYRILLIC SMALL LETTER TE
> > +<U0442> <U0074>
> > +% CYRILLIC SMALL LETTER U
> > +<U0443> <U0075>
> > +% CYRILLIC SMALL LETTER EF
> > +<U0444> <U0066>
> > +% CYRILLIC SMALL LETTER HA
> > +<U0445> <U0078>
> > +% CYRILLIC SMALL LETTER TSE
> > +<U0446> "<U0063><U007A>";<U0063>
> > +% CYRILLIC SMALL LETTER CHE
> > +<U0447> "<U0063><U0068>";<U0063>
> > +% CYRILLIC SMALL LETTER SHA
> > +<U0448> "<U0073><U0068>";<U0073>
> > +% CYRILLIC SMALL LETTER SHCHA
> > +<U0449> "<U0073><U0068><U0068>";<U0073>
> > +% CYRILLIC SMALL LETTER HARD SIGN
> > +<U044A> "<U0060><U0060>";<U0060>
> > +% CYRILLIC SMALL LETTER YERU
> > +<U044B> "<U0079><U0027>";<U0079>
> > +% CYRILLIC SMALL LETTER SOFT SIGN
> > +<U044C> <U0060>
> > +% CYRILLIC SMALL LETTER E
> > +<U044D> "<U0065><U0060>";<U0065>
> > +% CYRILLIC SMALL LETTER YU
> > +<U044E> "<U0079><U0075>";<U0079>
> > +% CYRILLIC SMALL LETTER YA
> > +<U044F> "<U0079><U0061>";<U0079>
> > +% CYRILLIC SMALL LETTER IO
> > +<U0451> "<U0079><U006F>";<U0079>
> > +
> > +
> > +translit_end
> > +
> > +END LC_CTYPE
> > diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
> > --- a/localedata/locales/ts_ZA	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ts_ZA	2018-07-17 17:55:52.000000000 +0000
> > @@ -64,6 +64,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
> > --- a/localedata/locales/unm_US	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/unm_US	2018-07-17 17:55:52.000000000 +0000
> > @@ -48,6 +48,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
> > --- a/localedata/locales/ur_IN	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ur_IN	2018-07-17 17:55:53.000000000 +0000
> > @@ -46,6 +46,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
> > --- a/localedata/locales/ur_PK	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ur_PK	2018-07-17 17:55:53.000000000 +0000
> > @@ -58,6 +58,7 @@
> >  % Farsi yeh -> yeh
> >  <U06CC> "<U064A>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
> > --- a/localedata/locales/ve_ZA	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ve_ZA	2018-07-17 17:55:53.000000000 +0000
> > @@ -67,6 +67,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
> > --- a/localedata/locales/vi_VN	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/vi_VN	2018-07-17 17:55:53.000000000 +0000
> > @@ -58,6 +58,7 @@
> >  % dong sign -> d// -> dd
> >  <U20AB> "<U0111>";"<U0064><U0064>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
> > --- a/localedata/locales/wa_BE	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/wa_BE	2018-07-17 17:55:53.000000000 +0000
> > @@ -69,6 +69,7 @@
> >  <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
> >  <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
> > --- a/localedata/locales/wo_SN	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/wo_SN	2018-07-17 17:55:53.000000000 +0000
> > @@ -55,6 +55,7 @@
> >  % Accents are simply omitted if they cannot be represented.
> >  include "translit_combining";""
> > 
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
> > --- a/localedata/locales/xh_ZA	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/xh_ZA	2018-07-17 17:55:53.000000000 +0000
> > @@ -66,6 +66,7 @@
> > 
> >  translit_start
> >  include "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
> > --- a/localedata/locales/yi_US	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/yi_US	2018-07-17 17:55:53.000000000 +0000
> > @@ -73,6 +73,7 @@
> >  <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
> >  <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
> >  <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  END LC_CTYPE
> > diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
> > --- a/localedata/locales/zh_CN	2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/zh_CN	2018-07-17 17:55:53.000000000 +0000
> > @@ -58,6 +58,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> > 
> >  class	"hanzi"; /
> > diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
> > --- a/localedata/locales/zu_ZA	2018-07-17 17:49:22.000000000 +0000
> > +++ b/localedata/locales/zu_ZA	2018-07-17 17:55:53.000000000 +0000
> > @@ -70,6 +70,7 @@
> > 
> >  translit_start
> >  include  "translit_combining";""
> > +include "translit_cyrillic";""
> >  translit_end
> >  END LC_CTYPE
> > 
> > 
> > 
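
For readers unfamiliar with the rule syntax used in translit_cyrillic above, here is a minimal Python sketch that decodes one rule line into its source character and its ordered list of candidate replacements. It is not part of the patch and is not how localedef parses these files. The helper names decode and parse_translit_line are made up for illustration, and the parser assumes the simple case used in this table: candidates are separated by ';' and no candidate contains a ';' itself.

```python
import re

# Matches one <UXXXX> code point token as written in glibc locale sources.
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token: str) -> str:
    """Turn '<U0059><U004F>' (optionally double-quoted) into 'YO'."""
    return "".join(chr(int(h, 16)) for h in _UCS.findall(token))


def parse_translit_line(line: str):
    """Return (source_char, [candidates...]), or None for non-rule lines."""
    line = line.strip()
    if not line.startswith("<U"):
        return None  # comment, keyword (translit_start, include, ...) or blank
    parts = line.split(None, 1)
    if len(parts) != 2:
        return None  # a lone code point is not a rule
    source, rest = parts
    candidates = [decode(c) for c in rest.split(";")]
    return decode(source), candidates


# CYRILLIC CAPITAL LETTER IO: first choice "YO", fallback "Y".
print(parse_translit_line('<U0401> "<U0059><U004F>";<U0059>'))
```

As I understand the format, the candidates are listed in order of preference, so the multi-character form is the first choice and the single letter is a fallback.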


* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-03  9:19       ` Keld Simonsen
@ 2018-10-03  9:32         ` Egor Kobylkin
  2018-10-05  8:43           ` Marko Myllynen
  2018-10-05  9:20           ` Rafal Luzynski
  0 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-03  9:32 UTC (permalink / raw)
  To: Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

On 03.10.2018 11:19, Keld Simonsen wrote:
> Hi
> 
> Please note that transliteration of Cyrillic to Latin is not universal.
> There are different schemes for, for example, German, English and Danish, and
> there is also an ISO standard for it.

Thanks for your feedback, Keld!

Could the locale maintainers who would prefer not to include this patch
state so explicitly here?

That is:
- If a locale has a different preferred Cyrillic transliteration table,
its maintainers may want to point me to it so I can supply a separate
table/patch.
- Or they could state explicitly that, for whatever reason, they would like
to exclude their locale from the default Cyrillic transliteration patch
altogether.

--Egor

> 
> But do go forward with fixing this bug.
> 
> Best regards
> Keld
> 
> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>> Ping.
>>
>> Absent any feedback, I am wondering whether anything is missing from this
>> patch from the maintainers' standpoint. More than two months have passed
>> since the original submission.
>>
>> If I can be of assistance, please do not hesitate to contact me,
>> Egor Kobylkin
>>
>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails":
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>
>>> Add the Cyrillic transliteration table file translit_cyrillic
>>>
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>
>>> to localedata/locales/ and include it in all your locales going forward.
>>>
>>> Patch included inline below.
>>>
>>> This is a re-submission for consideration for 2.29, at the request of
>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>
>>> From this patch I have excluded locales that already mention cyrillic or
>>> have a transliteration table for it:
>>> az_AZ
>>> iso14651_t1_common
>>> ky_KG
>>> mn_MN
>>> sr_RS
>>> tg_TJ
>>> tk_TM
>>> tt_RU
>>> uk_UA
>>> uz_UZ
>>> uz_UZ@cyrillic
>>>
>>> Their maintainers are requested to make an explicit decision on whether,
>>> and how, to include this patch.
>>>
>>>
>>>
>>> Current bug effect:
>>>
>>> The glibc wiki explicitly lists this use case as the test example
>>>
>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>
>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
>>>
>>> Currently it fails on Cyrillic text in most locales, including ru_RU [1]
>>> [8] [9]:
>>>
>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
>>>
>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>
>>> It produces only question marks and spaces.
>>>
>>> This is what it should produce, and does produce once the patch is applied:
>>>
>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
>>> chayu.
>>>
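
To make the expected line above easy to check by hand, here is a small Python sketch that applies the first-choice replacements from translit_cyrillic to a sample sentence. This only illustrates the mapping and does not go through iconv or glibc. The sample sentence is an assumption: translit-test-input.txt is not quoted in this thread, but the well-known Russian pangram below matches the question-mark pattern word for word. TABLE, LOWER, SAMPLE and translit are names invented for this sketch.

```python
# First-choice replacements for U+0430..U+044F (CYRILLIC SMALL LETTER A..YA),
# in code point order, copied from the translit_cyrillic table.
LOWER = [
    "a", "b", "v", "g", "d", "e", "zh", "z", "i", "j", "k", "l", "m", "n", "o", "p",
    "r", "s", "t", "u", "f", "x", "cz", "ch", "sh", "shh", "``", "y'", "`", "e`", "yu", "ya",
]
TABLE = {chr(0x0430 + i): repl for i, repl in enumerate(LOWER)}
TABLE["\u0451"] = "yo"  # CYRILLIC SMALL LETTER IO
TABLE["\u0421"] = "S"   # CYRILLIC CAPITAL LETTER ES, the only capital in the sample

# Assumed test sentence (not taken from the thread).
SAMPLE = "Съешь ещё этих мягких французских булок, да выпей же чаю."


def translit(text: str) -> str:
    """Replace mapped characters, keep ASCII, and print '?' for anything else."""
    out = []
    for ch in text:
        if ch in TABLE:
            out.append(TABLE[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            out.append("?")  # mimics the unpatched behaviour for unmapped input
    return "".join(out)


print("CYRILLIC", translit(SAMPLE))
```

With an empty TABLE the same function yields the question-mark line shown for the unpatched locale, which is a convenient way to see that the two outputs differ only by the table.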
>>>
>>> Root problem and the fix:
>>>
>>> The root problem is the missing transliteration table, which I am
>>> supplying here. In addition, the table has to be included in the
>>> active locale when that locale is compiled, so that iconv can use it.
>>>
>>>
>>>
>>> COMMIT MESSAGE:
>>> This translit_cyrillic table enables conversion (e.g. with iconv) of
>>> UTF-8 encoded text in a Cyrillic alphabet to ASCII via //TRANSLIT.
>>>
>>> While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
>>> a transliteration contains only ASCII characters yet can still be read by
>>> a native speaker. Among other things, this is useful for processing
>>> Cyrillic texts and filenames with programs or on systems that are not
>>> specifically prepared to work with Cyrillic, lack the corresponding fonts,
>>> or cannot handle UTF-8.
>>>
>>> The transliteration table itself is attached as the file translit_cyrillic
>>> [7]. Its mapping is based on the official GOST 7.79-2000 source
>>> (Federal Agency on Technical Regulating and Metrology of the Russian
>>> Federation [2]). In practice, an independent but identical source [3] was
>>> used and prepared in a spreadsheet [6].
>>>
>>> The documentation suggests that a transliteration table is included
>>> by adding the line *include "translit_cyrillic";""* to the LC_CTYPE
>>> translit_start section:
>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>> In practice, I searched for all locales that have a
>>> translit_start/translit_end stanza and generated a patch for them.
>>>
>>> Cyrillic transliteration of, e.g., Russian text may already have
>>> worked to some extent in the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales,
>>> which have their transliteration tables included inline.
>>> However, it would not be the standard Russian Cyrillic transliteration
>>> described above.
>>> I am excluding these locales from this proposed patch. I have written
>>> directly to locale maintainer emails listed in the files. Volodymyr
>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>>> exclusion.
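
The selection rule described above (patch every locale that has a translit_start/translit_end stanza, skip those that already mention Cyrillic) can be sketched in Python as follows. This is a hypothetical reconstruction for illustration only. The thread does not show the script that actually generated the patch, and the name add_cyrillic_include is invented here.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_source: str) -> str:
    """Insert the include line just before the first translit_end.

    The text is returned unchanged if it already mentions Cyrillic
    (such locales are left to their maintainers) or has no translit section.
    """
    if "cyrillic" in locale_source.lower():
        return locale_source
    lines = locale_source.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(i, INCLUDE_LINE + "\n")
            return "".join(lines)
    return locale_source


# A made-up, heavily shortened locale source for demonstration.
example = (
    "LC_CTYPE\n"
    'copy "i18n"\n'
    "\n"
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(example))
```

Because the inserted line itself contains "cyrillic", running the function a second time leaves the text unchanged, so the transformation is safe to repeat.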
>>>
>>> Links:
>>>
>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>> [2] GOST 7.79-2000 official source
>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>> available in low quality gif format)
>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>> http://www.yfermer.ru/specifications/285821.html
>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>> [6] Spreadsheet for generating translit_cyrillic
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>> [9] translit-test-input.txt
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>
>>> Best regards,
>>> Egor Kobylkin
>>>
>>> ---
>>> 2018-07-17  Egor Kobylkin  <egor@kobylkin.com>
>>>
>>> 	[BZ #2872]
>>> 	* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>> table from Cyrillic to Latin.
>>> 	* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>> section.
>>> 	* locales/aa_DJ: likewise
>>> 	* locales/af_ZA: likewise
>>> 	* locales/ak_GH: likewise
>>> 	* locales/am_ET: likewise
>>> 	* locales/ar_EG: likewise
>>> 	* locales/be_BY: likewise
>>> 	* locales/bem_ZM: likewise
>>> 	* locales/ber_DZ: likewise
>>> 	* locales/ber_MA: likewise
>>> 	* locales/bg_BG: likewise
>>> 	* locales/bi_VU: likewise
>>> 	* locales/bn_BD: likewise
>>> 	* locales/bo_CN: likewise
>>> 	* locales/ca_ES: likewise
>>> 	* locales/ce_RU: likewise
>>> 	* locales/cs_CZ: likewise
>>> 	* locales/cv_RU: likewise
>>> 	* locales/cy_GB: likewise
>>> 	* locales/da_DK: likewise
>>> 	* locales/de_DE: likewise
>>> 	* locales/dv_MV: likewise
>>> 	* locales/dz_BT: likewise
>>> 	* locales/el_GR: likewise
>>> 	* locales/en_GB: likewise
>>> 	* locales/en_NG: likewise
>>> 	* locales/en_ZM: likewise
>>> 	* locales/es_CU: likewise
>>> 	* locales/es_ES: likewise
>>> 	* locales/et_EE: likewise
>>> 	* locales/fa_IR: likewise
>>> 	* locales/ff_SN: likewise
>>> 	* locales/fi_FI: likewise
>>> 	* locales/fr_FR: likewise
>>> 	* locales/ga_IE: likewise
>>> 	* locales/gd_GB: likewise
>>> 	* locales/gu_IN: likewise
>>> 	* locales/gv_GB: likewise
>>> 	* locales/he_IL: likewise
>>> 	* locales/hi_IN: likewise
>>> 	* locales/hif_FJ: likewise
>>> 	* locales/hr_HR: likewise
>>> 	* locales/ht_HT: likewise
>>> 	* locales/hu_HU: likewise
>>> 	* locales/hy_AM: likewise
>>> 	* locales/id_ID: likewise
>>> 	* locales/is_IS: likewise
>>> 	* locales/it_IT: likewise
>>> 	* locales/ja_JP: likewise
>>> 	* locales/kk_KZ: likewise
>>> 	* locales/km_KH: likewise
>>> 	* locales/kn_IN: likewise
>>> 	* locales/ko_KR: likewise
>>> 	* locales/ks_IN: likewise
>>> 	* locales/kw_GB: likewise
>>> 	* locales/lb_LU: likewise
>>> 	* locales/lg_UG: likewise
>>> 	* locales/lij_IT: likewise
>>> 	* locales/ln_CD: likewise
>>> 	* locales/lo_LA: likewise
>>> 	* locales/lt_LT: likewise
>>> 	* locales/lv_LV: likewise
>>> 	* locales/mg_MG: likewise
>>> 	* locales/mhr_RU: likewise
>>> 	* locales/mk_MK: likewise
>>> 	* locales/ml_IN: likewise
>>> 	* locales/ms_MY: likewise
>>> 	* locales/mt_MT: likewise
>>> 	* locales/nan_TW@latin: likewise
>>> 	* locales/nb_NO: likewise
>>> 	* locales/ne_NP: likewise
>>> 	* locales/nhn_MX: likewise
>>> 	* locales/niu_NU: likewise
>>> 	* locales/niu_NZ: likewise
>>> 	* locales/nl_NL: likewise
>>> 	* locales/nr_ZA: likewise
>>> 	* locales/oc_FR: likewise
>>> 	* locales/om_KE: likewise
>>> 	* locales/or_IN: likewise
>>> 	* locales/os_RU: likewise
>>> 	* locales/pa_IN: likewise
>>> 	* locales/pa_PK: likewise
>>> 	* locales/pl_PL: likewise
>>> 	* locales/pt_PT: likewise
>>> 	* locales/quz_PE: likewise
>>> 	* locales/ro_RO: likewise
>>> 	* locales/ru_RU: likewise
>>> 	* locales/rw_RW: likewise
>>> 	* locales/sa_IN: likewise
>>> 	* locales/sd_IN: likewise
>>> 	* locales/sd_IN@devanagari: likewise
>>> 	* locales/sd_PK: likewise
>>> 	* locales/se_NO: likewise
>>> 	* locales/sgs_LT: likewise
>>> 	* locales/si_LK: likewise
>>> 	* locales/sk_SK: likewise
>>> 	* locales/sl_SI: likewise
>>> 	* locales/sm_WS: likewise
>>> 	* locales/so_SO: likewise
>>> 	* locales/sq_AL: likewise
>>> 	* locales/ss_ZA: likewise
>>> 	* locales/st_ZA: likewise
>>> 	* locales/sv_SE: likewise
>>> 	* locales/sw_KE: likewise
>>> 	* locales/ta_IN: likewise
>>> 	* locales/te_IN: likewise
>>> 	* locales/th_TH: likewise
>>> 	* locales/ti_ET: likewise
>>> 	* locales/tn_ZA: likewise
>>> 	* locales/to_TO: likewise
>>> 	* locales/tpi_PG: likewise
>>> 	* locales/tr_TR: likewise
>>> 	* locales/ts_ZA: likewise
>>> 	* locales/unm_US: likewise
>>> 	* locales/ur_IN: likewise
>>> 	* locales/ur_PK: likewise
>>> 	* locales/ve_ZA: likewise
>>> 	* locales/vi_VN: likewise
>>> 	* locales/wa_BE: likewise
>>> 	* locales/wo_SN: likewise
>>> 	* locales/xh_ZA: likewise
>>> 	* locales/yi_US: likewise
>>> 	* locales/zh_CN: likewise
>>> 	* locales/zu_ZA: likewise
>>>
>>>
>>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>>> --- a/localedata/locales/C	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/C	2018-07-17 17:55:47.000000000 +0000
>>> @@ -2292,6 +2292,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
>>> --- a/localedata/locales/aa_DJ	2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/aa_DJ	2018-07-17 17:55:47.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
>>> --- a/localedata/locales/af_ZA	2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/af_ZA	2018-07-17 17:55:47.000000000 +0000
>>> @@ -72,6 +72,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
>>> --- a/localedata/locales/ak_GH	2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/ak_GH	2018-07-17 17:55:47.000000000 +0000
>>> @@ -56,6 +56,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>> --- a/localedata/locales/am_ET	2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/am_ET	2018-07-17 17:55:47.000000000 +0000
>>> @@ -1396,6 +1396,7 @@
>>>  <U137A>    <U0060><U0039><U0030>
>>>  <U137B>    <U0060><U0031><U0030><U0030>
>>>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  %
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
>>> --- a/localedata/locales/ar_EG	2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/ar_EG	2018-07-17 17:55:48.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
>>> --- a/localedata/locales/be_BY	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/be_BY	2018-07-17 17:55:48.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
>>> --- a/localedata/locales/bem_ZM	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bem_ZM	2018-07-17 17:55:48.000000000 +0000
>>> @@ -42,6 +42,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
>>> --- a/localedata/locales/ber_DZ	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ber_DZ	2018-07-17 17:55:48.000000000 +0000
>>> @@ -166,6 +166,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
>>> --- a/localedata/locales/ber_MA	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ber_MA	2018-07-17 17:55:48.000000000 +0000
>>> @@ -86,6 +86,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
>>> --- a/localedata/locales/bg_BG	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bg_BG	2018-07-17 17:55:48.000000000 +0000
>>> @@ -49,6 +49,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
>>> --- a/localedata/locales/bi_VU	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bi_VU	2018-07-17 17:55:48.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
>>> --- a/localedata/locales/bn_BD	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bn_BD	2018-07-17 17:55:48.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
>>> --- a/localedata/locales/bo_CN	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bo_CN	2018-07-17 17:55:48.000000000 +0000
>>> @@ -43,6 +43,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
>>> --- a/localedata/locales/ca_ES	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ca_ES	2018-07-17 17:55:48.000000000 +0000
>>> @@ -72,6 +72,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
>>> --- a/localedata/locales/ce_RU	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ce_RU	2018-07-17 17:55:48.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
>>> --- a/localedata/locales/cs_CZ	2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/cs_CZ	2018-07-17 17:55:48.000000000 +0000
>>> @@ -2311,6 +2311,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
>>> --- a/localedata/locales/cv_RU	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/cv_RU	2018-07-17 17:55:48.000000000 +0000
>>> @@ -109,6 +109,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
>>> --- a/localedata/locales/cy_GB	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/cy_GB	2018-07-17 17:55:48.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
>>> --- a/localedata/locales/da_DK	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/da_DK	2018-07-17 17:55:48.000000000 +0000
>>> @@ -167,6 +167,7 @@
>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
>>> --- a/localedata/locales/de_DE	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/de_DE	2018-07-17 17:55:48.000000000 +0000
>>> @@ -78,6 +78,7 @@
>>>  % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
>>>  <U201F> <U00AB>;<U0022>
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
>>> --- a/localedata/locales/dv_MV	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/dv_MV	2018-07-17 17:55:48.000000000 +0000
>>> @@ -52,6 +52,7 @@
>>>  include "translit_combining";""
>>>
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
>>> --- a/localedata/locales/dz_BT	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/dz_BT	2018-07-17 17:55:48.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
>>> --- a/localedata/locales/el_GR	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/el_GR	2018-07-17 17:55:48.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
>>> --- a/localedata/locales/en_GB	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/en_GB	2018-07-17 17:55:48.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
>>> --- a/localedata/locales/en_NG	2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/en_NG	2018-07-17 17:55:48.000000000 +0000
>>> @@ -50,6 +50,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
>>> --- a/localedata/locales/en_ZM	2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/en_ZM	2018-07-17 17:55:48.000000000 +0000
>>> @@ -42,6 +42,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
>>> --- a/localedata/locales/es_CU	2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/es_CU	2018-07-17 17:55:48.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
>>> --- a/localedata/locales/es_ES	2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/es_ES	2018-07-17 17:55:49.000000000 +0000
>>> @@ -73,6 +73,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
>>> --- a/localedata/locales/et_EE	2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/et_EE	2018-07-17 17:55:49.000000000 +0000
>>> @@ -109,6 +109,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
>>> --- a/localedata/locales/fa_IR	2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/fa_IR	2018-07-17 17:55:49.000000000 +0000
>>> @@ -79,6 +79,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
>>> --- a/localedata/locales/ff_SN	2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/ff_SN	2018-07-17 17:55:49.000000000 +0000
>>> @@ -42,6 +42,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
>>> --- a/localedata/locales/fi_FI	2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/fi_FI	2018-07-17 17:55:49.000000000 +0000
>>> @@ -137,6 +137,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
>>> --- a/localedata/locales/fr_FR	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/fr_FR	2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>  % In France, accents are simply omitted if they cannot be represented.
>>>  include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
>>> --- a/localedata/locales/ga_IE	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/ga_IE	2018-07-17 17:55:49.000000000 +0000
>>> @@ -54,6 +54,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
>>> --- a/localedata/locales/gd_GB	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/gd_GB	2018-07-17 17:55:49.000000000 +0000
>>> @@ -47,6 +47,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
>>> --- a/localedata/locales/gu_IN	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/gu_IN	2018-07-17 17:55:49.000000000 +0000
>>> @@ -62,6 +62,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
>>> --- a/localedata/locales/gv_GB	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/gv_GB	2018-07-17 17:55:49.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
>>> --- a/localedata/locales/he_IL	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/he_IL	2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
>>> --- a/localedata/locales/hi_IN	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hi_IN	2018-07-17 17:55:49.000000000 +0000
>>> @@ -61,6 +61,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
>>> --- a/localedata/locales/hif_FJ	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hif_FJ	2018-07-17 17:55:49.000000000 +0000
>>> @@ -37,6 +37,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
>>> --- a/localedata/locales/hr_HR	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hr_HR	2018-07-17 17:55:49.000000000 +0000
>>> @@ -153,6 +153,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
>>> --- a/localedata/locales/ht_HT	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/ht_HT	2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
>>> --- a/localedata/locales/hu_HU	2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hu_HU	2018-07-17 17:55:49.000000000 +0000
>>> @@ -478,6 +478,7 @@
>>>  <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
>>>  <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
>>> --- a/localedata/locales/hy_AM	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/hy_AM	2018-07-17 17:55:49.000000000 +0000
>>> @@ -77,6 +77,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
>>> --- a/localedata/locales/id_ID	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/id_ID	2018-07-17 17:55:49.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
>>> --- a/localedata/locales/is_IS	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/is_IS	2018-07-17 17:55:49.000000000 +0000
>>> @@ -2161,6 +2161,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
>>> --- a/localedata/locales/it_IT	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/it_IT	2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
>>> --- a/localedata/locales/ja_JP	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/ja_JP	2018-07-17 17:55:49.000000000 +0000
>>> @@ -1682,6 +1682,7 @@
>>>  include "translit_combining";""
>>>  include "translit_cjk_variants";""
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
>>> --- a/localedata/locales/kk_KZ	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/kk_KZ	2018-07-17 17:55:50.000000000 +0000
>>> @@ -158,6 +158,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
>>> --- a/localedata/locales/km_KH	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/km_KH	2018-07-17 17:55:50.000000000 +0000
>>> @@ -873,6 +873,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
>>> --- a/localedata/locales/kn_IN	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/kn_IN	2018-07-17 17:55:50.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
>>> --- a/localedata/locales/ko_KR	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/ko_KR	2018-07-17 17:55:50.000000000 +0000
>>> @@ -6099,6 +6099,7 @@
>>>  include "translit_combining";""
>>>  include "translit_hangul";""
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
>>> --- a/localedata/locales/ks_IN	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/ks_IN	2018-07-17 17:55:50.000000000 +0000
>>> @@ -46,6 +46,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
>>> --- a/localedata/locales/kw_GB	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/kw_GB	2018-07-17 17:55:50.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
>>> --- a/localedata/locales/lb_LU	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/lb_LU	2018-07-17 17:55:50.000000000 +0000
>>> @@ -78,6 +78,7 @@
>>>  % LATIN SMALL LETTER E WITH CIRCUMFLEX
>>>  <U00EA> "<U0065><U005E>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
>>> --- a/localedata/locales/lg_UG	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/lg_UG	2018-07-17 17:55:50.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
>>> --- a/localedata/locales/lij_IT	2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/lij_IT	2018-07-17 17:55:50.000000000 +0000
>>> @@ -47,6 +47,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
>>> --- a/localedata/locales/ln_CD	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ln_CD	2018-07-17 17:55:50.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
>>> --- a/localedata/locales/lo_LA	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/lo_LA	2018-07-17 17:55:50.000000000 +0000
>>> @@ -51,6 +51,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
>>> --- a/localedata/locales/lt_LT	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/lt_LT	2018-07-17 17:55:50.000000000 +0000
>>> @@ -77,6 +77,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
>>> --- a/localedata/locales/lv_LV	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/lv_LV	2018-07-17 17:55:50.000000000 +0000
>>> @@ -2122,6 +2122,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
>>> --- a/localedata/locales/mg_MG	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mg_MG	2018-07-17 17:55:50.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>>  % Accents are simply omitted if they cannot be represented.
>>>  include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
>>> --- a/localedata/locales/mhr_RU	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mhr_RU	2018-07-17 17:55:50.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
>>> --- a/localedata/locales/mk_MK	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mk_MK	2018-07-17 17:55:50.000000000 +0000
>>> @@ -49,6 +49,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
>>> --- a/localedata/locales/ml_IN	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ml_IN	2018-07-17 17:55:50.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>>  translit_start
>>>  include     "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>  %
>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
>>> --- a/localedata/locales/ms_MY	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ms_MY	2018-07-17 17:55:50.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
>>> --- a/localedata/locales/mt_MT	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mt_MT	2018-07-17 17:55:50.000000000 +0000
>>> @@ -47,6 +47,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
>>> --- a/localedata/locales/nan_TW@latin	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nan_TW@latin	2018-07-17 17:55:50.000000000 +0000
>>> @@ -53,6 +53,7 @@
>>>  % accents are simply omitted if they cannot be represented.
>>>  include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
>>> --- a/localedata/locales/nb_NO	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nb_NO	2018-07-17 17:55:50.000000000 +0000
>>> @@ -154,6 +154,7 @@
>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
>>> --- a/localedata/locales/ne_NP	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ne_NP	2018-07-17 17:55:50.000000000 +0000
>>> @@ -43,6 +43,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
>>> --- a/localedata/locales/nhn_MX	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nhn_MX	2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
>>> --- a/localedata/locales/niu_NU	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/niu_NU	2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
>>> --- a/localedata/locales/niu_NZ	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/niu_NZ	2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
>>> --- a/localedata/locales/nl_NL	2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nl_NL	2018-07-17 17:55:51.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
>>> --- a/localedata/locales/nr_ZA	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/nr_ZA	2018-07-17 17:55:51.000000000 +0000
>>> @@ -66,6 +66,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
>>> --- a/localedata/locales/oc_FR	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/oc_FR	2018-07-17 17:55:51.000000000 +0000
>>> @@ -62,6 +62,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
>>> --- a/localedata/locales/om_KE	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/om_KE	2018-07-17 17:55:51.000000000 +0000
>>> @@ -140,6 +140,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
>>> --- a/localedata/locales/or_IN	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/or_IN	2018-07-17 17:55:51.000000000 +0000
>>> @@ -62,6 +62,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
>>> --- a/localedata/locales/os_RU	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/os_RU	2018-07-17 17:55:51.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
>>> --- a/localedata/locales/pa_IN	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pa_IN	2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>>  translit_start
>>>  include     "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
>>> --- a/localedata/locales/pa_PK	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pa_PK	2018-07-17 17:55:51.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>  % Farsi yeh -> yeh
>>>  <U06CC> "<U064A>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
>>> --- a/localedata/locales/pl_PL	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pl_PL	2018-07-17 17:55:51.000000000 +0000
>>> @@ -142,6 +142,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
>>> --- a/localedata/locales/pt_PT	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pt_PT	2018-07-17 17:55:51.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
>>> --- a/localedata/locales/quz_PE	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/quz_PE	2018-07-17 17:55:51.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
>>> --- a/localedata/locales/ro_RO	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/ro_RO	2018-07-17 17:55:51.000000000 +0000
>>> @@ -144,6 +144,7 @@
>>>  <U0162> "<U021A>";"<U0054>"
>>>  <U0163> "<U021B>";"<U0074>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
>>> --- a/localedata/locales/ru_RU	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/ru_RU	2018-07-17 17:55:51.000000000 +0000
>>> @@ -74,6 +74,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
>>> --- a/localedata/locales/rw_RW	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/rw_RW	2018-07-17 17:55:51.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
>>> --- a/localedata/locales/sa_IN	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sa_IN	2018-07-17 17:55:51.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
>>> --- a/localedata/locales/sd_IN	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sd_IN	2018-07-17 17:55:51.000000000 +0000
>>> @@ -46,6 +46,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
>>> --- a/localedata/locales/sd_IN@devanagari	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sd_IN@devanagari	2018-07-17 17:55:51.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
>>> --- a/localedata/locales/sd_PK	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sd_PK	2018-07-17 17:55:51.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
>>> --- a/localedata/locales/se_NO	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/se_NO	2018-07-17 17:55:51.000000000 +0000
>>> @@ -205,6 +205,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
>>> --- a/localedata/locales/sgs_LT	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sgs_LT	2018-07-17 17:55:52.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>  copy "i18n"
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
>>> --- a/localedata/locales/si_LK	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/si_LK	2018-07-17 17:55:52.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
>>> --- a/localedata/locales/sk_SK	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sk_SK	2018-07-17 17:55:52.000000000 +0000
>>> @@ -68,6 +68,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
>>> --- a/localedata/locales/sl_SI	2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sl_SI	2018-07-17 17:55:52.000000000 +0000
>>> @@ -91,6 +91,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
>>> --- a/localedata/locales/sm_WS	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sm_WS	2018-07-17 17:55:52.000000000 +0000
>>> @@ -37,6 +37,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
>>> --- a/localedata/locales/so_SO	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/so_SO	2018-07-17 17:55:52.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
>>> --- a/localedata/locales/sq_AL	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sq_AL	2018-07-17 17:55:52.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
>>> --- a/localedata/locales/ss_ZA	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/ss_ZA	2018-07-17 17:55:52.000000000 +0000
>>> @@ -68,6 +68,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
>>> --- a/localedata/locales/st_ZA	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/st_ZA	2018-07-17 17:55:52.000000000 +0000
>>> @@ -64,6 +64,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
>>> --- a/localedata/locales/sv_SE	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sv_SE	2018-07-17 17:55:52.000000000 +0000
>>> @@ -139,6 +139,7 @@
>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
>>> --- a/localedata/locales/sw_KE	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sw_KE	2018-07-17 17:55:52.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
>>> --- a/localedata/locales/ta_IN	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/ta_IN	2018-07-17 17:55:52.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
>>> --- a/localedata/locales/te_IN	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/te_IN	2018-07-17 17:55:52.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
>>> --- a/localedata/locales/th_TH	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/th_TH	2018-07-17 17:55:52.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
>>> --- a/localedata/locales/ti_ET	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/ti_ET	2018-07-17 17:55:52.000000000 +0000
>>> @@ -866,6 +866,7 @@
>>>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
>>>
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  %
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
>>> --- a/localedata/locales/tn_ZA	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/tn_ZA	2018-07-17 17:55:52.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
>>> --- a/localedata/locales/to_TO	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/to_TO	2018-07-17 17:55:52.000000000 +0000
>>> @@ -36,6 +36,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
>>> --- a/localedata/locales/tpi_PG	2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/tpi_PG	2018-07-17 17:55:52.000000000 +0000
>>> @@ -37,6 +37,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
>>> --- a/localedata/locales/tr_TR	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/tr_TR	2018-07-17 17:55:52.000000000 +0000
>>> @@ -2430,6 +2430,7 @@
>>>
>>>  % TURKISH LIRA SIGN
>>>  <U20BA> "<U0054><U004C>"
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
>>> --- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
>>> +++ b/localedata/locales/translit_cyrillic	2018-07-17 17:55:52.000000000 +0000
>>> @@ -0,0 +1,151 @@
>>> +escape_char /
>>> +comment_char %
>>> +
>>> +% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>> +% Generated from UnicodeData.txt with
>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>> +% Up to three characters are required to do a reversible transliteration.
>>> +
>>> +LC_CTYPE
>>> +
>>> +translit_start
>>> +
>>> +
>>> +% CYRILLIC CAPITAL LETTER IO
>>> +<U0401> "<U0059><U004F>";<U0059>
>>> +% CYRILLIC CAPITAL LETTER A
>>> +<U0410> <U0041>
>>> +% CYRILLIC CAPITAL LETTER BE
>>> +<U0411> <U0042>
>>> +% CYRILLIC CAPITAL LETTER VE
>>> +<U0412> <U0056>
>>> +% CYRILLIC CAPITAL LETTER GHE
>>> +<U0413> <U0047>
>>> +% CYRILLIC CAPITAL LETTER DE
>>> +<U0414> <U0044>
>>> +% CYRILLIC CAPITAL LETTER IE
>>> +<U0415> <U0045>
>>> +% CYRILLIC CAPITAL LETTER ZHE
>>> +<U0416> "<U005A><U0048>";<U005A>
>>> +% CYRILLIC CAPITAL LETTER ZE
>>> +<U0417> <U005A>
>>> +% CYRILLIC CAPITAL LETTER I
>>> +<U0418> <U0049>
>>> +% CYRILLIC CAPITAL LETTER SHORT I
>>> +<U0419> <U004A>
>>> +% CYRILLIC CAPITAL LETTER KA
>>> +<U041A> <U004B>
>>> +% CYRILLIC CAPITAL LETTER EL
>>> +<U041B> <U004C>
>>> +% CYRILLIC CAPITAL LETTER EM
>>> +<U041C> <U004D>
>>> +% CYRILLIC CAPITAL LETTER EN
>>> +<U041D> <U004E>
>>> +% CYRILLIC CAPITAL LETTER O
>>> +<U041E> <U004F>
>>> +% CYRILLIC CAPITAL LETTER PE
>>> +<U041F> <U0050>
>>> +% CYRILLIC CAPITAL LETTER ER
>>> +<U0420> <U0052>
>>> +% CYRILLIC CAPITAL LETTER ES
>>> +<U0421> <U0053>
>>> +% CYRILLIC CAPITAL LETTER TE
>>> +<U0422> <U0054>
>>> +% CYRILLIC CAPITAL LETTER U
>>> +<U0423> <U0055>
>>> +% CYRILLIC CAPITAL LETTER EF
>>> +<U0424> <U0046>
>>> +% CYRILLIC CAPITAL LETTER HA
>>> +<U0425> <U0058>
>>> +% CYRILLIC CAPITAL LETTER TSE
>>> +<U0426> "<U0043><U005A>";<U0043>
>>> +% CYRILLIC CAPITAL LETTER CHE
>>> +<U0427> "<U0043><U0048>";<U0043>
>>> +% CYRILLIC CAPITAL LETTER SHA
>>> +<U0428> "<U0053><U0048>";<U0053>
>>> +% CYRILLIC CAPITAL LETTER SHCHA
>>> +<U0429> "<U0053><U0048><U0048>";<U0053>
>>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>>> +<U042A> "<U0060><U0060>";<U0060>
>>> +% CYRILLIC CAPITAL LETTER YERU
>>> +<U042B> "<U0059><U0027>";<U0059>
>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN
>>> +<U042C> <U0060>
>>> +% CYRILLIC CAPITAL LETTER E
>>> +<U042D> "<U0045><U0060>";<U0045>
>>> +% CYRILLIC CAPITAL LETTER YU
>>> +<U042E> "<U0059><U0055>";<U0059>
>>> +% CYRILLIC CAPITAL LETTER YA
>>> +<U042F> "<U0059><U0041>";<U0059>
>>> +% CYRILLIC SMALL LETTER A
>>> +<U0430> <U0061>
>>> +% CYRILLIC SMALL LETTER BE
>>> +<U0431> <U0062>
>>> +% CYRILLIC SMALL LETTER VE
>>> +<U0432> <U0076>
>>> +% CYRILLIC SMALL LETTER GHE
>>> +<U0433> <U0067>
>>> +% CYRILLIC SMALL LETTER DE
>>> +<U0434> <U0064>
>>> +% CYRILLIC SMALL LETTER IE
>>> +<U0435> <U0065>
>>> +% CYRILLIC SMALL LETTER ZHE
>>> +<U0436> "<U007A><U0068>";<U007A>
>>> +% CYRILLIC SMALL LETTER ZE
>>> +<U0437> <U007A>
>>> +% CYRILLIC SMALL LETTER I
>>> +<U0438> <U0069>
>>> +% CYRILLIC SMALL LETTER SHORT I
>>> +<U0439> <U006A>
>>> +% CYRILLIC SMALL LETTER KA
>>> +<U043A> <U006B>
>>> +% CYRILLIC SMALL LETTER EL
>>> +<U043B> <U006C>
>>> +% CYRILLIC SMALL LETTER EM
>>> +<U043C> <U006D>
>>> +% CYRILLIC SMALL LETTER EN
>>> +<U043D> <U006E>
>>> +% CYRILLIC SMALL LETTER O
>>> +<U043E> <U006F>
>>> +% CYRILLIC SMALL LETTER PE
>>> +<U043F> <U0070>
>>> +% CYRILLIC SMALL LETTER ER
>>> +<U0440> <U0072>
>>> +% CYRILLIC SMALL LETTER ES
>>> +<U0441> <U0073>
>>> +% CYRILLIC SMALL LETTER TE
>>> +<U0442> <U0074>
>>> +% CYRILLIC SMALL LETTER U
>>> +<U0443> <U0075>
>>> +% CYRILLIC SMALL LETTER EF
>>> +<U0444> <U0066>
>>> +% CYRILLIC SMALL LETTER HA
>>> +<U0445> <U0078>
>>> +% CYRILLIC SMALL LETTER TSE
>>> +<U0446> "<U0063><U007A>";<U0063>
>>> +% CYRILLIC SMALL LETTER CHE
>>> +<U0447> "<U0063><U0068>";<U0063>
>>> +% CYRILLIC SMALL LETTER SHA
>>> +<U0448> "<U0073><U0068>";<U0073>
>>> +% CYRILLIC SMALL LETTER SHCHA
>>> +<U0449> "<U0073><U0068><U0068>";<U0073>
>>> +% CYRILLIC SMALL LETTER HARD SIGN
>>> +<U044A> "<U0060><U0060>";<U0060>
>>> +% CYRILLIC SMALL LETTER YERU
>>> +<U044B> "<U0079><U0027>";<U0079>
>>> +% CYRILLIC SMALL LETTER SOFT SIGN
>>> +<U044C> <U0060>
>>> +% CYRILLIC SMALL LETTER E
>>> +<U044D> "<U0065><U0060>";<U0065>
>>> +% CYRILLIC SMALL LETTER YU
>>> +<U044E> "<U0079><U0075>";<U0079>
>>> +% CYRILLIC SMALL LETTER YA
>>> +<U044F> "<U0079><U0061>";<U0079>
>>> +% CYRILLIC SMALL LETTER IO
>>> +<U0451> "<U0079><U006F>";<U0079>
>>> +
>>> +
>>> +translit_end
>>> +
>>> +END LC_CTYPE
>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
>>> --- a/localedata/locales/ts_ZA	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ts_ZA	2018-07-17 17:55:52.000000000 +0000
>>> @@ -64,6 +64,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
>>> --- a/localedata/locales/unm_US	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/unm_US	2018-07-17 17:55:52.000000000 +0000
>>> @@ -48,6 +48,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
>>> --- a/localedata/locales/ur_IN	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ur_IN	2018-07-17 17:55:53.000000000 +0000
>>> @@ -46,6 +46,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
>>> --- a/localedata/locales/ur_PK	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ur_PK	2018-07-17 17:55:53.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>  % Farsi yeh -> yeh
>>>  <U06CC> "<U064A>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
>>> --- a/localedata/locales/ve_ZA	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ve_ZA	2018-07-17 17:55:53.000000000 +0000
>>> @@ -67,6 +67,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
>>> --- a/localedata/locales/vi_VN	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/vi_VN	2018-07-17 17:55:53.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>  % dong sign -> d// -> dd
>>>  <U20AB> "<U0111>";"<U0064><U0064>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
>>> --- a/localedata/locales/wa_BE	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/wa_BE	2018-07-17 17:55:53.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>>  <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>>>  <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
>>> --- a/localedata/locales/wo_SN	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/wo_SN	2018-07-17 17:55:53.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>>  % Accents are simply omitted if they cannot be represented.
>>>  include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
>>> --- a/localedata/locales/xh_ZA	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/xh_ZA	2018-07-17 17:55:53.000000000 +0000
>>> @@ -66,6 +66,7 @@
>>>
>>>  translit_start
>>>  include "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
>>> --- a/localedata/locales/yi_US	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/yi_US	2018-07-17 17:55:53.000000000 +0000
>>> @@ -73,6 +73,7 @@
>>>  <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>>>  <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>>>  <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  END LC_CTYPE
>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
>>> --- a/localedata/locales/zh_CN	2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/zh_CN	2018-07-17 17:55:53.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>
>>>  class	"hanzi"; /
>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
>>> --- a/localedata/locales/zu_ZA	2018-07-17 17:49:22.000000000 +0000
>>> +++ b/localedata/locales/zu_ZA	2018-07-17 17:55:53.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>>  translit_start
>>>  include  "translit_combining";""
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  END LC_CTYPE
>>>
>>>
>>>



* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-03  9:32         ` Egor Kobylkin
@ 2018-10-05  8:43           ` Marko Myllynen
  2018-10-05  9:20           ` Rafal Luzynski
  1 sibling, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-05  8:43 UTC (permalink / raw)
  To: Egor Kobylkin, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi Egor,

Thanks for your patience with this one.

On 2018-10-03 12:32, Egor Kobylkin wrote:
> On 03.10.2018 11:19, Keld Simonsen wrote:
>>
>> Please note that transliteration of Cyrillic to Latin is not universal.
>> There are different schemes for, for example, German, English and Danish, and
>> there is also an ISO standard for it.
> 
> Thanks for your feedback, Keld!
> 
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?
> 
> That is:
> - In the case that there is a different preferred cyrillic
> transliteration table for any specific locale their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default cyrillic
> transliteration altogether.

The Wikipedia article https://en.wikipedia.org/wiki/ISO_9 helps to
understand that ISO 9:1995 and GOST 7.79-2000 System A are identical so
perhaps you could mention both ISO 9 and the Wikipedia article in the
commit log. translit_cyrillic includes every transliteration defined in
ISO 9:1995 and GOST 7.79-2000, correct?

For those locales which already have Cyrillic transliteration defined, I
think it would be best to leave them as-is (as you've done) unless there
are some issues with them; there's probably a good reason why the rules
were added in the first place.

For other locales, using ISO 9 instead of not doing transliteration at
all may not be entirely correct but I'd suppose it's better to provide
at least some sort of transliteration (even if not entirely correct)
than sequences of question marks. But as you say, locale maintainers may
know better the case for individual locales.
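
To make the "some transliteration versus question marks" point concrete,
here is a minimal Python sketch. It is not glibc code and does not model
how iconv //TRANSLIT is implemented; the names to_ascii and
TRANSLIT_SUBSET are made up for this illustration. The five mappings are
the first-choice replacements copied from the proposed translit_cyrillic
table, just enough for the first word of the test sentence.

```python
# First-choice mappings copied from the proposed translit_cyrillic table.
# Only the five letters needed for the sample word are listed here.
TRANSLIT_SUBSET = {
    "\u0441": "s",    # CYRILLIC SMALL LETTER ES
    "\u044a": "``",   # CYRILLIC SMALL LETTER HARD SIGN
    "\u0435": "e",    # CYRILLIC SMALL LETTER IE
    "\u0448": "sh",   # CYRILLIC SMALL LETTER SHA
    "\u044c": "`",    # CYRILLIC SMALL LETTER SOFT SIGN
}


def to_ascii(text, table):
    """Replace each non-ASCII character via the table, or with '?'."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)                   # already ASCII, keep as-is
        else:
            out.append(table.get(ch, "?"))   # no rule: fall back to '?'
    return "".join(out)


word = "\u0441\u044a\u0435\u0448\u044c"      # first word of the test sentence
print(to_ascii(word, {}))                    # no table available
print(to_ascii(word, TRANSLIT_SUBSET))       # with the proposed mappings
```

Without a table the word collapses to question marks; with the subset it
stays readable, which is the trade-off described above.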

Regarding the language-specific differences Keld mentioned, the Finnish
Wikipedia article on transliteration gives an example: see the table on
the right at https://fi.wikipedia.org/wiki/Siirtokirjoitus for the
Russian / international / Finnish / Swedish / English / French / German /
Polish / phonetic transliteration of a Russian name. (The table also
shows that ASCII letters are not enough for correct transliteration in
some languages.)

Some of the differences and language-specific aspects are probably
impossible to take fully into account within the locale system we have
today. For example, in Finnish (the tables at
http://jkorpela.fi/iso9.html8 and
https://fi.wikipedia.org/wiki/Ven%C3%A4j%C3%A4n_translitterointi might
also be helpful):

1) transliteration of Russian is mostly as per ISO 9 but with national
differences defined in SFS 4900
2) the transliterations of Russian and Ukrainian names have some slight
differences according to http://jkorpela.fi/iso9.html8
3) transliteration of a letter depends on its position within a word or
pronunciation of adjacent letters, for example U+0435 becomes U+0065 (e)
except when at the beginning of a word it becomes U+006A U+0065 (je)
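
Point 3) can be sketched in a few lines of Python. This is only an
illustration of the single rule mentioned above (U+0435 at the beginning
of a word versus elsewhere), not an implementation of SFS 4900; the
function name is hypothetical, "beginning of a word" is approximated as
"not preceded by a letter", and all other characters are passed through
unchanged.

```python
def translit_ie_positional(text):
    """Map U+0435 to 'je' at the start of a word and to 'e' elsewhere.

    Assumption: a word starts wherever the previous character is not
    alphabetic (or there is no previous character at all).
    """
    out = []
    prev = ""
    for ch in text:
        if ch == "\u0435":                       # CYRILLIC SMALL LETTER IE
            out.append("e" if prev.isalpha() else "je")
        else:
            out.append(ch)                       # everything else untouched
        prev = ch
    return "".join(out)


# U+0435 U+043B: the letter IE is word-initial here.
print(translit_ie_positional("\u0435\u043b"))
# U+043B U+0435 U+0441: the letter IE is in the middle of the word.
print(translit_ie_positional("\u043b\u0435\u0441"))
```

The replacement depends on the preceding character, whereas the entries
in the proposed table pair a code point with a fixed list of
alternatives, so a rule like this would need a compromise there.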

Hopefully we'll hear comments from others as well. Once your patch is
merged, I'll try to come up with the needed locale-specific changes for
fi_FI. Some of the differences referred to in 1) above are
straightforward to implement, but for 2) and 3) some compromises will
probably need to be made, unfortunately.

Thanks,

>> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>>> Ping.
>>>
>>> In the absence of feedback, I am wondering whether anything is missing
>>> from this patch from the maintainers' standpoint. More than two months
>>> have passed since the original submission.
>>>
>>> If I can be of assistance, please do not hesitate to contact me,
>>> Egor Kobylkin
>>>
>>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>>> Dear locale maintainers,
>>>>
>>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>>>
>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>>
>>>> add Cyrillic transliteration table translit_cyrillic file
>>>>
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>>
>>>> to localedata/locales/ and include it in all your locales going forward.
>>>>
>>>> Patch included inline below.
>>>>
>>>> This is a re-submission for the consideration for 2.29 on a request from
>>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>>
>>>> From this patch I have excluded locales that already mention cyrillic or
>>>> have a transliteration table for it:
>>>> az_AZ
>>>> iso14651_t1_common
>>>> ky_KG
>>>> mn_MN
>>>> sr_RS
>>>> tg_TJ
>>>> tk_TM
>>>> tt_RU
>>>> uk_UA
>>>> uz_UZ
>>>> uz_UZ@cyrillic
>>>>
>>>> Their maintainers are requested to make an explicit decision on how and
>>>> whether at all to include this patch.
>>>>
>>>>
>>>>
>>>> Current bug effect:
>>>>
>>>> The glibc wiki explicitly lists this use case as the test example
>>>>
>>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>>
>>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
>>>>
>>>> currently it fails on Cyrillic texts in most locales including ru_RU [1]
>>>> [8] [9]:
>>>>
>>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
>>>>
>>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>>
>>>>  - It produces a string of question marks and spaces.
>>>>
>>>> This is what it should produce and it does so after the patch applied:
>>>>
>>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
>>>>
>>>>
>>>> Root problem and the fix:
>>>>
>>>> The root problem is the missing transliteration table that I am
>>>> supplying here. Furthermore it has to be referenced/included into the
>>>> active locale at the compilation time to be used by iconv.
>>>>
>>>>
>>>>
>>>> COMMIT MESSAGE:
>>>> This translit_cyrillic table enables conversion (e.g. with iconv) from a
>>>> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>>>>
>>>> While a UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
>>>> a transliteration contains only ASCII codes but can still be read by a
>>>> native speaker. Among other things, it is useful for processing Cyrillic
>>>> texts and filenames by programs or on systems that are not specifically
>>>> prepared to work with Cyrillic, don't have the corresponding fonts
>>>> installed, or can't handle UTF-8.
>>>>
>>>> The transliteration table itself is attached as a file translit_cyrillic
>>>> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
>>>> (Federal Agency on Technical Regulating and Metrology Of Russian
>>>> Federation [2]). Technically an independent but identical source [3] was
>>>> used and prepared in a spreadsheet [6].
>>>>
>>>> The documentation suggests that the transliteration tables inclusion is
>>>> done by adding *include "translit_cyrillic";""* string into LC_CTYPE
>>>> translit_start section
>>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>>> In practice, I searched for all locales that have a
>>>> translit_start/translit_end stanza and generated a patch for them.
>>>>
>>>> The Cyrillic transliteration of e.g. Russian text may have already
>>>> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
>>>> have their transliteration tables included inline.
>>>> However it would not be the standard Russian Cyrillic transliteration as
>>>> described above.
>>>> I am excluding these locales from this proposed patch. I have written
>>>> directly to locale maintainer emails listed in the files. Volodymyr
>>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>>> Данило Шеган <danilo@gnome.org>  (sr_YU, sr_CS) have confirmed the
>>>> exclusion.
>>>>
>>>> Links:
>>>>
>>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> [2] GOST 7.79-2000 official source
>>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>>> available in low quality gif format)
>>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>>> http://www.yfermer.ru/specifications/285821.html
>>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>>> [6] Spreadsheet for generating translit_cyrillic
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>>> [9] translit-test-input.txt
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>>
>>>> Best regards,
>>>> Egor Kobylkin
>>>>
>>>> ---
>>>> 2018-07-17  Egor Kobylkin  <egor@kobylkin.com>
>>>>
>>>> 	[BZ #2872]
>>>> 	* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>>> table from Cyrillic to Latin.
>>>> 	* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>>> section.
>>>> 	* locales/aa_DJ: likewise
>>>> 	* locales/af_ZA: likewise
>>>> 	* locales/ak_GH: likewise
>>>> 	* locales/am_ET: likewise
>>>> 	* locales/ar_EG: likewise
>>>> 	* locales/be_BY: likewise
>>>> 	* locales/bem_ZM: likewise
>>>> 	* locales/ber_DZ: likewise
>>>> 	* locales/ber_MA: likewise
>>>> 	* locales/bg_BG: likewise
>>>> 	* locales/bi_VU: likewise
>>>> 	* locales/bn_BD: likewise
>>>> 	* locales/bo_CN: likewise
>>>> 	* locales/ca_ES: likewise
>>>> 	* locales/ce_RU: likewise
>>>> 	* locales/cs_CZ: likewise
>>>> 	* locales/cv_RU: likewise
>>>> 	* locales/cy_GB: likewise
>>>> 	* locales/da_DK: likewise
>>>> 	* locales/de_DE: likewise
>>>> 	* locales/dv_MV: likewise
>>>> 	* locales/dz_BT: likewise
>>>> 	* locales/el_GR: likewise
>>>> 	* locales/en_GB: likewise
>>>> 	* locales/en_NG: likewise
>>>> 	* locales/en_ZM: likewise
>>>> 	* locales/es_CU: likewise
>>>> 	* locales/es_ES: likewise
>>>> 	* locales/et_EE: likewise
>>>> 	* locales/fa_IR: likewise
>>>> 	* locales/ff_SN: likewise
>>>> 	* locales/fi_FI: likewise
>>>> 	* locales/fr_FR: likewise
>>>> 	* locales/ga_IE: likewise
>>>> 	* locales/gd_GB: likewise
>>>> 	* locales/gu_IN: likewise
>>>> 	* locales/gv_GB: likewise
>>>> 	* locales/he_IL: likewise
>>>> 	* locales/hi_IN: likewise
>>>> 	* locales/hif_FJ: likewise
>>>> 	* locales/hr_HR: likewise
>>>> 	* locales/ht_HT: likewise
>>>> 	* locales/hu_HU: likewise
>>>> 	* locales/hy_AM: likewise
>>>> 	* locales/id_ID: likewise
>>>> 	* locales/is_IS: likewise
>>>> 	* locales/it_IT: likewise
>>>> 	* locales/ja_JP: likewise
>>>> 	* locales/kk_KZ: likewise
>>>> 	* locales/km_KH: likewise
>>>> 	* locales/kn_IN: likewise
>>>> 	* locales/ko_KR: likewise
>>>> 	* locales/ks_IN: likewise
>>>> 	* locales/kw_GB: likewise
>>>> 	* locales/lb_LU: likewise
>>>> 	* locales/lg_UG: likewise
>>>> 	* locales/lij_IT: likewise
>>>> 	* locales/ln_CD: likewise
>>>> 	* locales/lo_LA: likewise
>>>> 	* locales/lt_LT: likewise
>>>> 	* locales/lv_LV: likewise
>>>> 	* locales/mg_MG: likewise
>>>> 	* locales/mhr_RU: likewise
>>>> 	* locales/mk_MK: likewise
>>>> 	* locales/ml_IN: likewise
>>>> 	* locales/ms_MY: likewise
>>>> 	* locales/mt_MT: likewise
>>>> 	* locales/nan_TW@latin: likewise
>>>> 	* locales/nb_NO: likewise
>>>> 	* locales/ne_NP: likewise
>>>> 	* locales/nhn_MX: likewise
>>>> 	* locales/niu_NU: likewise
>>>> 	* locales/niu_NZ: likewise
>>>> 	* locales/nl_NL: likewise
>>>> 	* locales/nr_ZA: likewise
>>>> 	* locales/oc_FR: likewise
>>>> 	* locales/om_KE: likewise
>>>> 	* locales/or_IN: likewise
>>>> 	* locales/os_RU: likewise
>>>> 	* locales/pa_IN: likewise
>>>> 	* locales/pa_PK: likewise
>>>> 	* locales/pl_PL: likewise
>>>> 	* locales/pt_PT: likewise
>>>> 	* locales/quz_PE: likewise
>>>> 	* locales/ro_RO: likewise
>>>> 	* locales/ru_RU: likewise
>>>> 	* locales/rw_RW: likewise
>>>> 	* locales/sa_IN: likewise
>>>> 	* locales/sd_IN: likewise
>>>> 	* locales/sd_IN@devanagari: likewise
>>>> 	* locales/sd_PK: likewise
>>>> 	* locales/se_NO: likewise
>>>> 	* locales/sgs_LT: likewise
>>>> 	* locales/si_LK: likewise
>>>> 	* locales/sk_SK: likewise
>>>> 	* locales/sl_SI: likewise
>>>> 	* locales/sm_WS: likewise
>>>> 	* locales/so_SO: likewise
>>>> 	* locales/sq_AL: likewise
>>>> 	* locales/ss_ZA: likewise
>>>> 	* locales/st_ZA: likewise
>>>> 	* locales/sv_SE: likewise
>>>> 	* locales/sw_KE: likewise
>>>> 	* locales/ta_IN: likewise
>>>> 	* locales/te_IN: likewise
>>>> 	* locales/th_TH: likewise
>>>> 	* locales/ti_ET: likewise
>>>> 	* locales/tn_ZA: likewise
>>>> 	* locales/to_TO: likewise
>>>> 	* locales/tpi_PG: likewise
>>>> 	* locales/tr_TR: likewise
>>>> 	* locales/ts_ZA: likewise
>>>> 	* locales/unm_US: likewise
>>>> 	* locales/ur_IN: likewise
>>>> 	* locales/ur_PK: likewise
>>>> 	* locales/ve_ZA: likewise
>>>> 	* locales/vi_VN: likewise
>>>> 	* locales/wa_BE: likewise
>>>> 	* locales/wo_SN: likewise
>>>> 	* locales/xh_ZA: likewise
>>>> 	* locales/yi_US: likewise
>>>> 	* locales/zh_CN: likewise
>>>> 	* locales/zu_ZA: likewise
>>>>
>>>>
>>>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>>>> --- a/localedata/locales/C	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/C	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -2292,6 +2292,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
>>>> --- a/localedata/locales/aa_DJ	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/aa_DJ	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
>>>> --- a/localedata/locales/af_ZA	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/af_ZA	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
>>>> --- a/localedata/locales/ak_GH	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ak_GH	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -56,6 +56,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>>> --- a/localedata/locales/am_ET	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/am_ET	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -1396,6 +1396,7 @@
>>>>  <U137A>    <U0060><U0039><U0030>
>>>>  <U137B>    <U0060><U0031><U0030><U0030>
>>>>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  %
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
>>>> --- a/localedata/locales/ar_EG	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ar_EG	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
>>>> --- a/localedata/locales/be_BY	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/be_BY	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
>>>> --- a/localedata/locales/bem_ZM	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bem_ZM	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
>>>> --- a/localedata/locales/ber_DZ	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_DZ	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -166,6 +166,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
>>>> --- a/localedata/locales/ber_MA	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_MA	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -86,6 +86,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
>>>> --- a/localedata/locales/bg_BG	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bg_BG	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
>>>> --- a/localedata/locales/bi_VU	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bi_VU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
>>>> --- a/localedata/locales/bn_BD	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bn_BD	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
>>>> --- a/localedata/locales/bo_CN	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bo_CN	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
>>>> --- a/localedata/locales/ca_ES	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ca_ES	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
>>>> --- a/localedata/locales/ce_RU	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ce_RU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
>>>> --- a/localedata/locales/cs_CZ	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/cs_CZ	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -2311,6 +2311,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
>>>> --- a/localedata/locales/cv_RU	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cv_RU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
>>>> --- a/localedata/locales/cy_GB	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cy_GB	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
>>>> --- a/localedata/locales/da_DK	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/da_DK	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -167,6 +167,7 @@
>>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
>>>> --- a/localedata/locales/de_DE	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/de_DE	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>>  % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
>>>>  <U201F> <U00AB>;<U0022>
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
>>>> --- a/localedata/locales/dv_MV	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dv_MV	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -52,6 +52,7 @@
>>>>  include "translit_combining";""
>>>>
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
>>>> --- a/localedata/locales/dz_BT	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dz_BT	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
>>>> --- a/localedata/locales/el_GR	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/el_GR	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
>>>> --- a/localedata/locales/en_GB	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_GB	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
>>>> --- a/localedata/locales/en_NG	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_NG	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -50,6 +50,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
>>>> --- a/localedata/locales/en_ZM	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/en_ZM	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
>>>> --- a/localedata/locales/es_CU	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_CU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
>>>> --- a/localedata/locales/es_ES	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_ES	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
>>>> --- a/localedata/locales/et_EE	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/et_EE	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
>>>> --- a/localedata/locales/fa_IR	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fa_IR	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -79,6 +79,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
>>>> --- a/localedata/locales/ff_SN	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/ff_SN	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
>>>> --- a/localedata/locales/fi_FI	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fi_FI	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -137,6 +137,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
>>>> --- a/localedata/locales/fr_FR	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/fr_FR	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>  % In France, accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
>>>> --- a/localedata/locales/ga_IE	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ga_IE	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -54,6 +54,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
>>>> --- a/localedata/locales/gd_GB	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gd_GB	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
>>>> --- a/localedata/locales/gu_IN	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gu_IN	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
>>>> --- a/localedata/locales/gv_GB	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gv_GB	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
>>>> --- a/localedata/locales/he_IL	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/he_IL	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
>>>> --- a/localedata/locales/hi_IN	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hi_IN	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -61,6 +61,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
>>>> --- a/localedata/locales/hif_FJ	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hif_FJ	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
>>>> --- a/localedata/locales/hr_HR	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hr_HR	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -153,6 +153,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
>>>> --- a/localedata/locales/ht_HT	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ht_HT	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
>>>> --- a/localedata/locales/hu_HU	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hu_HU	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -478,6 +478,7 @@
>>>>  <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
>>>>  <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
>>>> --- a/localedata/locales/hy_AM	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/hy_AM	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
>>>> --- a/localedata/locales/id_ID	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/id_ID	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
>>>> --- a/localedata/locales/is_IS	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/is_IS	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -2161,6 +2161,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
>>>> --- a/localedata/locales/it_IT	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/it_IT	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
>>>> --- a/localedata/locales/ja_JP	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ja_JP	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -1682,6 +1682,7 @@
>>>>  include "translit_combining";""
>>>>  include "translit_cjk_variants";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
>>>> --- a/localedata/locales/kk_KZ	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kk_KZ	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -158,6 +158,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
>>>> --- a/localedata/locales/km_KH	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/km_KH	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -873,6 +873,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
>>>> --- a/localedata/locales/kn_IN	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kn_IN	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
>>>> --- a/localedata/locales/ko_KR	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ko_KR	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -6099,6 +6099,7 @@
>>>>  include "translit_combining";""
>>>>  include "translit_hangul";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
>>>> --- a/localedata/locales/ks_IN	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ks_IN	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
>>>> --- a/localedata/locales/kw_GB	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kw_GB	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
>>>> --- a/localedata/locales/lb_LU	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lb_LU	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>>  % LATIN SMALL LETTER E WITH CIRCUMFLEX
>>>>  <U00EA> "<U0065><U005E>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
>>>> --- a/localedata/locales/lg_UG	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lg_UG	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
>>>> --- a/localedata/locales/lij_IT	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lij_IT	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
>>>> --- a/localedata/locales/ln_CD	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ln_CD	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
>>>> --- a/localedata/locales/lo_LA	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lo_LA	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -51,6 +51,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
>>>> --- a/localedata/locales/lt_LT	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lt_LT	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
>>>> --- a/localedata/locales/lv_LV	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lv_LV	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -2122,6 +2122,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
>>>> --- a/localedata/locales/mg_MG	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mg_MG	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>  % Accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
>>>> --- a/localedata/locales/mhr_RU	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mhr_RU	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
>>>> --- a/localedata/locales/mk_MK	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mk_MK	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
>>>> --- a/localedata/locales/ml_IN	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ml_IN	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include     "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>  %
>>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
>>>> --- a/localedata/locales/ms_MY	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ms_MY	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
>>>> --- a/localedata/locales/mt_MT	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mt_MT	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
>>>> --- a/localedata/locales/nan_TW@latin	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nan_TW@latin	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -53,6 +53,7 @@
>>>>  % accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
>>>> --- a/localedata/locales/nb_NO	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nb_NO	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -154,6 +154,7 @@
>>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
>>>> --- a/localedata/locales/ne_NP	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ne_NP	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
>>>> --- a/localedata/locales/nhn_MX	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nhn_MX	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
>>>> --- a/localedata/locales/niu_NU	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NU	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
>>>> --- a/localedata/locales/niu_NZ	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NZ	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
>>>> --- a/localedata/locales/nl_NL	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nl_NL	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
>>>> --- a/localedata/locales/nr_ZA	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/nr_ZA	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
>>>> --- a/localedata/locales/oc_FR	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/oc_FR	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
>>>> --- a/localedata/locales/om_KE	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/om_KE	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -140,6 +140,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
>>>> --- a/localedata/locales/or_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/or_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
>>>> --- a/localedata/locales/os_RU	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/os_RU	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
>>>> --- a/localedata/locales/pa_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include     "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
>>>> --- a/localedata/locales/pa_PK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_PK	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>  % Farsi yeh -> yeh
>>>>  <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
>>>> --- a/localedata/locales/pl_PL	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pl_PL	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -142,6 +142,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
>>>> --- a/localedata/locales/pt_PT	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pt_PT	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
>>>> --- a/localedata/locales/quz_PE	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/quz_PE	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
>>>> --- a/localedata/locales/ro_RO	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ro_RO	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -144,6 +144,7 @@
>>>>  <U0162> "<U021A>";"<U0054>"
>>>>  <U0163> "<U021B>";"<U0074>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
>>>> --- a/localedata/locales/ru_RU	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ru_RU	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -74,6 +74,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
>>>> --- a/localedata/locales/rw_RW	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/rw_RW	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
>>>> --- a/localedata/locales/sa_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sa_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
>>>> --- a/localedata/locales/sd_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
>>>> --- a/localedata/locales/sd_IN@devanagari	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN@devanagari	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
>>>> --- a/localedata/locales/sd_PK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_PK	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
>>>> --- a/localedata/locales/se_NO	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/se_NO	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -205,6 +205,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
>>>> --- a/localedata/locales/sgs_LT	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sgs_LT	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
>>>> --- a/localedata/locales/si_LK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/si_LK	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
>>>> --- a/localedata/locales/sk_SK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sk_SK	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
>>>> --- a/localedata/locales/sl_SI	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sl_SI	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -91,6 +91,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
>>>> --- a/localedata/locales/sm_WS	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sm_WS	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
>>>> --- a/localedata/locales/so_SO	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/so_SO	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
>>>> --- a/localedata/locales/sq_AL	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sq_AL	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
>>>> --- a/localedata/locales/ss_ZA	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ss_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
>>>> --- a/localedata/locales/st_ZA	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/st_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
>>>> --- a/localedata/locales/sv_SE	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sv_SE	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -139,6 +139,7 @@
>>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
>>>> --- a/localedata/locales/sw_KE	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sw_KE	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
>>>> --- a/localedata/locales/ta_IN	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ta_IN	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
>>>> --- a/localedata/locales/te_IN	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/te_IN	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
>>>> --- a/localedata/locales/th_TH	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/th_TH	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
>>>> --- a/localedata/locales/ti_ET	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ti_ET	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -866,6 +866,7 @@
>>>>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
>>>>
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  %
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
>>>> --- a/localedata/locales/tn_ZA	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tn_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
>>>> --- a/localedata/locales/to_TO	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/to_TO	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -36,6 +36,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
>>>> --- a/localedata/locales/tpi_PG	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tpi_PG	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
>>>> --- a/localedata/locales/tr_TR	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/tr_TR	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -2430,6 +2430,7 @@
>>>>
>>>>  % TURKISH LIRA SIGN
>>>>  <U20BA> "<U0054><U004C>"
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/translit_cyrillic
>>>> b/localedata/locales/translit_cyrillic
>>>> --- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000
>>>> +0000
>>>> +++ b/localedata/locales/translit_cyrillic	2018-07-17 17:55:52.000000000
>>>> +0000
>>>> @@ -0,0 +1,151 @@
>>>> +escape_char /
>>>> +comment_char %
>>>> +
>>>> +% Transliterations that converts cyrillic letters to ascii symbols
>>>> inspired by GOST 7.79-2000
>>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> +% Generated from UnicodeData.txt with
>>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> +% Up to three characters are required to do a reversible transliteration.
>>>> +
>>>> +LC_CTYPE
>>>> +
>>>> +translit_start
>>>> +
>>>> +
>>>> +% CYRILLIC CAPITAL LETTER IO
>>>> +<U0401> "<U0059><U004F>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER A
>>>> +<U0410> <U0041>
>>>> +% CYRILLIC CAPITAL LETTER BE
>>>> +<U0411> <U0042>
>>>> +% CYRILLIC CAPITAL LETTER VE
>>>> +<U0412> <U0056>
>>>> +% CYRILLIC CAPITAL LETTER GHE
>>>> +<U0413> <U0047>
>>>> +% CYRILLIC CAPITAL LETTER DE
>>>> +<U0414> <U0044>
>>>> +% CYRILLIC CAPITAL LETTER IE
>>>> +<U0415> <U0045>
>>>> +% CYRILLIC CAPITAL LETTER ZHE
>>>> +<U0416> "<U005A><U0048>";<U005A>
>>>> +% CYRILLIC CAPITAL LETTER ZE
>>>> +<U0417> <U005A>
>>>> +% CYRILLIC CAPITAL LETTER I
>>>> +<U0418> <U0049>
>>>> +% CYRILLIC CAPITAL LETTER SHORT I
>>>> +<U0419> <U004A>
>>>> +% CYRILLIC CAPITAL LETTER KA
>>>> +<U041A> <U004B>
>>>> +% CYRILLIC CAPITAL LETTER EL
>>>> +<U041B> <U004C>
>>>> +% CYRILLIC CAPITAL LETTER EM
>>>> +<U041C> <U004D>
>>>> +% CYRILLIC CAPITAL LETTER EN
>>>> +<U041D> <U004E>
>>>> +% CYRILLIC CAPITAL LETTER O
>>>> +<U041E> <U004F>
>>>> +% CYRILLIC CAPITAL LETTER PE
>>>> +<U041F> <U0050>
>>>> +% CYRILLIC CAPITAL LETTER ER
>>>> +<U0420> <U0052>
>>>> +% CYRILLIC CAPITAL LETTER ES
>>>> +<U0421> <U0053>
>>>> +% CYRILLIC CAPITAL LETTER TE
>>>> +<U0422> <U0054>
>>>> +% CYRILLIC CAPITAL LETTER U
>>>> +<U0423> <U0055>
>>>> +% CYRILLIC CAPITAL LETTER EF
>>>> +<U0424> <U0046>
>>>> +% CYRILLIC CAPITAL LETTER HA
>>>> +<U0425> <U0058>
>>>> +% CYRILLIC CAPITAL LETTER TSE
>>>> +<U0426> "<U0043><U005A>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER CHE
>>>> +<U0427> "<U0043><U0048>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER SHA
>>>> +<U0428> "<U0053><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER SHCHA
>>>> +<U0429> "<U0053><U0048><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>>>> +<U042A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC CAPITAL LETTER YERU
>>>> +<U042B> "<U0059><U0027>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN
>>>> +<U042C> <U0060>
>>>> +% CYRILLIC CAPITAL LETTER E
>>>> +<U042D> "<U0045><U0060>";<U0045>
>>>> +% CYRILLIC CAPITAL LETTER YU
>>>> +<U042E> "<U0059><U0055>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER YA
>>>> +<U042F> "<U0059><U0041>";<U0059>
>>>> +% CYRILLIC SMALL LETTER A
>>>> +<U0430> <U0061>
>>>> +% CYRILLIC SMALL LETTER BE
>>>> +<U0431> <U0062>
>>>> +% CYRILLIC SMALL LETTER VE
>>>> +<U0432> <U0076>
>>>> +% CYRILLIC SMALL LETTER GHE
>>>> +<U0433> <U0067>
>>>> +% CYRILLIC SMALL LETTER DE
>>>> +<U0434> <U0064>
>>>> +% CYRILLIC SMALL LETTER IE
>>>> +<U0435> <U0065>
>>>> +% CYRILLIC SMALL LETTER ZHE
>>>> +<U0436> "<U007A><U0068>";<U007A>
>>>> +% CYRILLIC SMALL LETTER ZE
>>>> +<U0437> <U007A>
>>>> +% CYRILLIC SMALL LETTER I
>>>> +<U0438> <U0069>
>>>> +% CYRILLIC SMALL LETTER SHORT I
>>>> +<U0439> <U006A>
>>>> +% CYRILLIC SMALL LETTER KA
>>>> +<U043A> <U006B>
>>>> +% CYRILLIC SMALL LETTER EL
>>>> +<U043B> <U006C>
>>>> +% CYRILLIC SMALL LETTER EM
>>>> +<U043C> <U006D>
>>>> +% CYRILLIC SMALL LETTER EN
>>>> +<U043D> <U006E>
>>>> +% CYRILLIC SMALL LETTER O
>>>> +<U043E> <U006F>
>>>> +% CYRILLIC SMALL LETTER PE
>>>> +<U043F> <U0070>
>>>> +% CYRILLIC SMALL LETTER ER
>>>> +<U0440> <U0072>
>>>> +% CYRILLIC SMALL LETTER ES
>>>> +<U0441> <U0073>
>>>> +% CYRILLIC SMALL LETTER TE
>>>> +<U0442> <U0074>
>>>> +% CYRILLIC SMALL LETTER U
>>>> +<U0443> <U0075>
>>>> +% CYRILLIC SMALL LETTER EF
>>>> +<U0444> <U0066>
>>>> +% CYRILLIC SMALL LETTER HA
>>>> +<U0445> <U0078>
>>>> +% CYRILLIC SMALL LETTER TSE
>>>> +<U0446> "<U0063><U007A>";<U0063>
>>>> +% CYRILLIC SMALL LETTER CHE
>>>> +<U0447> "<U0063><U0068>";<U0063>
>>>> +% CYRILLIC SMALL LETTER SHA
>>>> +<U0448> "<U0073><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER SHCHA
>>>> +<U0449> "<U0073><U0068><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER HARD SIGN
>>>> +<U044A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC SMALL LETTER YERU
>>>> +<U044B> "<U0079><U0027>";<U0079>
>>>> +% CYRILLIC SMALL LETTER SOFT SIGN
>>>> +<U044C> <U0060>
>>>> +% CYRILLIC SMALL LETTER E
>>>> +<U044D> "<U0065><U0060>";<U0065>
>>>> +% CYRILLIC SMALL LETTER YU
>>>> +<U044E> "<U0079><U0075>";<U0079>
>>>> +% CYRILLIC SMALL LETTER YA
>>>> +<U044F> "<U0079><U0061>";<U0079>
>>>> +% CYRILLIC SMALL LETTER IO
>>>> +<U0451> "<U0079><U006F>";<U0079>
>>>> +
>>>> +
>>>> +translit_end
>>>> +
>>>> +END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
>>>> --- a/localedata/locales/ts_ZA	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ts_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
>>>> --- a/localedata/locales/unm_US	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/unm_US	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -48,6 +48,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
>>>> --- a/localedata/locales/ur_IN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_IN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
>>>> --- a/localedata/locales/ur_PK	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_PK	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>  % Farsi yeh -> yeh
>>>>  <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
>>>> --- a/localedata/locales/ve_ZA	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ve_ZA	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -67,6 +67,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
>>>> --- a/localedata/locales/vi_VN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/vi_VN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>  % dong sign -> d// -> dd
>>>>  <U20AB> "<U0111>";"<U0064><U0064>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
>>>> --- a/localedata/locales/wa_BE	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wa_BE	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>  <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>>>>  <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
>>>> --- a/localedata/locales/wo_SN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wo_SN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>  % Accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
>>>> --- a/localedata/locales/xh_ZA	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/xh_ZA	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
>>>> --- a/localedata/locales/yi_US	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/yi_US	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>>  <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>>>>  <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>>>>  <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
>>>> --- a/localedata/locales/zh_CN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/zh_CN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  class	"hanzi"; /
>>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
>>>> --- a/localedata/locales/zu_ZA	2018-07-17 17:49:22.000000000 +0000
>>>> +++ b/localedata/locales/zu_ZA	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>>
>>>>
> 


-- 
Marko Myllynen

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-03  9:32         ` Egor Kobylkin
  2018-10-05  8:43           ` Marko Myllynen
@ 2018-10-05  9:20           ` Rafal Luzynski
  2018-10-05 10:36             ` Egor Kobylkin
       [not found]             ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
  1 sibling, 2 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-05  9:20 UTC (permalink / raw)
  To: Egor Kobylkin, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> On 03.10.2018 11:19, Keld Simonsen wrote:
> > Hi
> >
> > Please note that transliteration of Cyrillic to Latin is not universal.
> > There are different schemes for, for example, German, English and Danish, and
> > there is also an ISO standard for it.
>
> Thanks for your feedback, Keld!
>
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?

I think it is about me so I must reply.  I am sorry about that and the sole
reason is my lack of time.  I'm just a volunteer here, that means it's not
my regular job to work on locale data nor anything in glibc nor in any other
open source project.  I do these things only in my free time, of which I
don't have much.  Of course you will see my contributions here and there but
they are either trivial or take me months to complete.  Your patches are on
my radar but I can't give any ETA for them.  Of course, there are other people
around here and they are all welcome to come and join.

> That is:
> - In the case that there is a different preferred cyrillic
> transliteration table for any specific locale their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default cyrillic
> transliteration altogether.

As Keld wrote, there are probably separate rules for every language so
I don't think you should treat your rules as universal and include them
in every locale.  At first sight, it seems to me they work only for English
(as a destination locale).  Also, although it is called "transliteration
from Cyrillic" it seems that it covers only Russian alphabet.  What about
other languages which use Cyrillic alphabet but add their own diacritic
characters?  Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
Mari, Ossetian, Yakut, Tatar, and more.  What about languages which use
Cyrillic alphabet but transliterate their respective letters in a different
way than Russian?  For example, Russian "Ъ" is (I think) usually skipped
in transliteration, I think you propose "``", but when transliterating from
Bulgarian they usually transliterate this as "ă".

A few remarks:

* I think you transliterate "щ" as "shh", wouldn't "shch" be better?
* You transliterate "ц" as "cz", wouldn't "ts" be better?  By the way,
  in Polish language "cz" is a correct transliteration of "ч".
* You transliterate "й" as "j", this is fine in many languages but wouldn't
  "y" be better in English?
* In case of "е": how will you know if it is correct to transliterate it
  to "e" or "ie" or "je" or "ye"?
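
The last remark can be illustrated with a minimal Python sketch. It assumes
a plain per-character lookup in which the first candidate of each entry
wins; the four entries are copied from the proposed translit_cyrillic table,
while the names TABLE and translit are hypothetical and not part of the
patch or of glibc.

```python
# Subset of the proposed translit_cyrillic entries, copied from the patch.
# Each value lists the candidates in order of preference.
TABLE = {
    "\u0449": ["shh", "s"],  # CYRILLIC SMALL LETTER SHCHA
    "\u0446": ["cz", "c"],   # CYRILLIC SMALL LETTER TSE
    "\u0439": ["j"],         # CYRILLIC SMALL LETTER SHORT I
    "\u0435": ["e"],         # CYRILLIC SMALL LETTER IE
}


def translit(text):
    """Replace each mapped character by its first candidate.

    ASCII passes through unchanged; unmapped non-ASCII becomes '?'.
    This is an illustration only, not how iconv is implemented.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(TABLE.get(ch, ["?"])[0])
    return "".join(out)
```

Because the lookup sees one character at a time, U+0435 always comes out as
"e", whatever its position in the word; choosing between "e", "ie", "je" and
"ye" would need context that such a table does not have.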

These remarks are obviously incomplete, your patch deserves much more
attention to review.

Best regards,

Rafal

* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-05  9:20           ` Rafal Luzynski
@ 2018-10-05 10:36             ` Egor Kobylkin
  2018-10-08 22:04               ` Rafal Luzynski
       [not found]             ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
  1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-05 10:36 UTC (permalink / raw)
  To: Rafal Luzynski, Keld Simonsen, Marko Myllynen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 6091 bytes --]

removed a png image attachment

Keld, Marko, Rafal, other locale maintainers,

all of this is written with a minimal viable fix for this bug in mind,
as soon as possible. I want to avoid wasting the maintainers' time on
fundamental discussions here (however good the reasons for them).

I see three options:
1. Those locale maintainers that are fine with using the ISO
9:1995/GOST_7.79_System_B Cyrillic transliteration table (Ru) include it
in their locales. https://sourceware.org/bugzilla/attachment.cgi?id=11289
2. Those that want a different table can create their own
variant based on the spreadsheet I have prepared
https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
this patch.
3. Those that want to omit a Cyrillic transliteration altogether for now
state so and just carry over bug #2872 from the year 2006.

Does this make sense to you?

Just to be super clear on this: the patch is a stopgap _ASCII_
transliteration table. ASCII being AMERICAN Standard Code for
Information Interchange, that is obviously orthogonal to any
transliteration rule of other countries. As such it is not explicitly
targeting transliteration standards of any country.

The patch reflects the Russian variety of ISO
9:1995/GOST_7.79_System_B because a) ISO 9:1995/GOST_7.79_System_B is
available and can be helpful to a majority of Cyrillic users, and b) I
have access to it, not least because I am proficient in Russian.

It is offered to all the respective locale maintainers as a stopgap
solution. Stopgap in the sense that it is better to have some
transliteration than not to have any at all and carry over the bug from
2006. That it may be a somewhat officially correct transliteration for
ru_RU is a bonus. In that sense I would dub the discussion on the
correctness for other languages "offtopic". Let me know if this is not OK.

You are all correctly pointing out the deficiencies of this approach.
However, I couldn't find a better straightforward approach as of yet.
I am happy to hear from you on how this could be handled.

There is a danger of being caught in the web of language/country
differences. I propose just pruning the locales that are not comfortable
including this current table. We can address possible solutions in the
second wave of patching.

I am wary of getting into discussions on specific country variants just
because of the sheer complexity of this topic. It is probably better
addressed by the respective maintainers of their locales. I do not see a
"one size fits all" solution as possible in this first wave.

I would like to have this "three options plan of action" vetted first
and then we could go into the specific details. (Like, for instance, what
characters should be included in the table, and in which
transliteration form.)

I am looking forward to your reply,
Egor Kobylkin

P.S. Specifically, as to how to address languages other than Ru included in
GOST_7.79_System_B: we can take the first option left to right from that
table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
locales/languages but with errors where Ru supersedes their own variants.
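
A minimal Python sketch of that "first option left to right" rule, to show
where Ru would supersede another language's variant. The two rows are
placeholders for illustration only and would have to be checked against
GOST 7.79-2000 System B; the names ORDER, ROWS, first_option and MERGED are
hypothetical and appear neither in the patch nor in the spreadsheet.

```python
# Column order as given above; a missing key means the language has no
# entry for that letter.
ORDER = ("Ru", "By", "Uk", "Bg", "Mk")

# Placeholder rows for illustration; verify against the standard before use.
ROWS = {
    "\u0449": {"Ru": "shh", "Uk": "shh", "Bg": "sth"},  # SHCHA
    "\u0456": {"By": "i", "Uk": "i"},  # BYELORUSSIAN-UKRAINIAN I, no Ru entry
}


def first_option(row):
    """Return (language, value) for the leftmost column that has an entry."""
    for lang in ORDER:
        value = row.get(lang)
        if value is not None:
            return lang, value
    raise KeyError("no transliteration in any column")


MERGED = {ch: first_option(row) for ch, row in ROWS.items()}
```

With these placeholder rows the merged entry for U+0449 comes from the Ru
column, so a differing Bg value is lost, while U+0456 falls through to By
because Ru has no entry for it.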


On 05.10.2018 11:20, Rafal Luzynski wrote:
> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>> Hi
>>>
>>> Please note that transliteration of Cyrillic to Latin is not universal.
>>> There are different schemes for, for example, German, English and Danish, and
>>> there is also an ISO standard for it.
>>
>> Thanks for your feedback, Keld!
>>
>> Could the locale maintainers that wouldn't like to include this patch
>> explicitly state so here?
> 
> I think it is about me so I must reply.  I am sorry about that and the sole
> reason is my lack of time.  I'm just a volunteer here, that means it's not
> my regular job to work on locale data nor anything in glibc nor in any other
> open source project.  I do these things only in my free time which I don't
> have much.  Of course you will see my contributions here and there but they
> are either trivial or take me months to complete.  Your patches are on my
> radar but I can't tell any ETA for them.  Of course, there are other people
> around here and they are all welcome to come and join.
> 
>> That is:
>> - In the case that there is a different preferred cyrillic
>> transliteration table for any specific locale their maintainers may want
>> to point me to it so I can supply a separate table/patch.
>> - Or they could state explicitly that for some reason they would like to
>> exclude their locale from the patch for a default cyrillic
>> transliteration altogether.
> 
> As Keld wrote, there are probably separate rules for every language so
> I don't think you should treat your rules as universal and include them
> in every locale.  At first sight, it seems to me they work only for English
> (as a destination locale).  Also, although it is called "transliteration
> from Cyrillic" it seems that it covers only Russian alphabet.  What about
> other languages which use Cyrillic alphabet but add their own diacritic
> characters?  Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
> Mari, Ossetian, Yakut, Tatar, and more.  What about languages which use
> Cyrillic alphabet but transliterate their respective letters in a different
> way than Russian?  For example, Russian "Ъ" is (I think) usually skipped
> in transliteration, I think you propose "``", but when transliterating from
> Bulgarian they usually transliterate this as "ă".
> 
> Few remarks:
> 
> * I think you transliterate "щ" as "shh", wouldn't "shch" be better?
> * You transliterate "ц" as "cz", wouldn't "ts" be better?  By the way,
>   in Polish language "cz" is a correct transliteration of "ч".
> * You transliterate "й" as "j", this is fine in many languages but wouldn't
>   "y" be better in English?
> * In case of "е": how will you know if it is correct to transliterate it
>   to "e" or "ie" or "je" or "ye"?
> 
> These remarks are obviously incomplete, your patch deserves much more
> attention to review.
> 
> Best regards,
> 
> Rafal
> 



[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 71068 bytes --]

From: Marko Myllynen <myllynen@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, Keld Simonsen <keld@keldix.com>
Cc: libc-alpha@sourceware.org, libc-locales@sourceware.org, "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>, Carlos O'Donell <carlos@redhat.com>, Max Kutny <mkutny@gmail.com>, danilo@gnome.org
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
Date: Fri, 5 Oct 2018 11:43:46 +0300
Message-ID: <66f29205-d7fe-478c-26f9-f3a1d7eb9f25@redhat.com>

Hi Egor,

Thanks for your patience with this one.

On 2018-10-03 12:32, Egor Kobylkin wrote:
> On 03.10.2018 11:19, Keld Simonsen wrote:
>>
>> Please note that transliteration of Cyrillic to Latin is not universal.
>> There are different schemes for, for example, German, English and Danish, and
>> there is also an ISO standard for it. 
> 
> Thanks for your feedback, Keld!
> 
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?
> 
> That is:
> - In the case that there is a different preferred cyrillic
> transliteration table for any specific locale their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default cyrillic
> transliteration altogether.

The Wikipedia article https://en.wikipedia.org/wiki/ISO_9 helps to
understand that ISO 9:1995 and GOST 7.79-2000 System A are identical so
perhaps you could mention both ISO 9 and the Wikipedia article in the
commit log. translit_cyrillic includes every transliteration defined in
ISO 9:1995 and GOST 7.79-2000, correct?

I think it would be best to leave those locales which already have
Cyrillic transliteration defined as-is (as you've done) unless there
are some issues with them; there's probably a good reason why they
were added in the first place.

For other locales, using ISO 9 instead of not doing transliteration at
all may not be entirely correct but I'd suppose it's better to provide
at least some sort of transliteration (even if not entirely correct)
than sequences of question marks. But as you say, locale maintainers may
know better the case for individual locales.

Wrt language-specific differences Keld mentioned, Finnish Wikipedia
article on transliteration gives an example, see the table on right at
https://fi.wikipedia.org/wiki/Siirtokirjoitus for Russian /
international / Finnish / Swedish / English / French / German / Polish /
phonetic transliteration of a Russian name. (The table also shows that
for correct transliteration ASCII letters are not enough for some
languages.)

Some of the differences and language-specific aspects are probably
impossible to take fully into account within the locale system we have
today. For example, in Finnish (the tables at
http://jkorpela.fi/iso9.html8 and
https://fi.wikipedia.org/wiki/Ven%C3%A4j%C3%A4n_translitterointi might
also be helpful):

1) transliteration of Russian is mostly as per ISO 9 but with national
differences defined in SFS 4900
2) transliteration of Russian and Ukrainian names have some slight
differences according to http://jkorpela.fi/iso9.html8
3) transliteration of a letter depends on its position within a word or
pronunciation of adjacent letters, for example U+0435 becomes U+0065 (e)
except when at the beginning of a word it becomes U+006A U+0065 (je)
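
Point 3) can be sketched in a few lines of Python. This is only an
illustration of why a static per-character table is not enough; it handles
the single small letter U+0435, ignores capitals and the pronunciation of
adjacent letters, and the function name translit_ie is hypothetical.

```python
import re


def translit_ie(text):
    """Transliterate only U+0435: 'je' at the start of a word, 'e' elsewhere.

    All other characters are left untouched.
    """
    # \b matches at a word start; for str patterns Python's re treats
    # Cyrillic letters as word characters.
    text = re.sub(r"\b\u0435", "je", text)
    # Every remaining U+0435 is inside a word and becomes a plain 'e'.
    return text.replace("\u0435", "e")
```

The rule needs to look at the preceding character, which is exactly the
information a one-to-one mapping entry cannot use.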

Hopefully we'll hear comments from others as well. Once your patch is
merged, I'll try to come up with the needed locale-specific changes for
fi_FI, some differences referred to in 1) above are straightforward to
implement but for 2) and 3) some compromises probably need to be made,
unfortunately.

Thanks,

>> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>>> Ping.
>>>
>>> Absent of feedback I am wondering if anything could be missing in this
>>> patch from the maintainers standpoint. More than two months have passed
>>> since the original submission.
>>>
>>> If I can be of assistance, please do not hesitate to contact me,
>>> Egor Kobylkin
>>>
>>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>>> Dear locale maintainers,
>>>>
>>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>>>
>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>>
>>>> add Cyrillic transliteration table translit_cyrillic file
>>>>
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>>
>>>> to localedata/locales/ and include it in all your locales going forward.
>>>>
>>>> Patch included inline below.
>>>>
>>>> This is a re-submission for the consideration for 2.29 on a request from
>>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>>
>>>> From this patch I have excluded locales that already mention cyrillic or
>>>> have a transliteration table for it:
>>>> az_AZ
>>>> iso14651_t1_common
>>>> ky_KG
>>>> mn_MN
>>>> sr_RS
>>>> tg_TJ
>>>> tk_TM
>>>> tt_RU
>>>> uk_UA
>>>> uz_UZ
>>>> uz_UZ@cyrillic
>>>>
>>>> Their maintainers are requested to make an explicit decision on how and
>>>> whether at all to include this patch.
>>>>
>>>>
>>>>
>>>> Current bug effect:
>>>>
>>>> The glibc wiki explicitly lists this use case as the test example
>>>>
>>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>>
>>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
>>>> translit-test-input.txt
>>>>
>>>> currently it fails on Cyrillic texts in most locales including ru_RU [1]
>>>> [8] [9]:
>>>>
>>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
>>>> translit-test-input.txt |grep CYRILLIC
>>>>
>>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>>
>>>>  - It produces a string of question marks and spaces.
>>>>
>>>> This is what it should produce and it does so after the patch applied:
>>>>
>>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
>>>> chayu.
>>>>
>>>>
>>>> Root problem and the fix:
>>>>
>>>> The root problem is the missing transliteration table that I am
>>>> supplying here. Furthermore it has to be referenced/included into the
>>>> active locale at the compilation time to be used by iconv.
>>>>
>>>>
>>>>
>>>> COMMIT MESSAGE:
>>>> This translit_cyrillic table enables conversion (e.g. with iconv) from a
>>>> UTF-8 encoded text based on Cyrillic alphabet to a ASCII//TRANSLIT text.
>>>>
>>>> While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
>>>> a transliteration has only ASCII codes but still can be read by a native
>>>> speaker. Among other things it is useful for processing the Cyrillic
>>>> texts and filenames by programs or on systems that are not specifically
>>>> prepared to work with Cyrillic, don't have corresponding fonts installed
>>>> or can't handle UTF-8.
>>>>
>>>> The transliteration table itself is attached as a file translit_cyrillic
>>>> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
>>>> (Federal Agency on Technical Regulating and Metrology Of Russian
>>>> Federation [2]). Technically an independent but identical source [3] was
>>>> used and prepared in a spreadsheet [6].
>>>>
>>>> The documentation suggests that the transliteration tables inclusion is
>>>> done by adding *include "translit_cyrillic";""* string into LC_CTYPE
>>>> translit_start section
>>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>>> Practically I have searched for all locales that have a
>>>> translit_start/end stance and generated a patch for them.
>>>>
>>>> The Cyrillic transliteration of e.g. Russian text may have already
>>>> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
>>>> have their transliteration tables included inline.
>>>> However it would not be the standard Russian Cyrillic transliteration as
>>>> described above.
>>>> I am excluding these locales from this proposed patch. I have written
>>>> directly to locale maintainer emails listed in the files. Volodymyr
>>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>>> Данило Шеган <danilo@gnome.org>  (sr_YU, sr_CS) have confirmed the
>>>> exclusion.
>>>>
>>>> Links:
>>>>
>>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> [2] GOST 7.79-2000 official source
>>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>>> available in low quality gif format)
>>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>>> http://www.yfermer.ru/specifications/285821.html
>>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>>> [6] Spreadsheet for generating translit_cyrillic
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>>> [9] translit-test-input.txt
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>>
>>>> Best regards,
>>>> Egor Kobylkin
>>>>
>>>> ---
>>>> 2018-07-17  Egor Kobylkin  <egor@kobylkin.com>
>>>>
>>>> 	[BZ #2872]
>>>> 	* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>>> table from Cyrillic to Latin.
>>>> 	* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>>> section.
>>>> 	* locales/aa_DJ: likewise
>>>> 	* locales/af_ZA: likewise
>>>> 	* locales/ak_GH: likewise
>>>> 	* locales/am_ET: likewise
>>>> 	* locales/ar_EG: likewise
>>>> 	* locales/be_BY: likewise
>>>> 	* locales/bem_ZM: likewise
>>>> 	* locales/ber_DZ: likewise
>>>> 	* locales/ber_MA: likewise
>>>> 	* locales/bg_BG: likewise
>>>> 	* locales/bi_VU: likewise
>>>> 	* locales/bn_BD: likewise
>>>> 	* locales/bo_CN: likewise
>>>> 	* locales/ca_ES: likewise
>>>> 	* locales/ce_RU: likewise
>>>> 	* locales/cs_CZ: likewise
>>>> 	* locales/cv_RU: likewise
>>>> 	* locales/cy_GB: likewise
>>>> 	* locales/da_DK: likewise
>>>> 	* locales/de_DE: likewise
>>>> 	* locales/dv_MV: likewise
>>>> 	* locales/dz_BT: likewise
>>>> 	* locales/el_GR: likewise
>>>> 	* locales/en_GB: likewise
>>>> 	* locales/en_NG: likewise
>>>> 	* locales/en_ZM: likewise
>>>> 	* locales/es_CU: likewise
>>>> 	* locales/es_ES: likewise
>>>> 	* locales/et_EE: likewise
>>>> 	* locales/fa_IR: likewise
>>>> 	* locales/ff_SN: likewise
>>>> 	* locales/fi_FI: likewise
>>>> 	* locales/fr_FR: likewise
>>>> 	* locales/ga_IE: likewise
>>>> 	* locales/gd_GB: likewise
>>>> 	* locales/gu_IN: likewise
>>>> 	* locales/gv_GB: likewise
>>>> 	* locales/he_IL: likewise
>>>> 	* locales/hi_IN: likewise
>>>> 	* locales/hif_FJ: likewise
>>>> 	* locales/hr_HR: likewise
>>>> 	* locales/ht_HT: likewise
>>>> 	* locales/hu_HU: likewise
>>>> 	* locales/hy_AM: likewise
>>>> 	* locales/id_ID: likewise
>>>> 	* locales/is_IS: likewise
>>>> 	* locales/it_IT: likewise
>>>> 	* locales/ja_JP: likewise
>>>> 	* locales/kk_KZ: likewise
>>>> 	* locales/km_KH: likewise
>>>> 	* locales/kn_IN: likewise
>>>> 	* locales/ko_KR: likewise
>>>> 	* locales/ks_IN: likewise
>>>> 	* locales/kw_GB: likewise
>>>> 	* locales/lb_LU: likewise
>>>> 	* locales/lg_UG: likewise
>>>> 	* locales/lij_IT: likewise
>>>> 	* locales/ln_CD: likewise
>>>> 	* locales/lo_LA: likewise
>>>> 	* locales/lt_LT: likewise
>>>> 	* locales/lv_LV: likewise
>>>> 	* locales/mg_MG: likewise
>>>> 	* locales/mhr_RU: likewise
>>>> 	* locales/mk_MK: likewise
>>>> 	* locales/ml_IN: likewise
>>>> 	* locales/ms_MY: likewise
>>>> 	* locales/mt_MT: likewise
>>>> 	* locales/nan_TW@latin: likewise
>>>> 	* locales/nb_NO: likewise
>>>> 	* locales/ne_NP: likewise
>>>> 	* locales/nhn_MX: likewise
>>>> 	* locales/niu_NU: likewise
>>>> 	* locales/niu_NZ: likewise
>>>> 	* locales/nl_NL: likewise
>>>> 	* locales/nr_ZA: likewise
>>>> 	* locales/oc_FR: likewise
>>>> 	* locales/om_KE: likewise
>>>> 	* locales/or_IN: likewise
>>>> 	* locales/os_RU: likewise
>>>> 	* locales/pa_IN: likewise
>>>> 	* locales/pa_PK: likewise
>>>> 	* locales/pl_PL: likewise
>>>> 	* locales/pt_PT: likewise
>>>> 	* locales/quz_PE: likewise
>>>> 	* locales/ro_RO: likewise
>>>> 	* locales/ru_RU: likewise
>>>> 	* locales/rw_RW: likewise
>>>> 	* locales/sa_IN: likewise
>>>> 	* locales/sd_IN: likewise
>>>> 	* locales/sd_IN@devanagari: likewise
>>>> 	* locales/sd_PK: likewise
>>>> 	* locales/se_NO: likewise
>>>> 	* locales/sgs_LT: likewise
>>>> 	* locales/si_LK: likewise
>>>> 	* locales/sk_SK: likewise
>>>> 	* locales/sl_SI: likewise
>>>> 	* locales/sm_WS: likewise
>>>> 	* locales/so_SO: likewise
>>>> 	* locales/sq_AL: likewise
>>>> 	* locales/ss_ZA: likewise
>>>> 	* locales/st_ZA: likewise
>>>> 	* locales/sv_SE: likewise
>>>> 	* locales/sw_KE: likewise
>>>> 	* locales/ta_IN: likewise
>>>> 	* locales/te_IN: likewise
>>>> 	* locales/th_TH: likewise
>>>> 	* locales/ti_ET: likewise
>>>> 	* locales/tn_ZA: likewise
>>>> 	* locales/to_TO: likewise
>>>> 	* locales/tpi_PG: likewise
>>>> 	* locales/tr_TR: likewise
>>>> 	* locales/ts_ZA: likewise
>>>> 	* locales/unm_US: likewise
>>>> 	* locales/ur_IN: likewise
>>>> 	* locales/ur_PK: likewise
>>>> 	* locales/ve_ZA: likewise
>>>> 	* locales/vi_VN: likewise
>>>> 	* locales/wa_BE: likewise
>>>> 	* locales/wo_SN: likewise
>>>> 	* locales/xh_ZA: likewise
>>>> 	* locales/yi_US: likewise
>>>> 	* locales/zh_CN: likewise
>>>> 	* locales/zu_ZA: likewise
>>>>
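The per-locale change listed above is the same one-line edit in every file: the include goes immediately before translit_end in the LC_CTYPE translit section. A minimal sketch of that edit follows, for illustration only. The helper name add_cyrillic_include is hypothetical and is not part of the patch or of glibc, and the sketch assumes the locale source is already loaded as a string with translit_end on a line of its own.

```python
# Illustrative sketch only: mirrors the one-line edit made in each locale file.
# add_cyrillic_include is a hypothetical helper, not part of the patch or glibc.

INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text: str) -> str:
    """Insert the include line before the first translit_end line."""
    if INCLUDE_LINE in locale_text:
        # Already present: leave the text unchanged.
        return locale_text
    lines = locale_text.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(i, INCLUDE_LINE)
            return "\n".join(lines)
    # No translit section found: leave the text untouched.
    return locale_text


sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(sample))
```

Applying the helper twice gives the same result as applying it once, and a file without a translit section is returned unchanged, which is why such locales would still need a manual decision.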
>>>>
>>>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>>>> --- a/localedata/locales/C	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/C	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -2292,6 +2292,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
>>>> --- a/localedata/locales/aa_DJ	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/aa_DJ	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
>>>> --- a/localedata/locales/af_ZA	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/af_ZA	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
>>>> --- a/localedata/locales/ak_GH	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ak_GH	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -56,6 +56,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>>> --- a/localedata/locales/am_ET	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/am_ET	2018-07-17 17:55:47.000000000 +0000
>>>> @@ -1396,6 +1396,7 @@
>>>>  <U137A>    <U0060><U0039><U0030>
>>>>  <U137B>    <U0060><U0031><U0030><U0030>
>>>>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  %
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
>>>> --- a/localedata/locales/ar_EG	2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ar_EG	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
>>>> --- a/localedata/locales/be_BY	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/be_BY	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
>>>> --- a/localedata/locales/bem_ZM	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bem_ZM	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
>>>> --- a/localedata/locales/ber_DZ	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_DZ	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -166,6 +166,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
>>>> --- a/localedata/locales/ber_MA	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_MA	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -86,6 +86,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
>>>> --- a/localedata/locales/bg_BG	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bg_BG	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
>>>> --- a/localedata/locales/bi_VU	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bi_VU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
>>>> --- a/localedata/locales/bn_BD	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bn_BD	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
>>>> --- a/localedata/locales/bo_CN	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bo_CN	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
>>>> --- a/localedata/locales/ca_ES	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ca_ES	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
>>>> --- a/localedata/locales/ce_RU	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ce_RU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
>>>> --- a/localedata/locales/cs_CZ	2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/cs_CZ	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -2311,6 +2311,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
>>>> --- a/localedata/locales/cv_RU	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cv_RU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
>>>> --- a/localedata/locales/cy_GB	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cy_GB	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
>>>> --- a/localedata/locales/da_DK	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/da_DK	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -167,6 +167,7 @@
>>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
>>>> --- a/localedata/locales/de_DE	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/de_DE	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>>  % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
>>>>  <U201F> <U00AB>;<U0022>
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
>>>> --- a/localedata/locales/dv_MV	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dv_MV	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -52,6 +52,7 @@
>>>>  include "translit_combining";""
>>>>
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
>>>> --- a/localedata/locales/dz_BT	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dz_BT	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
>>>> --- a/localedata/locales/el_GR	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/el_GR	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
>>>> --- a/localedata/locales/en_GB	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_GB	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
>>>> --- a/localedata/locales/en_NG	2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_NG	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -50,6 +50,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
>>>> --- a/localedata/locales/en_ZM	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/en_ZM	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
>>>> --- a/localedata/locales/es_CU	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_CU	2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
>>>> --- a/localedata/locales/es_ES	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_ES	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
>>>> --- a/localedata/locales/et_EE	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/et_EE	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
>>>> --- a/localedata/locales/fa_IR	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fa_IR	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -79,6 +79,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
>>>> --- a/localedata/locales/ff_SN	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/ff_SN	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
>>>> --- a/localedata/locales/fi_FI	2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fi_FI	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -137,6 +137,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
>>>> --- a/localedata/locales/fr_FR	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/fr_FR	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>  % In France, accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
>>>> --- a/localedata/locales/ga_IE	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ga_IE	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -54,6 +54,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
>>>> --- a/localedata/locales/gd_GB	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gd_GB	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
>>>> --- a/localedata/locales/gu_IN	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gu_IN	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
>>>> --- a/localedata/locales/gv_GB	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gv_GB	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
>>>> --- a/localedata/locales/he_IL	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/he_IL	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
>>>> --- a/localedata/locales/hi_IN	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hi_IN	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -61,6 +61,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
>>>> --- a/localedata/locales/hif_FJ	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hif_FJ	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
>>>> --- a/localedata/locales/hr_HR	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hr_HR	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -153,6 +153,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
>>>> --- a/localedata/locales/ht_HT	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ht_HT	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
>>>> --- a/localedata/locales/hu_HU	2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hu_HU	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -478,6 +478,7 @@
>>>>  <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
>>>>  <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
>>>> --- a/localedata/locales/hy_AM	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/hy_AM	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
>>>> --- a/localedata/locales/id_ID	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/id_ID	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
>>>> --- a/localedata/locales/is_IS	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/is_IS	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -2161,6 +2161,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
>>>> --- a/localedata/locales/it_IT	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/it_IT	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
>>>> --- a/localedata/locales/ja_JP	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ja_JP	2018-07-17 17:55:49.000000000 +0000
>>>> @@ -1682,6 +1682,7 @@
>>>>  include "translit_combining";""
>>>>  include "translit_cjk_variants";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
>>>> --- a/localedata/locales/kk_KZ	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kk_KZ	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -158,6 +158,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
>>>> --- a/localedata/locales/km_KH	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/km_KH	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -873,6 +873,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
>>>> --- a/localedata/locales/kn_IN	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kn_IN	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
>>>> --- a/localedata/locales/ko_KR	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ko_KR	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -6099,6 +6099,7 @@
>>>>  include "translit_combining";""
>>>>  include "translit_hangul";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
>>>> --- a/localedata/locales/ks_IN	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ks_IN	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
>>>> --- a/localedata/locales/kw_GB	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kw_GB	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
>>>> --- a/localedata/locales/lb_LU	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lb_LU	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>>  % LATIN SMALL LETTER E WITH CIRCUMFLEX
>>>>  <U00EA> "<U0065><U005E>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
>>>> --- a/localedata/locales/lg_UG	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lg_UG	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
>>>> --- a/localedata/locales/lij_IT	2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lij_IT	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
>>>> --- a/localedata/locales/ln_CD	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ln_CD	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
>>>> --- a/localedata/locales/lo_LA	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lo_LA	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -51,6 +51,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
>>>> --- a/localedata/locales/lt_LT	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lt_LT	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
>>>> --- a/localedata/locales/lv_LV	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lv_LV	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -2122,6 +2122,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
>>>> --- a/localedata/locales/mg_MG	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mg_MG	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>  % Accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
>>>> --- a/localedata/locales/mhr_RU	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mhr_RU	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
>>>> --- a/localedata/locales/mk_MK	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mk_MK	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
>>>> --- a/localedata/locales/ml_IN	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ml_IN	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include     "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>  %
>>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
>>>> --- a/localedata/locales/ms_MY	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ms_MY	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
>>>> --- a/localedata/locales/mt_MT	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mt_MT	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
>>>> --- a/localedata/locales/nan_TW@latin	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nan_TW@latin	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -53,6 +53,7 @@
>>>>  % accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
>>>> --- a/localedata/locales/nb_NO	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nb_NO	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -154,6 +154,7 @@
>>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
>>>> --- a/localedata/locales/ne_NP	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ne_NP	2018-07-17 17:55:50.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
>>>> --- a/localedata/locales/nhn_MX	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nhn_MX	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
>>>> --- a/localedata/locales/niu_NU	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NU	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
>>>> --- a/localedata/locales/niu_NZ	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NZ	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
>>>> --- a/localedata/locales/nl_NL	2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nl_NL	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
>>>> --- a/localedata/locales/nr_ZA	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/nr_ZA	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
>>>> --- a/localedata/locales/oc_FR	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/oc_FR	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
>>>> --- a/localedata/locales/om_KE	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/om_KE	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -140,6 +140,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
>>>> --- a/localedata/locales/or_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/or_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
>>>> --- a/localedata/locales/os_RU	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/os_RU	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
>>>> --- a/localedata/locales/pa_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>>  translit_start
>>>>  include     "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
>>>> --- a/localedata/locales/pa_PK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_PK	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>  % Farsi yeh -> yeh
>>>>  <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
>>>> --- a/localedata/locales/pl_PL	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pl_PL	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -142,6 +142,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
>>>> --- a/localedata/locales/pt_PT	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pt_PT	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
>>>> --- a/localedata/locales/quz_PE	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/quz_PE	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
>>>> --- a/localedata/locales/ro_RO	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ro_RO	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -144,6 +144,7 @@
>>>>  <U0162> "<U021A>";"<U0054>"
>>>>  <U0163> "<U021B>";"<U0074>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
>>>> --- a/localedata/locales/ru_RU	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ru_RU	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -74,6 +74,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
>>>> --- a/localedata/locales/rw_RW	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/rw_RW	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
>>>> --- a/localedata/locales/sa_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sa_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
>>>> --- a/localedata/locales/sd_IN	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
>>>> --- a/localedata/locales/sd_IN@devanagari	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN@devanagari	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
>>>> --- a/localedata/locales/sd_PK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_PK	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
>>>> --- a/localedata/locales/se_NO	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/se_NO	2018-07-17 17:55:51.000000000 +0000
>>>> @@ -205,6 +205,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
>>>> --- a/localedata/locales/sgs_LT	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sgs_LT	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>  copy "i18n"
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
>>>> --- a/localedata/locales/si_LK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/si_LK	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
>>>> --- a/localedata/locales/sk_SK	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sk_SK	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
>>>> --- a/localedata/locales/sl_SI	2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sl_SI	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -91,6 +91,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
>>>> --- a/localedata/locales/sm_WS	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sm_WS	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
>>>> --- a/localedata/locales/so_SO	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/so_SO	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
>>>> --- a/localedata/locales/sq_AL	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sq_AL	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
>>>> --- a/localedata/locales/ss_ZA	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ss_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
>>>> --- a/localedata/locales/st_ZA	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/st_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
>>>> --- a/localedata/locales/sv_SE	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sv_SE	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -139,6 +139,7 @@
>>>>  % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>>  <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
>>>> --- a/localedata/locales/sw_KE	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sw_KE	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
>>>> --- a/localedata/locales/ta_IN	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ta_IN	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
>>>> --- a/localedata/locales/te_IN	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/te_IN	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
>>>> --- a/localedata/locales/th_TH	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/th_TH	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
>>>> --- a/localedata/locales/ti_ET	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ti_ET	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -866,6 +866,7 @@
>>>>  <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
>>>>
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  %
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
>>>> --- a/localedata/locales/tn_ZA	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tn_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
>>>> --- a/localedata/locales/to_TO	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/to_TO	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -36,6 +36,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
>>>> --- a/localedata/locales/tpi_PG	2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tpi_PG	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
>>>> --- a/localedata/locales/tr_TR	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/tr_TR	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -2430,6 +2430,7 @@
>>>>
>>>>  % TURKISH LIRA SIGN
>>>>  <U20BA> "<U0054><U004C>"
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
>>>> --- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
>>>> +++ b/localedata/locales/translit_cyrillic	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -0,0 +1,151 @@
>>>> +escape_char /
>>>> +comment_char %
>>>> +
>>>> +% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
>>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> +% Generated from UnicodeData.txt with
>>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> +% Up to three characters are required to do a reversible transliteration.
>>>> +
>>>> +LC_CTYPE
>>>> +
>>>> +translit_start
>>>> +
>>>> +
>>>> +% CYRILLIC CAPITAL LETTER IO
>>>> +<U0401> "<U0059><U004F>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER A
>>>> +<U0410> <U0041>
>>>> +% CYRILLIC CAPITAL LETTER BE
>>>> +<U0411> <U0042>
>>>> +% CYRILLIC CAPITAL LETTER VE
>>>> +<U0412> <U0056>
>>>> +% CYRILLIC CAPITAL LETTER GHE
>>>> +<U0413> <U0047>
>>>> +% CYRILLIC CAPITAL LETTER DE
>>>> +<U0414> <U0044>
>>>> +% CYRILLIC CAPITAL LETTER IE
>>>> +<U0415> <U0045>
>>>> +% CYRILLIC CAPITAL LETTER ZHE
>>>> +<U0416> "<U005A><U0048>";<U005A>
>>>> +% CYRILLIC CAPITAL LETTER ZE
>>>> +<U0417> <U005A>
>>>> +% CYRILLIC CAPITAL LETTER I
>>>> +<U0418> <U0049>
>>>> +% CYRILLIC CAPITAL LETTER SHORT I
>>>> +<U0419> <U004A>
>>>> +% CYRILLIC CAPITAL LETTER KA
>>>> +<U041A> <U004B>
>>>> +% CYRILLIC CAPITAL LETTER EL
>>>> +<U041B> <U004C>
>>>> +% CYRILLIC CAPITAL LETTER EM
>>>> +<U041C> <U004D>
>>>> +% CYRILLIC CAPITAL LETTER EN
>>>> +<U041D> <U004E>
>>>> +% CYRILLIC CAPITAL LETTER O
>>>> +<U041E> <U004F>
>>>> +% CYRILLIC CAPITAL LETTER PE
>>>> +<U041F> <U0050>
>>>> +% CYRILLIC CAPITAL LETTER ER
>>>> +<U0420> <U0052>
>>>> +% CYRILLIC CAPITAL LETTER ES
>>>> +<U0421> <U0053>
>>>> +% CYRILLIC CAPITAL LETTER TE
>>>> +<U0422> <U0054>
>>>> +% CYRILLIC CAPITAL LETTER U
>>>> +<U0423> <U0055>
>>>> +% CYRILLIC CAPITAL LETTER EF
>>>> +<U0424> <U0046>
>>>> +% CYRILLIC CAPITAL LETTER HA
>>>> +<U0425> <U0058>
>>>> +% CYRILLIC CAPITAL LETTER TSE
>>>> +<U0426> "<U0043><U005A>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER CHE
>>>> +<U0427> "<U0043><U0048>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER SHA
>>>> +<U0428> "<U0053><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER SHCHA
>>>> +<U0429> "<U0053><U0048><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>>>> +<U042A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC CAPITAL LETTER YERU
>>>> +<U042B> "<U0059><U0027>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN
>>>> +<U042C> <U0060>
>>>> +% CYRILLIC CAPITAL LETTER E
>>>> +<U042D> "<U0045><U0060>";<U0045>
>>>> +% CYRILLIC CAPITAL LETTER YU
>>>> +<U042E> "<U0059><U0055>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER YA
>>>> +<U042F> "<U0059><U0041>";<U0059>
>>>> +% CYRILLIC SMALL LETTER A
>>>> +<U0430> <U0061>
>>>> +% CYRILLIC SMALL LETTER BE
>>>> +<U0431> <U0062>
>>>> +% CYRILLIC SMALL LETTER VE
>>>> +<U0432> <U0076>
>>>> +% CYRILLIC SMALL LETTER GHE
>>>> +<U0433> <U0067>
>>>> +% CYRILLIC SMALL LETTER DE
>>>> +<U0434> <U0064>
>>>> +% CYRILLIC SMALL LETTER IE
>>>> +<U0435> <U0065>
>>>> +% CYRILLIC SMALL LETTER ZHE
>>>> +<U0436> "<U007A><U0068>";<U007A>
>>>> +% CYRILLIC SMALL LETTER ZE
>>>> +<U0437> <U007A>
>>>> +% CYRILLIC SMALL LETTER I
>>>> +<U0438> <U0069>
>>>> +% CYRILLIC SMALL LETTER SHORT I
>>>> +<U0439> <U006A>
>>>> +% CYRILLIC SMALL LETTER KA
>>>> +<U043A> <U006B>
>>>> +% CYRILLIC SMALL LETTER EL
>>>> +<U043B> <U006C>
>>>> +% CYRILLIC SMALL LETTER EM
>>>> +<U043C> <U006D>
>>>> +% CYRILLIC SMALL LETTER EN
>>>> +<U043D> <U006E>
>>>> +% CYRILLIC SMALL LETTER O
>>>> +<U043E> <U006F>
>>>> +% CYRILLIC SMALL LETTER PE
>>>> +<U043F> <U0070>
>>>> +% CYRILLIC SMALL LETTER ER
>>>> +<U0440> <U0072>
>>>> +% CYRILLIC SMALL LETTER ES
>>>> +<U0441> <U0073>
>>>> +% CYRILLIC SMALL LETTER TE
>>>> +<U0442> <U0074>
>>>> +% CYRILLIC SMALL LETTER U
>>>> +<U0443> <U0075>
>>>> +% CYRILLIC SMALL LETTER EF
>>>> +<U0444> <U0066>
>>>> +% CYRILLIC SMALL LETTER HA
>>>> +<U0445> <U0078>
>>>> +% CYRILLIC SMALL LETTER TSE
>>>> +<U0446> "<U0063><U007A>";<U0063>
>>>> +% CYRILLIC SMALL LETTER CHE
>>>> +<U0447> "<U0063><U0068>";<U0063>
>>>> +% CYRILLIC SMALL LETTER SHA
>>>> +<U0448> "<U0073><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER SHCHA
>>>> +<U0449> "<U0073><U0068><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER HARD SIGN
>>>> +<U044A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC SMALL LETTER YERU
>>>> +<U044B> "<U0079><U0027>";<U0079>
>>>> +% CYRILLIC SMALL LETTER SOFT SIGN
>>>> +<U044C> <U0060>
>>>> +% CYRILLIC SMALL LETTER E
>>>> +<U044D> "<U0065><U0060>";<U0065>
>>>> +% CYRILLIC SMALL LETTER YU
>>>> +<U044E> "<U0079><U0075>";<U0079>
>>>> +% CYRILLIC SMALL LETTER YA
>>>> +<U044F> "<U0079><U0061>";<U0079>
>>>> +% CYRILLIC SMALL LETTER IO
>>>> +<U0451> "<U0079><U006F>";<U0079>
>>>> +
>>>> +
>>>> +translit_end
>>>> +
>>>> +END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
>>>> --- a/localedata/locales/ts_ZA	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ts_ZA	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
>>>> --- a/localedata/locales/unm_US	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/unm_US	2018-07-17 17:55:52.000000000 +0000
>>>> @@ -48,6 +48,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
>>>> --- a/localedata/locales/ur_IN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_IN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
>>>> --- a/localedata/locales/ur_PK	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_PK	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>  % Farsi yeh -> yeh
>>>>  <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
>>>> --- a/localedata/locales/ve_ZA	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ve_ZA	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -67,6 +67,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
>>>> --- a/localedata/locales/vi_VN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/vi_VN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>  % dong sign -> d// -> dd
>>>>  <U20AB> "<U0111>";"<U0064><U0064>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
>>>> --- a/localedata/locales/wa_BE	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wa_BE	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>  <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>>>>  <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
>>>> --- a/localedata/locales/wo_SN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wo_SN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>  % Accents are simply omitted if they cannot be represented.
>>>>  include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
>>>> --- a/localedata/locales/xh_ZA	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/xh_ZA	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>>  translit_start
>>>>  include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
>>>> --- a/localedata/locales/yi_US	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/yi_US	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>>  <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>>>>  <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>>>>  <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  END LC_CTYPE
>>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
>>>> --- a/localedata/locales/zh_CN	2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/zh_CN	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>
>>>>  class	"hanzi"; /
>>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
>>>> --- a/localedata/locales/zu_ZA	2018-07-17 17:49:22.000000000 +0000
>>>> +++ b/localedata/locales/zu_ZA	2018-07-17 17:55:53.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>>  translit_start
>>>>  include  "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  END LC_CTYPE
>>>>
>>>>
>>>>
> 


-- 
Marko Myllynen



* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
       [not found]             ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
@ 2018-10-05 11:54               ` Marko Myllynen
  2018-10-05 12:00                 ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-05 11:54 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi,

Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
possible and if not, then fall back to GOST 7.79 System B?

Implementation-wise, the current translit_* files have a few examples where a
non-ASCII transliteration is tried first before an ASCII fallback. These
examples are from translit_neutral:

% NARROW NO-BREAK SPACE
<U202F> <U00A0>;<U0020>
% REVERSED TRIPLE PRIME
<U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
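
As a rough illustration of the fallback order in the entries above, here is a
minimal Python sketch. It assumes that the alternatives separated by ";" are
tried left to right and that the first one representable in the target charset
wins. The function name pick_alternative and the Cyrillic ZHE entry are
hypothetical; they are not taken from glibc or from the patch.

```python
def pick_alternative(candidates, charset):
    """Return the first candidate that `charset` can encode, else None."""
    for candidate in candidates:
        try:
            candidate.encode(charset)
        except UnicodeEncodeError:
            continue  # not representable in the target, try the next one
        return candidate
    return None


# The two translit_neutral entries quoted above, as ordered alternatives.
NARROW_NO_BREAK_SPACE = ["\u00a0", " "]
REVERSED_TRIPLE_PRIME = ["\u2035\u2035\u2035", "```"]

# Hypothetical entry for CYRILLIC SMALL LETTER ZHE (assumption, not in the
# patch): the System A form with a diacritic first, the System B form second.
ZHE = ["\u017e", "zh"]

if __name__ == "__main__":
    for charset in ("iso8859-2", "ascii"):
        print(charset, repr(pick_alternative(ZHE, charset)))
```

With a table built this way, a Latin-2 target would get the single letter with
a diacritic, while an ASCII target would still get the digraph.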

Thanks,

On 2018-10-05 13:29, Egor Kobylkin wrote:
> Keld,Marko,Rafal, other locale maintainers,
> 
> All of this is written with a minimal viable fix for this bug in mind,
> as soon as possible. I want to avoid wasting maintainers' time getting into
> fundamental discussions here (although for perfectly good reasons).
> 
> I see three options:
> 1. those locale maintainers that are fine with using ISO
> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
> in their locales (see attached screenshot of the table).
> 2. those that want to have a differing table can create their own
> variety based on the spreadsheet I have prepared
> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
> this patch.
> 3. those that want to omit a cyrillic transliteration altogether for now
> state so and just carry over the bug #2872 from the year 2006.
> 
> Does this make sense to you?
> 
> Just to be super clear on this: the patch is a stopgap _ASCII_
> transliteration table. ASCII being AMERICAN Standard Code for
> Information Interchange, that is obviously orthogonal to any
> transliteration rule of other countries. As such it is not explicitly
> targeting transliteration standards of any country.
> 
> The fact that the patch is reflecting Russian variety of ISO
> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
> available and can be helpful to a majority of cyrillic users b) I have
> access to it including via being proficient in Russian.
> 
> It is offered to all the respective locale maintainers as a stopgap
> solution. Stopgap in the sense that it is better to have some
> transliteration than not to have any at all and carry over the bug from
> 2006. That it may be a somewhat officially correct transliteration for
> ru_RU is a bonus. In that sense I would dub the discussion on the
> correctness for other languages "offtopic". Let me know if this is not OK.
> 
> You are all correctly mentioning the deficiencies of this approach.
> However, I couldn't find a better straightforward approach as of yet.
> Happy to hear from you on how this could be handled.
> 
> There is a danger of being caught in the web of language/country
> differences. I propose just pruning the locales that are not comfortable
> including this current table. We can address possible solutions in the
> second wave of patching.
> 
> I am wary of getting into discussions on specific country variants just
> because of the sheer complexity of this topic. It is probably better
> addressed by respective maintainers of their locales. I do not see a
> "one size fits all" solution as possible in this first wave.
> 
> I would like to have this "three options plan of action" vetted first
> and then we could go into the specific details. (Like, for instance, what
> characters should be included in the table, and in which
> transliteration form.)
> 
> I am looking forward to your reply,
> Egor Kobylkin
> 
> P.S. specifically as to how to address languages other than Ru included in
> GOST_7.79_System_B: we can take the first option left to right from that
> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
> locales/languages but with errors where Ru supersedes their own variants.
> 
> 
> On 05.10.2018 11:20, Rafal Luzynski wrote:
>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>> Hi
>>>>
>>>> Please note that transliteration of Cyrillic to Latin is not universal.
>>>> There are different schemes for, for example, German, English and Danish,
>>>> and there is also an ISO standard for it.
>>>
>>> Thanks for your feedback, Keld!
>>>
>>> Could the locale maintainers that wouldn't like to include this patch
>>> explicitly state so here?
>>
>> I think it is about me so I must reply.  I am sorry about that and the sole
>> reason is my lack of time.  I'm just a volunteer here, that means it's not
>> my regular job to work on locale data nor anything in glibc nor in any other
>> open source project.  I do these things only in my free time which I don't
>> have much.  Of course you will see my contributions here and there but they
>> are either trivial or take me months to complete.  Your patches are on my
>> radar but I can't tell any ETA for them.  Of course, there are other people
>> around here and they are all welcome to come and join.
>>
>>> That is:
>>> - In the case that there is a different preferred cyrillic
>>> transliteration table for any specific locale their maintainers may want
>>> to point me to it so I can supply a separate table/patch.
>>> - Or they could state explicitly that for some reason they would like to
>>> exclude their locale from the patch for a default cyrillic
>>> transliteration altogether.
>>
>> As Keld wrote, there are probably separate rules for every language so
>> I don't think you should treat your rules as universal and include them
>> in every locale.  At first sight, it seems to me they work only for English
>> (as a destination locale).  Also, although it is called "transliteration
>> from Cyrillic" it seems that it covers only Russian alphabet.  What about
>> other languages which use Cyrillic alphabet but add their own diacritic
>> characters?  Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
>> Mari, Ossetian, Yakut, Tatar, and more.  What about languages which use
>> Cyrillic alphabet but transliterate their respective letters in a different
>> way than Russian?  For example, Russian "Ъ" is (I think) usually skipped
>> in transliteration, I think you propose "``", but when transliterating from
>> Bulgarian they usually transliterate this as "ă".
>>
>> A few remarks:
>>
>> * I think you transliterate "щ" as "shh", wouldn't "shch" be better?
>> * You transliterate "ц" as "cz", wouldn't "ts" be better?  By the way,
>>   in Polish language "cz" is a correct transliteration of "ч".
>> * You transliterate "й" as "j", this is fine in many languages but wouldn't
>>   "y" be better in English?
>> * In case of "е": how will you know if it is correct to transliterate it
>>   to "e" or "ie" or "je" or "ye"?
>>
>> These remarks are obviously incomplete, your patch deserves much more
>> attention to review.
>>
>> Best regards,
>>
>> Rafal
>>
> 


-- 
Marko Myllynen


* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-05 11:54               ` Marko Myllynen
@ 2018-10-05 12:00                 ` Egor Kobylkin
  2018-10-05 12:21                   ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-05 12:00 UTC (permalink / raw)
  To: Marko Myllynen, Rafal Luzynski, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi Marko,

I chose System B because it is ASCII compatible. System A is
not ASCII compatible (its target contains diacritics).

https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
"GOST 7.79 contains two transliteration tables.

System A
    one Cyrillic character to one Latin character, some with diacritics
– identical to ISO 9:1995

System B
    one Cyrillic character to one or many Latin characters without
diacritics
"
Hope this helps,
Egor

On 05.10.2018 13:54, Marko Myllynen wrote:
> Hi,
> 
> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
> possible and if not, then fall back to GOST 7.79 System B?
> 
> Implementation-wise current translit_* files have few examples where a
> non-ASCII transliteration is tried first before an ASCII fallback. These
> examples are from translit_neutral:
> 
> % NARROW NO-BREAK SPACE
> <U202F> <U00A0>;<U0020>
> % REVERSED TRIPLE PRIME
> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
> 
> Thanks,
> 
> On 2018-10-05 13:29, Egor Kobylkin wrote:
>> Keld,Marko,Rafal, other locale maintainers,
>>
>> this all is written with having in mind a minimal viable fix for this
>> bug asap. I want to avoid wasting maintainers time getting into
>> fundamental discussions here (although for perfectly good reasons).
>>
>> I see three options:
>> 1. those locale maintainers that are fine with using ISO
>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
>> in their locales (see attached screenshot of the table).
>> 2. those that that want to have a differing table can create their own
>> variety based on the spreadsheet I have prepared
>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
>> this patch.
>> 3. those that want to omit a cyrillic transliteration altogether for now
>> state so and just carry over the bug #2872 from the year 2006.
>>
>> Does this make sense to you?
>>
>> Just to be super clear on this: the patch is a stopgap _ASCII_
>> transliteration table. ASCII being AMERICAN Standard Code for
>> Information Interchange, that is obviously orthogonal to any
>> transliteration rule of other countries. As such it is not explicitly
>> targeting transliteration standards of any country.
>>
>> The fact that the patch is reflecting Russian variety of ISO
>> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
>> available and can be helpful to a majority of cyrillic users b) I have
>> access to it including via being proficient in Russian.
>>
>> It is offered to all the respective locale maintainers as a stopgap
>> solution. Stopgap in the sense that it is better to have some
>> transliteration than not to have any at all and carry over the bug from
>> 2006. That it may be a somewhat officially correct transliteration for
>> ru_RU is a bonus. In that sense I would dub the discussion on the
>> correctness for other languages "offtopic". Let me know if this is not OK.
>>
>> You are all are correctly mentioning the deficiencies of this approach.
>> However, I couldn't find a better straightforward approach as of yet.
>> Happy to hear from you as on how this could be handled.
>>
>> There is a danger of being caught in the web of language/country
>> differences. I propose just pruning the locales that are not comfortable
>> including this current table. We can address possible solutions in the
>> second wave of patching.
>>
>> I am vary of getting into discussions on specific country variants just
>> because of the sheer complexity of this topic. It is probably better
>> addressed by respective maintainers of their locales. I do not see a
>> "one fits all" solution in this first wave possible.
>>
>> I would like to have this "three options plan of action" vetted first
>> and then we could go to the specific detail. (Like, for instance, what
>> characters should be included in to the table, and in which
>> transliteration form.)
>>
>> I am looking forward to your reply,
>> Egor Kobylkin
>>
>> P.S. specifically as to how address languages other than Ru included in
>> GOST_7.79_System_B: we can take the first option left to right from that
>> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
>> locales/languages but with errors where Ru supersedes their own variants.
>>
>>
>> On 05.10.2018 11:20, Rafal Luzynski wrote:
>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>>> Hi
>>>>>
>>>>> Please note that translitteration of Cyrillic to latin is not universal.
>>>>> There are different schemes for for example German, English and Danish, and
>>>>> there is also an ISO standard for it.
>>>>
>>>> Thanks for your feedback, Keld!
>>>>
>>>> Could the locale maintainers that wouldn't like to include this patch
>>>> explicitly state so here?
>>>
>>> I think it is about me so I must reply.  I am sorry about that and the sole
>>> reason is my lack of time.  I'm just a volunteer here, that means it's not
>>> my regular job to work on locale data nor anything in glibc nor in any other
>>> open source project.  I do these things only in my free time which I don't
>>> have much.  Of course you will see my contributions here and there but they
>>> are either trivial or take me months to complete.  Your patches are on my
>>> radar but I can't tell any ETA for them.  Of course, there are other people
>>> around here and they are all welcome to come and join.
>>>
>>>> That is:
>>>> - In the case that there is a different preferred cyrillic
>>>> transliteration table for any specific locale their maintainers may want
>>>> to point me to it so I can supply a separate table/patch.
>>>> - Or they could state explicitly that for some reason they would like to
>>>> exclude their locale from the patch for a default cyrillic
>>>> transliteration altogether.
>>>
>>> As Keld wrote, there are probably separate rules for every language so
>>> I don't think you should treat your rules as universal and include them
>>> in every locale.  At first sight, it seems to me they work only for English
>>> (as a destination locale).  Also, although it is called "transliteration
>>> from Cyrillic" it seems that it covers only Russian alphabet.  What about
>>> other languages which use Cyrillic alphabet but add their own diacritic
>>> characters?  Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
>>> Mari, Ossetian, Yakut, Tatar, and more.  What about languages which use
>>> Cyrillic alphabet but transliterate their respective letters in a different
>>> way than Russian?  For example, Russian "Ъ" is (I think) usually skipped
>>> in transliteration, I think you propose "``", but when transliterating from
>>> Bulgarian they usually transliterate this as "ă".
>>>
>>> Few remarks:
>>>
>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be better?
>>> * You transliterate "ц" as "cz", wouldn't "ts" be better?  By the way,
>>>   in Polish language "cz" is a correct transliteration of "ч".
>>> * You transliterate "й" as "j", this is fine in many languages but wouldn't
>>>   "y" be better in English?
>>> * In case of "е": how will you know if it is correct to transliterate it
>>>   to "e" or "ie" or "je" or "ye"?
>>>
>>> These remarks are obviously incomplete, your patch deserves much more
>>> attention to review.
>>>
>>> Best regards,
>>>
>>> Rafal
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-05 12:00                 ` Egor Kobylkin
@ 2018-10-05 12:21                   ` Marko Myllynen
  2018-10-05 20:47                     ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-05 12:21 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi,

The scheme I proposed would also be ASCII compatible; consider this example:

% CYRILLIC CAPITAL LETTER SHA
<U0428> "<U0160>";"<U0053><U0068>"

"printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv -f
ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
\\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as per
System B.
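
The fallback order in that rule can be mimicked outside iconv. What follows
is only a minimal Python sketch, assuming that the alternatives are tried
left to right and that the first one the target charset can encode wins.
parse_rule and pick are hypothetical helper names, not glibc or iconv
functions, and Python's codecs merely stand in for the glibc charmaps:

```python
import re


def parse_rule(line):
    """Split a translit rule into (source, [candidates]) as Python strings.

    Assumes every symbol is written as <Uxxxx> and that ';' only ever
    separates alternatives.
    """
    source, _, rhs = line.strip().partition(" ")

    def decode(symbols):
        return "".join(
            chr(int(h, 16)) for h in re.findall(r"<U([0-9A-Fa-f]{4,8})>", symbols)
        )

    return decode(source), [decode(alt) for alt in rhs.split(";")]


def pick(candidates, charset):
    """Return the first candidate the target charset can encode, else None."""
    for cand in candidates:
        try:
            cand.encode(charset)
            return cand
        except UnicodeEncodeError:
            continue
    return None


source, candidates = parse_rule('<U0428> "<U0160>";"<U0053><U0068>"')
print(pick(candidates, "iso8859-15"))  # Š (System A form)
print(pick(candidates, "ascii"))       # Sh (System B fallback)
```

The point of the design is that a single rule serves both targets: the
richer form is used wherever the charset allows it, and ASCII still gets
a result.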

Thanks,

On 2018-10-05 15:00, Egor Kobylkin wrote:
> Hi Marko,
> 
> I chose System B because it is ASCII compatible. System A is
> not ASCII compatible (its target contains diacritics).
> 
> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
> "GOST 7.79 contains two transliteration tables.
> 
> System A
>     one Cyrillic character to one Latin character, some with diacritics
> – identical to ISO 9:1995
> 
> System B
>     one Cyrillic character to one or many Latin characters without
> diacritics
> "
> Hope this helps,
> Egor
> 
> On 05.10.2018 13:54, Marko Myllynen wrote:
>> Hi,
>>
>> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
>> possible and if not, then fall back to GOST 7.79 System B?
>>
>> Implementation-wise current translit_* files have few examples where a
>> non-ASCII transliteration is tried first before an ASCII fallback. These
>> examples are from translit_neutral:
>>
>> % NARROW NO-BREAK SPACE
>> <U202F> <U00A0>;<U0020>
>> % REVERSED TRIPLE PRIME
>> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>>
>> Thanks,
>>
>> On 2018-10-05 13:29, Egor Kobylkin wrote:
>>> Keld,Marko,Rafal, other locale maintainers,
>>>
>>> this all is written with having in mind a minimal viable fix for this
>>> bug asap. I want to avoid wasting maintainers time getting into
>>> fundamental discussions here (although for perfectly good reasons).
>>>
>>> I see three options:
>>> 1. those locale maintainers that are fine with using ISO
>>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
>>> in their locales (see attached screenshot of the table).
>>> 2. those that that want to have a differing table can create their own
>>> variety based on the spreadsheet I have prepared
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
>>> this patch.
>>> 3. those that want to omit a cyrillic transliteration altogether for now
>>> state so and just carry over the bug #2872 from the year 2006.
>>>
>>> Does this make sense to you?
>>>
>>> Just to be super clear on this: the patch is a stopgap _ASCII_
>>> transliteration table. ASCII being AMERICAN Standard Code for
>>> Information Interchange, that is obviously orthogonal to any
>>> transliteration rule of other countries. As such it is not explicitly
>>> targeting transliteration standards of any country.
>>>
>>> The fact that the patch is reflecting Russian variety of ISO
>>> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
>>> available and can be helpful to a majority of cyrillic users b) I have
>>> access to it including via being proficient in Russian.
>>>
>>> It is offered to all the respective locale maintainers as a stopgap
>>> solution. Stopgap in the sense that it is better to have some
>>> transliteration than not to have any at all and carry over the bug from
>>> 2006. That it may be a somewhat officially correct transliteration for
>>> ru_RU is a bonus. In that sense I would dub the discussion on the
>>> correctness for other languages "offtopic". Let me know if this is not OK.
>>>
>>> You are all are correctly mentioning the deficiencies of this approach.
>>> However, I couldn't find a better straightforward approach as of yet.
>>> Happy to hear from you as on how this could be handled.
>>>
>>> There is a danger of being caught in the web of language/country
>>> differences. I propose just pruning the locales that are not comfortable
>>> including this current table. We can address possible solutions in the
>>> second wave of patching.
>>>
>>> I am vary of getting into discussions on specific country variants just
>>> because of the sheer complexity of this topic. It is probably better
>>> addressed by respective maintainers of their locales. I do not see a
>>> "one fits all" solution in this first wave possible.
>>>
>>> I would like to have this "three options plan of action" vetted first
>>> and then we could go to the specific detail. (Like, for instance, what
>>> characters should be included in to the table, and in which
>>> transliteration form.)
>>>
>>> I am looking forward to your reply,
>>> Egor Kobylkin
>>>
>>> P.S. specifically as to how address languages other than Ru included in
>>> GOST_7.79_System_B: we can take the first option left to right from that
>>> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
>>> locales/languages but with errors where Ru supersedes their own variants.
>>>
>>>
>>> On 05.10.2018 11:20, Rafal Luzynski wrote:
>>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>
>>>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>>>> Hi
>>>>>>
>>>>>> Please note that translitteration of Cyrillic to latin is not universal.
>>>>>> There are different schemes for for example German, English and Danish, and
>>>>>> there is also an ISO standard for it.
>>>>>
>>>>> Thanks for your feedback, Keld!
>>>>>
>>>>> Could the locale maintainers that wouldn't like to include this patch
>>>>> explicitly state so here?
>>>>
>>>> I think it is about me so I must reply.  I am sorry about that and the sole
>>>> reason is my lack of time.  I'm just a volunteer here, that means it's not
>>>> my regular job to work on locale data nor anything in glibc nor in any other
>>>> open source project.  I do these things only in my free time which I don't
>>>> have much.  Of course you will see my contributions here and there but they
>>>> are either trivial or take me months to complete.  Your patches are on my
>>>> radar but I can't tell any ETA for them.  Of course, there are other people
>>>> around here and they are all welcome to come and join.
>>>>
>>>>> That is:
>>>>> - In the case that there is a different preferred cyrillic
>>>>> transliteration table for any specific locale their maintainers may want
>>>>> to point me to it so I can supply a separate table/patch.
>>>>> - Or they could state explicitly that for some reason they would like to
>>>>> exclude their locale from the patch for a default cyrillic
>>>>> transliteration altogether.
>>>>
>>>> As Keld wrote, there are probably separate rules for every language so
>>>> I don't think you should treat your rules as universal and include them
>>>> in every locale.  At first sight, it seems to me they work only for English
>>>> (as a destination locale).  Also, although it is called "transliteration
>>>> from Cyrillic" it seems that it covers only Russian alphabet.  What about
>>>> other languages which use Cyrillic alphabet but add their own diacritic
>>>> characters?  Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
>>>> Mari, Ossetian, Yakut, Tatar, and more.  What about languages which use
>>>> Cyrillic alphabet but transliterate their respective letters in a different
>>>> way than Russian?  For example, Russian "Ъ" is (I think) usually skipped
>>>> in transliteration, I think you propose "``", but when transliterating from
>>>> Bulgarian they usually transliterate this as "ă".
>>>>
>>>> Few remarks:
>>>>
>>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be better?
>>>> * You transliterate "ц" as "cz", wouldn't "ts" be better?  By the way,
>>>>   in Polish language "cz" is a correct transliteration of "ч".
>>>> * You transliterate "й" as "j", this is fine in many languages but wouldn't
>>>>   "y" be better in English?
>>>> * In case of "е": how will you know if it is correct to transliterate it
>>>>   to "e" or "ie" or "je" or "ye"?
>>>>
>>>> These remarks are obviously incomplete, your patch deserves much more
>>>> attention to review.
>>>>
>>>> Best regards,
>>>>
>>>> Rafal
>>>>
>>>
>>
>>
> 


-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-05 12:21                   ` Marko Myllynen
@ 2018-10-05 20:47                     ` Egor Kobylkin
  2018-10-08 12:40                       ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-05 20:47 UTC (permalink / raw)
  To: Marko Myllynen, Rafal Luzynski, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

After some kind help from Marko in an offline discussion, I realized
that the multi/single character approach I originally took went against
the iconv(1) logic anyway. So there is no harm in dropping it and
adopting Marko's suggestion instead. I will do so and resubmit the patch
with ISO 9:1995/GOST 7.79 System A plus a fallback to GOST 7.79 System B
(for ASCII).

However, this doesn't resolve the issue of the ASCII part being different
for various locales. Again, I invite the locale maintainers to let me
know whether they want to 1) adopt the table I am supplying, 2) write
their own, or 3) ignore the patch altogether. Your feedback is appreciated!

This is the relevant part that helped:
> The first part (ISO-8859-15 or ASCII) defines the target encoding for
> iconv(1). //TRANSLIT is described in the iconv(1) man page as:
> 
> If the string //TRANSLIT is appended to to-encoding, characters
> being converted are transliterated when needed and possible. This
> means that when a character cannot be represented in the target
> character set, it can be approximated through one or several
> similar looking characters. Characters that are outside of the
> target character set and cannot be transliterated are replaced
> with a question mark (?) in the output.
> 
> So in the above examples, iconv(1) encounters the character U+0428,
> which is not part of either target encoding, and since //TRANSLIT
> is specified, iconv(1) tries transliteration according to the rules
> defined above. In the case of ASCII, U+0160 is not part of the
> target encoding, so the next alternative is used.
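
To make the quoted description concrete, here is a rough Python sketch of
the string-level behaviour as I read it from the man page text: a
character that fits the target charset is kept, otherwise the rule
alternatives are tried in order, and if none fits a question mark is
emitted. RULES is a hypothetical two-entry table used only for
illustration, and translit is not a glibc function:

```python
# Hypothetical rule table: alternatives are listed in priority order,
# System A form first, System B (ASCII) form second.
RULES = {
    "\u0428": ["\u0160", "Sh"],  # CYRILLIC CAPITAL LETTER SHA
    "\u0448": ["\u0161", "sh"],  # CYRILLIC SMALL LETTER SHA
}


def can_encode(text, charset):
    """True if every character of text exists in the target charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Approximate //TRANSLIT: keep, transliterate, or replace with '?'."""
    out = []
    for ch in text:
        if can_encode(ch, charset):
            out.append(ch)
            continue
        for cand in RULES.get(ch, []):
            if can_encode(cand, charset):
                out.append(cand)
                break
        else:
            # No rule, or no alternative fits the target charset.
            out.append("?")
    return "".join(out)


print(translit("\u0428\u0448", "iso8859-15"))  # Šš
print(translit("\u0428\u0448", "ascii"))       # Shsh
print(translit("\u042f", "ascii"))             # ? (no rule in this toy table)
```

The last call shows the behaviour reported in BZ #2872: without a rule
for a character, the output degrades to question marks.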

Best regards,
Egor Kobylkin

On 05.10.2018 14:21, Marko Myllynen wrote:
> Hi,
> 
> The scheme I proposed would also be ASCII compatible; consider this 
> example:
> 
> % CYRILLIC CAPITAL LETTER SHA
> <U0428> "<U0160>";"<U0053><U0068>"
> 
> "printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv 
> -f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
>  \\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as 
> per System B.
> 
> Thanks,
> 
> On 2018-10-05 15:00, Egor Kobylkin wrote:
>> Hi Marko,
>> 
>> I chose System B because it is ASCII compatible. System A is
>> not ASCII compatible (its target contains diacritics).
>>
>> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
>> "GOST 7.79 contains two transliteration tables.
>>
>> System A
>>     one Cyrillic character to one Latin character, some with
>> diacritics – identical to ISO 9:1995
>>
>> System B
>>     one Cyrillic character to one or many Latin characters without
>> diacritics
>> "
>> Hope this helps,
>> Egor
>> 
>> On 05.10.2018 13:54, Marko Myllynen wrote:
>>> Hi,
>>> 
>>> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
>>> possible and if not, then fall back to GOST 7.79 System B?
>>> 
>>> Implementation-wise current translit_* files have few examples 
>>> where a non-ASCII transliteration is tried first before an ASCII 
>>> fallback. These examples are from translit_neutral:
>>> 
>>> % NARROW NO-BREAK SPACE
>>> <U202F> <U00A0>;<U0020>
>>> % REVERSED TRIPLE PRIME
>>> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>>> 
>>> Thanks,
>>> 
>>> On 2018-10-05 13:29, Egor Kobylkin wrote:
>>>> Keld,Marko,Rafal, other locale maintainers,
>>>> 
>>>> this all is written with having in mind a minimal viable fix 
>>>> for this bug asap. I want to avoid wasting maintainers time 
>>>> getting into fundamental discussions here (although for 
>>>> perfectly good reasons).
>>>> 
>>>> I see three options: 1. those locale maintainers that are fine 
>>>> with using ISO 9:1995/GOST_7.79_System_B cyrillic 
>>>> transliteration table (Ru) include it in their locales (see 
>>>> attached screenshot of the table). 2. those that that want to 
>>>> have a differing table can create their own variety based on 
>>>> the spreadsheet I have prepared 
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and 
>>>> include it in this patch. 3. those that want to omit a
>>>> cyrillic transliteration altogether for now state so and just
>>>> carry over the bug #2872 from the year 2006.
>>>> 
>>>> Does this make sense to you?
>>>> 
>>>> Just to be super clear on this: the patch is a stopgap _ASCII_
>>>>  transliteration table. ASCII being AMERICAN Standard Code for
>>>>  Information Interchange, that is obviously orthogonal to any 
>>>> transliteration rule of other countries. As such it is not 
>>>> explicitly targeting transliteration standards of any country.
>>>> 
>>>> The fact that the patch is reflecting Russian variety of ISO 
>>>> 9:1995/GOST_7.79_System_B is because a) ISO 
>>>> 9:1995/GOST_7.79_System_B is available and can be helpful to a 
>>>> majority of cyrillic users b) I have access to it including
>>>> via being proficient in Russian.
>>>> 
>>>> It is offered to all the respective locale maintainers as a 
>>>> stopgap solution. Stopgap in the sense that it is better to 
>>>> have some transliteration than not to have any at all and
>>>> carry over the bug from 2006. That it may be a somewhat
>>>> officially correct transliteration for ru_RU is a bonus. In
>>>> that sense I would dub the discussion on the correctness for
>>>> other languages "offtopic". Let me know if this is not OK.
>>>> 
>>>> You are all are correctly mentioning the deficiencies of this 
>>>> approach. However, I couldn't find a better straightforward 
>>>> approach as of yet. Happy to hear from you as on how this
>>>> could be handled.
>>>> 
>>>> There is a danger of being caught in the web of 
>>>> language/country differences. I propose just pruning the 
>>>> locales that are not comfortable including this current table. 
>>>> We can address possible solutions in the second wave of 
>>>> patching.
>>>> 
>>>> I am vary of getting into discussions on specific country 
>>>> variants just because of the sheer complexity of this topic.
>>>> It is probably better addressed by respective maintainers of
>>>> their locales. I do not see a "one fits all" solution in this
>>>> first wave possible.
>>>> 
>>>> I would like to have this "three options plan of action"
>>>> vetted first and then we could go to the specific detail.
>>>> (Like, for instance, what characters should be included in to
>>>> the table, and in which transliteration form.)
>>>> 
>>>> I am looking forward to your reply, Egor Kobylkin
>>>> 
>>>> P.S. specifically as to how address languages other than Ru 
>>>> included in GOST_7.79_System_B: we can take the first option 
>>>> left to right from that table (Ru,By,Uk,Bg,Mk). Then it will 
>>>> technically work for all those locales/languages but with 
>>>> errors where Ru supersedes their own variants.
>>>> 
>>>> 
>>>> On 05.10.2018 11:20, Rafal Luzynski wrote:
>>>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>> 
>>>>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>>>>> Hi
>>>>>>> 
>>>>>>> Please note that translitteration of Cyrillic to latin
>>>>>>> is not universal. There are different schemes for for 
>>>>>>> example German, English and Danish, and there is also an 
>>>>>>> ISO standard for it.
>>>>>> 
>>>>>> Thanks for your feedback, Keld!
>>>>>> 
>>>>>> Could the locale maintainers that wouldn't like to include 
>>>>>> this patch explicitly state so here?
>>>>> 
>>>>> I think it is about me so I must reply.  I am sorry about 
>>>>> that and the sole reason is my lack of time.  I'm just a 
>>>>> volunteer here, that means it's not my regular job to work
>>>>> on locale data nor anything in glibc nor in any other open 
>>>>> source project.  I do these things only in my free time
>>>>> which I don't have much.  Of course you will see my
>>>>> contributions here and there but they are either trivial or
>>>>> take me months to complete.  Your patches are on my radar but
>>>>> I can't tell any ETA for them.  Of course, there are other
>>>>> people around here and they are all welcome to come and
>>>>> join.
>>>>> 
>>>>>> That is: - In the case that there is a different preferred 
>>>>>> cyrillic transliteration table for any specific locale 
>>>>>> their maintainers may want to point me to it so I can 
>>>>>> supply a separate table/patch. - Or they could state 
>>>>>> explicitly that for some reason they would like to exclude 
>>>>>> their locale from the patch for a default cyrillic 
>>>>>> transliteration altogether.
>>>>> 
>>>>> As Keld wrote, there are probably separate rules for every 
>>>>> language so I don't think you should treat your rules as 
>>>>> universal and include them in every locale.  At first sight, 
>>>>> it seems to me they work only for English (as a destination 
>>>>> locale).  Also, although it is called "transliteration from 
>>>>> Cyrillic" it seems that it covers only Russian alphabet. What
>>>>> about other languages which use Cyrillic alphabet but add
>>>>> their own diacritic characters?  Think about Belarusian, 
>>>>> Ukrainian, Serbian, Chechen, Chuvash, Mari, Ossetian, Yakut, 
>>>>> Tatar, and more.  What about languages which use Cyrillic 
>>>>> alphabet but transliterate their respective letters in a 
>>>>> different way than Russian?  For example, Russian "Ъ" is (I 
>>>>> think) usually skipped in transliteration, I think you 
>>>>> propose "``", but when transliterating from Bulgarian they 
>>>>> usually transliterate this as "ă".
>>>>> 
>>>>> Few remarks:
>>>>> 
>>>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be 
>>>>> better? * You transliterate "ц" as "cz", wouldn't "ts" be 
>>>>> better?  By the way, in Polish language "cz" is a correct 
>>>>> transliteration of "ч". * You transliterate "й" as "j", this 
>>>>> is fine in many languages but wouldn't "y" be better in 
>>>>> English? * In case of "е": how will you know if it is
>>>>> correct to transliterate it to "e" or "ie" or "je" or "ye"?
>>>>> 
>>>>> These remarks are obviously incomplete, your patch deserves 
>>>>> much more attention to review.
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Rafal
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-05 20:47                     ` Egor Kobylkin
@ 2018-10-08 12:40                       ` Marko Myllynen
  2018-10-08 22:23                         ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-08 12:40 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi,

Thanks for the update. I have a few mostly cosmetic comments below;
hopefully we'll hear from others whether they agree with this direction.

- Please add the standard glibc locale header (see the existing
translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks are
supported (perhaps also describe why some of those are not included, see
e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespace and spaces after ;
- No duplicates:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>

should become:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>

- There are a few issues with the definitions:

% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"

% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"

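The cosmetic points above lend themselves to a mechanical check. This is
only a sketch in Python under my own assumptions: normalize_rule and
source_length are hypothetical helpers, alternatives are compared as plain
text, and flagging a left-hand side with more than one code point is my
guess at what is wrong with the "CYRILLIC UNDEFINED" entries, not
something stated in this thread:

```python
import re


def normalize_rule(line):
    """Drop duplicate alternatives, spaces after ';' and trailing whitespace.

    Comment lines (starting with '%') and empty lines are returned as is,
    apart from trailing whitespace.
    """
    line = line.rstrip()
    if not line or line.startswith("%"):
        return line
    source, _, rhs = line.partition(" ")
    seen = []
    for alt in (a.strip() for a in rhs.split(";")):
        if alt and alt not in seen:
            seen.append(alt)
    return source + " " + ";".join(seen)


def source_length(line):
    """Number of <Uxxxx> code points on the left-hand side of a rule."""
    return len(re.findall(r"<U[0-9A-Fa-f]{4,8}>", line.split(" ", 1)[0]))


print(normalize_rule("<U0435> <U0065>; <U0065>"))
# <U0435> <U0065>
print(source_length('<U0423><U0423> <U00DA>; "<U0055><U0060>"'))
# 2
```

A textual comparison will not catch the same alternative written once
with and once without quotes; that would need the symbols to be decoded
first.
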
I wonder whether it would be possible to automate the generation of this
file so that issues like the above could be avoided. But perhaps that
could be the next step once this initial patch lands.
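
As a starting point for such automation, here is a small Python sketch
that emits rule lines from a table. TABLE, to_ucs, format_alt and generate
are all hypothetical names, the two entries are only sample data, and the
choice to quote multi-character alternatives while leaving single
characters unquoted is an assumption modelled on the translit_neutral
examples quoted earlier:

```python
import unicodedata

# Hypothetical input: Cyrillic character -> alternatives, best first.
TABLE = {
    "\u0428": ["\u0160", "Sh"],
    "\u0435": ["e", "e"],  # deliberate duplicate to show de-duplication
}


def to_ucs(text):
    """Render a string in the <Uxxxx> notation used by locale sources."""
    return "".join("<U%04X>" % ord(ch) for ch in text)


def format_alt(text):
    """Quote multi-character alternatives, leave single ones bare."""
    ucs = to_ucs(text)
    return '"%s"' % ucs if len(text) > 1 else ucs


def generate(table):
    """Return translit rule lines, one comment plus one rule per character."""
    lines = []
    for ch in sorted(table):
        # dict.fromkeys keeps the first occurrence and preserves order.
        alts = list(dict.fromkeys(table[ch]))
        lines.append("% " + unicodedata.name(ch))
        lines.append(to_ucs(ch) + " " + ";".join(format_alt(a) for a in alts))
    return "\n".join(lines)


print(generate(TABLE))
```

Taking the comment text from the Unicode character name means an entry
cannot end up labelled "UNDEFINED", and duplicates are removed before
anything is written.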

Thanks,

On 2018-10-05 23:47, Egor Kobylkin wrote:
> After some kind help from Marko in an offline discussion, I realized
> that the multi/single character approach I originally took went against
> the iconv(1) logic anyway. So there is no harm in dropping it and
> adopting Marko's suggestion instead. I will do so and resubmit the patch
> with ISO 9:1995/GOST 7.79 System A plus a fallback to GOST 7.79 System B
> (for ASCII).
> 
> However, this doesn't resolve the issue of the ASCII part being different
> for various locales. Again, I invite the locale maintainers to let me
> know whether they want to 1) adopt the table I am supplying, 2) write
> their own, or 3) ignore the patch altogether. Your feedback is appreciated!
> 
> This is the relevant part that helped:
>> The first part (ISO-8859-15 or ASCII) defines the target encoding for
>> iconv(1). //TRANSLIT is described in the iconv(1) man page as:
>>
>> If the string //TRANSLIT is appended to to-encoding, characters
>> being converted are transliterated when needed and possible. This
>> means that when a character cannot be represented in the target
>> character set, it can be approximated through one or several
>> similar looking characters. Characters that are outside of the
>> target character set and cannot be transliterated are replaced
>> with a question mark (?) in the output.
>>
>> So in the above examples, iconv(1) encounters the character U+0428,
>> which is not part of either target encoding, and since //TRANSLIT
>> is specified, iconv(1) tries transliteration according to the rules
>> defined above. In the case of ASCII, U+0160 is not part of the
>> target encoding, so the next alternative is used.
> 
> Best regards,
> Egor Kobylkin
> 
> On 05.10.2018 14:21, Marko Myllynen wrote:
>> Hi,
>>
>> The scheme I proposed would also be ASCII compatible; consider this 
>> example:
>>
>> % CYRILLIC CAPITAL LETTER SHA
>> <U0428> "<U0160>";"<U0053><U0068>"
>>
>> "printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv 
>> -f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
>>  \\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as 
>> per System B.
>>
>> Thanks,
>>
>> On 2018-10-05 15:00, Egor Kobylkin wrote:
>>> Hi Marko,
>>>
>>> I chose System B because it is ASCII compatible. System A is
>>> not ASCII compatible (its target contains diacritics).
>>>
>>> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
>>> "GOST 7.79 contains two transliteration tables.
>>>
>>> System A
>>>     one Cyrillic character to one Latin character, some with
>>> diacritics – identical to ISO 9:1995
>>>
>>> System B
>>>     one Cyrillic character to one or many Latin characters without
>>> diacritics
>>> "
>>> Hope this helps,
>>> Egor
>>>
>>> On 05.10.2018 13:54, Marko Myllynen wrote:
>>>> Hi,
>>>>
>>>> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
>>>> possible and if not, then fall back to GOST 7.79 System B?
>>>>
>>>> Implementation-wise current translit_* files have few examples 
>>>> where a non-ASCII transliteration is tried first before an ASCII 
>>>> fallback. These examples are from translit_neutral:
>>>>
>>>> % NARROW NO-BREAK SPACE
>>>> <U202F> <U00A0>;<U0020>
>>>> % REVERSED TRIPLE PRIME
>>>> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>>>>
>>>> Thanks,
>>>>
>>>> On 2018-10-05 13:29, Egor Kobylkin wrote:
>>>>> Keld, Marko, Rafal, other locale maintainers,
>>>>>
>>>>> this is all written with a minimal viable fix for this bug in
>>>>> mind, asap. I want to avoid wasting maintainers' time getting
>>>>> into fundamental discussions here (although for perfectly good
>>>>> reasons).
>>>>>
>>>>> I see three options:
>>>>> 1. those locale maintainers that are fine with using ISO
>>>>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru)
>>>>> include it in their locales (see attached screenshot of the
>>>>> table).
>>>>> 2. those that want to have a differing table can create their
>>>>> own variety based on the spreadsheet I have prepared
>>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and
>>>>> include it in this patch.
>>>>> 3. those that want to omit a cyrillic transliteration altogether
>>>>> for now state so and just carry over the bug #2872 from the
>>>>> year 2006.
>>>>>
>>>>> Does this make sense to you?
>>>>>
>>>>> Just to be super clear on this: the patch is a stopgap _ASCII_
>>>>>  transliteration table. ASCII being AMERICAN Standard Code for
>>>>>  Information Interchange, that is obviously orthogonal to any 
>>>>> transliteration rule of other countries. As such it is not 
>>>>> explicitly targeting transliteration standards of any country.
>>>>>
>>>>> The fact that the patch is reflecting Russian variety of ISO 
>>>>> 9:1995/GOST_7.79_System_B is because a) ISO 
>>>>> 9:1995/GOST_7.79_System_B is available and can be helpful to a 
>>>>> majority of cyrillic users b) I have access to it including
>>>>> via being proficient in Russian.
>>>>>
>>>>> It is offered to all the respective locale maintainers as a 
>>>>> stopgap solution. Stopgap in the sense that it is better to 
>>>>> have some transliteration than not to have any at all and
>>>>> carry over the bug from 2006. That it may be a somewhat
>>>>> officially correct transliteration for ru_RU is a bonus. In
>>>>> that sense I would dub the discussion on the correctness for
>>>>> other languages "offtopic". Let me know if this is not OK.
>>>>>
>>>>> You are all correctly pointing out the deficiencies of this
>>>>> approach. However, I couldn't find a better straightforward
>>>>> approach as of yet. Happy to hear from you on how this could
>>>>> be handled.
>>>>>
>>>>> There is a danger of being caught in the web of 
>>>>> language/country differences. I propose just pruning the 
>>>>> locales that are not comfortable including this current table. 
>>>>> We can address possible solutions in the second wave of 
>>>>> patching.
>>>>>
>>>>> I am wary of getting into discussions on specific country
>>>>> variants just because of the sheer complexity of this topic.
>>>>> It is probably better addressed by the respective maintainers
>>>>> of their locales. I do not see a "one size fits all" solution
>>>>> as possible in this first wave.
>>>>>
>>>>> I would like to have this "three options plan of action"
>>>>> vetted first and then we could go into the specific details.
>>>>> (Like, for instance, what characters should be included in
>>>>> the table, and in which transliteration form.)
>>>>>
>>>>> I am looking forward to your reply, Egor Kobylkin
>>>>>
>>>>> P.S. specifically as to how to address languages other than Ru
>>>>> included in GOST_7.79_System_B: we can take the first option
>>>>> left to right from that table (Ru,By,Uk,Bg,Mk). Then it will
>>>>> technically work for all those locales/languages but with
>>>>> errors where Ru supersedes their own variants.
>>>>>
>>>>>
>>>>> On 05.10.2018 11:20, Rafal Luzynski wrote:
>>>>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>>>
>>>>>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Please note that transliteration of Cyrillic to Latin
>>>>>>>> is not universal. There are different schemes for, for
>>>>>>>> example, German, English and Danish, and there is also an
>>>>>>>> ISO standard for it.
>>>>>>>
>>>>>>> Thanks for your feedback, Keld!
>>>>>>>
>>>>>>> Could the locale maintainers that wouldn't like to include 
>>>>>>> this patch explicitly state so here?
>>>>>>
>>>>>> I think it is about me so I must reply.  I am sorry about 
>>>>>> that and the sole reason is my lack of time.  I'm just a 
>>>>>> volunteer here, that means it's not my regular job to work
>>>>>> on locale data nor anything in glibc nor in any other open 
>>>>>> source project.  I do these things only in my free time
>>>>>> which I don't have much.  Of course you will see my
>>>>>> contributions here and there but they are either trivial or
>>>>>> take me months to complete.  Your patches are on my radar but
>>>>>> I can't tell any ETA for them.  Of course, there are other
>>>>>> people around here and they are all welcome to come and
>>>>>> join.
>>>>>>
>>>>>>> That is:
>>>>>>> - In the case that there is a different preferred cyrillic
>>>>>>> transliteration table for any specific locale their
>>>>>>> maintainers may want to point me to it so I can supply a
>>>>>>> separate table/patch.
>>>>>>> - Or they could state explicitly that for some reason they
>>>>>>> would like to exclude their locale from the patch for a
>>>>>>> default cyrillic transliteration altogether.
>>>>>>
>>>>>> As Keld wrote, there are probably separate rules for every 
>>>>>> language so I don't think you should treat your rules as 
>>>>>> universal and include them in every locale.  At first sight, 
>>>>>> it seems to me they work only for English (as a destination 
>>>>>> locale).  Also, although it is called "transliteration from 
>>>>>> Cyrillic" it seems that it covers only Russian alphabet. What
>>>>>> about other languages which use Cyrillic alphabet but add
>>>>>> their own diacritic characters?  Think about Belarusian, 
>>>>>> Ukrainian, Serbian, Chechen, Chuvash, Mari, Ossetian, Yakut, 
>>>>>> Tatar, and more.  What about languages which use Cyrillic 
>>>>>> alphabet but transliterate their respective letters in a 
>>>>>> different way than Russian?  For example, Russian "Ъ" is (I 
>>>>>> think) usually skipped in transliteration, I think you 
>>>>>> propose "``", but when transliterating from Bulgarian they 
>>>>>> usually transliterate this as "ă".
>>>>>>
>>>>>> A few remarks:
>>>>>>
>>>>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be
>>>>>> better?
>>>>>> * You transliterate "ц" as "cz", wouldn't "ts" be better?  By
>>>>>> the way, in Polish language "cz" is a correct transliteration
>>>>>> of "ч".
>>>>>> * You transliterate "й" as "j", this is fine in many languages
>>>>>> but wouldn't "y" be better in English?
>>>>>> * In case of "е": how will you know if it is correct to
>>>>>> transliterate it to "e" or "ie" or "je" or "ye"?
>>>>>>
>>>>>> These remarks are obviously incomplete, your patch deserves 
>>>>>> much more attention to review.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Rafal
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
> 


-- 
Marko Myllynen


* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-05 10:36             ` Egor Kobylkin
@ 2018-10-08 22:04               ` Rafal Luzynski
  2018-10-08 22:52                 ` Egor Kobylkin
                                   ` (2 more replies)
  0 siblings, 3 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-08 22:04 UTC (permalink / raw)
  To: Egor Kobylkin, Keld Simonsen, Marko Myllynen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

5.10.2018 12:36 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> I see three options:
> 1. those locale maintainers that are fine with using ISO
> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
> in their locales. https://sourceware.org/bugzilla/attachment.cgi?id=11289
> 2. those that want to have a differing table can create their own
> variety based on the spreadsheet I have prepared
> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
> this patch.
> 3. those that want to omit a cyrillic transliteration altogether for now
> state so and just carry over the bug #2872 from the year 2006.
>
> Does this make sense to you?

The problem is that we don't have a separate maintainer for each locale,
we have only 2 maintainers for about 200 locales and we must represent
them all.  Sometimes a locale may happen to be our own native locale or
that of someone on this list, or it may be a language which we happen to
speak as a foreign language, or we may have friends who can speak it.
Or it may be totally unknown and we still must somehow handle it.

I think that these transliteration rules should be included in multiple
locales on an "opt-in" basis rather than "opt-out".  I mean, we should not
include them in all locales unless someone explicitly provides different
rules.  Instead, I think we should add them (maybe with modifications)
only to those locales where we have a good reason to think they will work.

In particular, I think that those rules will not be helpful at all for
languages which use neither the Latin nor the Cyrillic alphabet.

> [...]
> The fact that the patch is reflecting Russian variety of ISO
> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
> available and can be helpful to a majority of cyrillic users b) I have
> access to it including via being proficient in Russian.

I took a look at these standards.  At first I doubted they were correct
for the English language, but now I understand they were created for
Russian users.  Therefore I think it is correct to include them in the
Russian locale data.  Will it be OK if we say that it is only for the
Russian language?  Will that be satisfactory for you and/or your users?

> It is offered to all the respective locale maintainers as a stopgap
> solution. Stopgap in the sense that it is better to have some
> transliteration than not to have any at all and carry over the bug from
> 2006. That it may be a somewhat officially correct transliteration for
> ru_RU is a bonus. In that sense I would dub the discussion on the
> correctness for other languages "offtopic". Let me know if this is not OK.

If you refer to languages other than Russian which also use the Cyrillic
alphabet but need different transliteration rules than Russian for
the same characters, then it is OK for me now.  I am afraid that the iconv
algorithm does not handle such a case.  Of course, we should add this
missing feature eventually but I do not volunteer to do it now.

> [...]
> P.S. specifically as to how to address languages other than Ru included
> in GOST_7.79_System_B: we can take the first option left to right from
> that table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
> locales/languages but with errors where Ru supersedes their own variants.

Makes sense, as long as we cannot select the source language now.

But, while we are at it, is there anything that stops us from adding
transliteration rules for additional Cyrillic characters not used in
Russian but used in other languages?

Regards,

Rafal


* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 12:40                       ` Marko Myllynen
@ 2018-10-08 22:23                         ` Rafal Luzynski
  2018-10-08 23:35                           ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-08 22:23 UTC (permalink / raw)
  To: Marko Myllynen, Egor Kobylkin, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
> Hi,
>
> Thanks for the update. I have a few mostly cosmetic comments below,
> hopefully we'll hear from others whether they agree with this direction.
>
> - Please add the standard glibc locale header (see the existing
> translit_* files for reference)
> - Consider wrapping the header lines at or around column 70-72
> - Consider describing which characters, character ranges, or blocks are
> supported (perhaps also describe why some of those are not included, see
> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
> - Please remove trailing whitespace and spaces after ;

Thanks for this, Marko.  While we are at it, in the ChangeLog and in the
commit message, entries like this one:

	* locales/aa_DJ: likewise

1. Should use a path relative to the root directory of the glibc source,
   that is: "* localedata/locales/aa_DJ".
2. Should say "Likewise." (starting with an uppercase letter and ending
   with a dot).

> - No duplicates:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>; <U0065>
>
> should become:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>
>
> - There are a few issues with the definitions:
>
> % CYRILLIC CAPITAL LETTER U
> <U0423> <U0055>; <U0055>
> % CYRILLIC UNDEFINED
> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>
> % CYRILLIC SMALL LETTER U
> <U0443> <U0075>; <U0075>
> % CYRILLIC UNDEFINED
> <U0443><U0443> <U00FA>; "<U0075><U0060>"

Are the duplicates here because some Cyrillic letters may have multiple
Latin transliterations depending on the context, for example Cyrillic IE
must be transliterated sometimes as "e", sometimes as "ie", sometimes
as "ye" or "je"?  Can we provide rules for groups of characters instead?

> I wonder whether it would be possible to automate generation of this
> file so that issues like the above could be avoided? But perhaps that
> could be the next step once this initial patch lands.

I agree with this.
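
As a rough illustration of what such automation could look like, here is
a minimal Python sketch.  It assumes each table row is a single source
character plus an ordered list of Latin alternatives (non-ASCII first,
ASCII fallback last).  The helper names ucs_symbols and render_rule are
hypothetical and are not part of glibc.  The sketch drops duplicate
alternatives and writes no space after ";", as requested above.

```python
import unicodedata


def ucs_symbols(text):
    """Render text as <Uxxxx> symbols; multi-character strings are quoted."""
    symbols = "".join("<U%04X>" % ord(ch) for ch in text)
    if len(text) > 1:
        return '"' + symbols + '"'
    return symbols


def render_rule(source, alternatives):
    """Return a comment line plus one rule line for a single source character.

    Duplicate alternatives are removed while keeping their original order.
    """
    unique = list(dict.fromkeys(alternatives))
    comment = "% " + unicodedata.name(source)
    rule = ucs_symbols(source) + " " + ";".join(ucs_symbols(a) for a in unique)
    return comment + "\n" + rule


if __name__ == "__main__":
    # CYRILLIC CAPITAL LETTER SHA: S with caron first, then the ASCII "Sh".
    print(render_rule("\u0428", ["\u0160", "Sh"]))
    # CYRILLIC SMALL LETTER IE with a duplicated alternative.
    print(render_rule("\u0435", ["e", "e"]))
```

Generating every line from one table would keep the comment, the source
code point, and the alternatives in sync, which is where the issues
quoted above came from.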

Regards,

Rafal


* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 22:04               ` Rafal Luzynski
@ 2018-10-08 22:52                 ` Egor Kobylkin
  2018-10-09 21:43                   ` Rafal Luzynski
  2018-10-08 23:20                 ` Zack Weinberg
  2018-10-09 16:10                 ` Marko Myllynen
  2 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-08 22:52 UTC (permalink / raw)
  To: Rafal Luzynski, Marko Myllynen
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 6674 bytes --]

Hi Rafal,

> But, while we are at it, is there anything that stops us from adding
> transliteration rules for additional Cyrillic characters not used in
> Russian but used in other languages?

Just to make sure we are not talking at cross purposes: since your last
email on this topic I have, at Marko's suggestion, already implemented
ISO 9 transliteration for all the characters there are. This should
cover most if not all Slavic Cyrillic. You seem to have just noticed and
replied to that email of Marko's as I write mine.

Please also check the spreadsheet version I have just uploaded:
https://sourceware.org/bugzilla/attachment.cgi?id=11298

I am currently absorbing Marko's further suggestions and corrections to
that one and will get back for more discussion once done there. I am
reading your suggestions and taking them to heart, be sure of that.

Two professional translators independently pointed out the difference
between transliteration and transcription to me. Transliteration is
normative (letter for letter) and transcription is phonetic: each letter
becomes whatever combination of Latin letters in the target language
sounds like it to a native speaker. While transliteration should be easy
to cover for all those languages via ISO 9, transcription is inherently
language specific. The problem is we are (mis)using the transcription as
transliteration to ASCII because the ASCII set of characters does not
allow for proper transcription. Another problem is that to be really
useful the ASCII transliteration should work outside of the source locale
(i.e. not only ru_RU but en_US, de_DE, en_DE, es_ES etc. or even just the
C locale).
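
To make the "letter for letter" point concrete, here is a minimal Python
sketch of transliteration as a plain table lookup. The table is only a
small excerpt of the GOST 7.79 System B (Ru) mappings already quoted in
this thread, not the full table, and the names SYSTEM_B_EXCERPT and
transliterate are hypothetical. The capitalisation rule (uppercase only
the first Latin letter) is an assumption made for the sketch.

```python
# Excerpt only: mappings quoted in this thread, keyed by lowercase letter.
SYSTEM_B_EXCERPT = {
    "\u0435": "e",    # CYRILLIC SMALL LETTER IE
    "\u0449": "shh",  # CYRILLIC SMALL LETTER SHCHA
    "\u0451": "yo",   # CYRILLIC SMALL LETTER IO
    "\u0448": "sh",   # CYRILLIC SMALL LETTER SHA
    "\u0446": "cz",   # CYRILLIC SMALL LETTER TSE
    "\u0439": "j",    # CYRILLIC SMALL LETTER SHORT I
    "\u044a": "``",   # CYRILLIC SMALL LETTER HARD SIGN
    "\u044c": "`",    # CYRILLIC SMALL LETTER SOFT SIGN
}


def transliterate(text, table=SYSTEM_B_EXCERPT):
    """Map each character independently; no context is ever consulted."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in table:
            latin = table[lower]
            if ch != lower:
                # Assumption: an uppercase source letter capitalises only
                # the first Latin letter of its replacement.
                latin = latin[:1].upper() + latin[1:]
            out.append(latin)
        elif ord(ch) < 128:
            out.append(ch)   # already ASCII, keep as-is
        else:
            out.append("?")  # no rule in this excerpt
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("\u0435\u0449\u0451"))
```

Because every letter is mapped on its own, a table like this cannot
choose between "e", "ie", "je" and "ye" for the same letter; that choice
belongs to transcription and depends on the target language.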

In fact, for my part I would be committed to doing all the work needed to
cover at least C, en_US, ru_RU, de_DE in that order. ru_RU as a
"courtesy": I am not really using it but hope more contributors for
locales may come because of that and fix my bugs :-).


> The problem is that we don't have a separate maintainer for each
> locale, we have only 2 maintainers for about 200 locales and we must
> represent them all.

It was not clear to me that the glibc team cannot fall back on the
individual locale maintainers to make the decision. But then it may make
the decision making even easier. If you guys have a list of requirements
(maybe implicit until now) could you please shoot them my way? We can
also certainly just keep this thread up and have all issues ironed out.

Anyway, hopefully with ISO 9 as the first column in translit_cyrillic
we now cover the issue of the completeness of transliteration. What we
need to figure out is transcription/transliteration to ASCII, i.e. the
second column.
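
The role of the two columns can be modelled with a short Python sketch.
It only imitates the behaviour Marko described (try each alternative in
order and use the first one the target encoding can represent); it is
not glibc's implementation, and pick_alternative is a hypothetical
helper name. Python's own codecs stand in for the iconv target
encodings.

```python
def pick_alternative(alternatives, target_encoding):
    """Return the first alternative representable in target_encoding.

    Falls back to "?" when none of the alternatives can be encoded,
    mirroring the question mark described in the iconv(1) man page.
    """
    for candidate in alternatives:
        try:
            candidate.encode(target_encoding)
        except UnicodeEncodeError:
            continue
        return candidate
    return "?"


if __name__ == "__main__":
    # Rule for CYRILLIC CAPITAL LETTER SHA: ISO 9 column, then ASCII column.
    sha_rule = ["\u0160", "Sh"]
    print(pick_alternative(sha_rule, "iso8859-15"))
    print(pick_alternative(sha_rule, "ascii"))
```

With an ISO-8859-15 target the first column (S with caron) is used; with
an ASCII target it is skipped and the second column ("Sh") is used.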

Are we sharing the same view on this?

Speaking of decision making: maybe I can get an officially certified
court translator to answer our questions. Would you care to put together
a list of questions you would like answered to make a decision on the
table/inclusion into various locales?

Hope this helps,
Egor


On 09.10.2018 00:04, Rafal Luzynski wrote:
> 5.10.2018 12:36 Egor Kobylkin <egor@kobylkin.com> wrote:
>> [...]
>> I see three options:
>> 1. those locale maintainers that are fine with using ISO
>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru)
>> include it in their locales.
>> https://sourceware.org/bugzilla/attachment.cgi?id=11289
>> 2. those that want to have a differing table can create their own
>> variety based on the spreadsheet I have prepared
>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include
>> it in this patch.
>> 3. those that want to omit a cyrillic transliteration altogether
>> for now state so and just carry over the bug #2872 from the year
>> 2006.
>> 
>> Does this make sense to you?
> 
> The problem is that we don't have a separate maintainer for each
> locale, we have only 2 maintainers for about 200 locales and we must
> represent them all.  Sometimes a locale may happen to be our own
> native locale or that of someone on this list, or it may be a
> language which we happen to speak as a foreign language, or we may
> have friends who can speak it. Or it may be totally unknown and we
> still must somehow handle it.
> 
> I think that these transliteration rules should be included in
> multiple locales on an "opt-in" basis rather than "opt-out".  I mean,
> we should not include them in all locales unless someone explicitly
> provides different rules.  Instead, I think we should add them
> (maybe with modifications) only to those locales where we have a good
> reason to think they will work.
> 
> Particularly, I think that those rules will not be helpful at all
> for the languages which use neither Latin nor Cyrillic alphabet.
> 
>> [...] The fact that the patch is reflecting Russian variety of ISO 
>> 9:1995/GOST_7.79_System_B is because a) ISO
>> 9:1995/GOST_7.79_System_B is available and can be helpful to a
>> majority of cyrillic users b) I have access to it including via
>> being proficient in Russian.
> 
> I took a look at these standards.  At first I doubted they were
> correct for the English language, but now I understand they were
> created for Russian users.  Therefore I think it is correct to
> include them in the Russian locale data.  Will it be OK if we say
> that it is only for the Russian language?  Will that be satisfactory
> for you and/or your users?
> 
>> It is offered to all the respective locale maintainers as a
>> stopgap solution. Stopgap in the sense that it is better to have
>> some transliteration than not to have any at all and carry over the
>> bug from 2006. That it may be a somewhat officially correct
>> transliteration for ru_RU is a bonus. In that sense I would dub the
>> discussion on the correctness for other languages "offtopic". Let
>> me know if this is not OK.
> 
> If you refer to languages other than Russian which also use the
> Cyrillic alphabet but need different transliteration rules than
> Russian for the same characters, then it is OK for me now.  I am
> afraid that the iconv algorithm does not handle such a case.  Of
> course, we should add this missing feature eventually but I do not
> volunteer to do it now.
> 
>> [...] P.S. specifically as to how to address languages other than
>> Ru included in GOST_7.79_System_B: we can take the first option
>> left to right from that table (Ru,By,Uk,Bg,Mk). Then it will
>> technically work for all those locales/languages but with errors
>> where Ru supersedes their own variants.
> 
> Makes sense, as long as we cannot select the source language now.
> 
> But, while we are at it, is there anything that stops us from adding
> transliteration rules for additional Cyrillic characters not used in
> Russian but used in other languages?
> 
> Regards,
> 
> Rafal
> 


[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 16669 bytes --]

From: Marko Myllynen <myllynen@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, Rafal Luzynski <digitalfreak@lingonborough.com>, Keld Simonsen <keld@keldix.com>
Cc: libc-alpha@sourceware.org, libc-locales@sourceware.org, "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>, Carlos O'Donell <carlos@redhat.com>, Max Kutny <mkutny@gmail.com>, danilo@gnome.org
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
Date: Mon, 8 Oct 2018 15:40:53 +0300
Message-ID: <f51992ad-008b-03a4-8880-4c12edced53b@redhat.com>

Hi,

Thanks for the update. I have a few mostly cosmetic comments below,
hopefully we'll hear from others whether they agree with this direction.

- Please add the standard glibc locale header (see the existing
translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks are
supported (perhaps also describe why some of those are not included, see
e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespace and spaces after ;
- No duplicates:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>

should become:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>

- There are a few issues with the definitions:

% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"

% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"

I wonder whether it would be possible to automate generation of this
file so that issues like the above could be avoided? But perhaps that
could be the next step once this initial patch lands.

Thanks,

On 2018-10-05 23:47, Egor Kobylkin wrote:
> After some kind help from Marko in the offline discussion
> I realized the multi/single character approach I originally took was
> against the iconv(1) logic anyway. So there is no harm in
> dropping it and adopting Marko's suggestion instead. I will do so and
> will resubmit the patch with ISO 9:1995/GOST 7.79 System A + fallback to
> GOST 7.79 System B (for ASCII).
> 
> However, this doesn't resolve the issue of the ASCII part being
> different for various locales. Again, I invite the locale maintainers
> to let me know if they want to 1) adopt the one I am supplying, 2) write
> their own or 3) ignore the patch altogether. Your feedback is appreciated!
> 
> This is the relevant part that helped:
>> The first part (ISO-8859-15 or ASCII) defines the target encoding for
>> iconv(1). //TRANSLIT is described in the iconv(1) man page as:
>>
>> If the string //TRANSLIT is appended to to-encoding, characters
>> being converted are transliterated when needed and possible. This
>> means that when a character cannot be represented in the target
>> character set, it can be approximated through one or several
>> similar looking characters. Characters that are outside of the
>> target character set and cannot be transliterated are replaced
>> with a question mark (?) in the output.
>>
>> So in the above examples, iconv(1) encounters the character U+0428,
>> which is not part of either target encoding, and since //TRANSLIT
>> is specified, iconv(1) tries transliteration according to the rules
>> defined above. In the case of ASCII, U+0160 is not part of the
>> target encoding either, so the next alternative is used.
> 
> Bests,
> Egor Kobylkin
> 
> On 05.10.2018 14:21, Marko Myllynen wrote:
>> Hi,
>>
>> The scheme I proposed would also be ASCII compatible; consider this 
>> example:
>>
>> % CYRILLIC CAPITAL LETTER SHA
>> <U0428> "<U0160>";"<U0053><U0068>"
>>
>> "printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv 
>> -f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
>>  \\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as 
>> per System B.
>>
>> Thanks,
>>
>> On 2018-10-05 15:00, Egor Kobylkin wrote:
>>> Hi Marko,
>>>
>>> I have chosen System B because it is ASCII compatible. System
>>> A is not ASCII compatible (diacritics in target).
>>>
>>> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
>>>
>>> "GOST 7.79 contains two transliteration tables.
>>>
>>> System A one Cyrillic character to one Latin character, some with
>>> diacritics – identical to ISO 9:1995
>>>
>>> System B one Cyrillic character to one or many Latin characters
>>> without diacritics"
>>>
>>> Hope this helps,
>>> Egor
>>>
>>> On 05.10.2018 13:54, Marko Myllynen wrote:
>>>> Hi,
>>>>
>>>> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
>>>> possible and if not, then fall back to GOST 7.79 System B?
>>>>
>>>> Implementation-wise, the current translit_* files have a few
>>>> examples where a non-ASCII transliteration is tried first before
>>>> an ASCII fallback. These examples are from translit_neutral:
>>>>
>>>> % NARROW NO-BREAK SPACE
>>>> <U202F> <U00A0>;<U0020>
>>>> % REVERSED TRIPLE PRIME
>>>> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>>>>
>>>> Thanks,
>>>>
>>>> On 2018-10-05 13:29, Egor Kobylkin wrote:
>>>>> Keld, Marko, Rafal, other locale maintainers,
>>>>>
>>>>> this is all written with a minimal viable fix for this bug in
>>>>> mind, asap. I want to avoid wasting maintainers' time getting
>>>>> into fundamental discussions here (although for perfectly good
>>>>> reasons).
>>>>>
>>>>> I see three options:
>>>>> 1. those locale maintainers that are fine with using ISO
>>>>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru)
>>>>> include it in their locales (see attached screenshot of the
>>>>> table).
>>>>> 2. those that want to have a differing table can create their
>>>>> own variety based on the spreadsheet I have prepared
>>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and
>>>>> include it in this patch.
>>>>> 3. those that want to omit a cyrillic transliteration altogether
>>>>> for now state so and just carry over the bug #2872 from the
>>>>> year 2006.
>>>>>
>>>>> Does this make sense to you?
>>>>>
>>>>> Just to be super clear on this: the patch is a stopgap _ASCII_
>>>>>  transliteration table. ASCII being AMERICAN Standard Code for
>>>>>  Information Interchange, that is obviously orthogonal to any 
>>>>> transliteration rule of other countries. As such it is not 
>>>>> explicitly targeting transliteration standards of any country.
>>>>>
>>>>> The fact that the patch is reflecting Russian variety of ISO 
>>>>> 9:1995/GOST_7.79_System_B is because a) ISO 
>>>>> 9:1995/GOST_7.79_System_B is available and can be helpful to a 
>>>>> majority of cyrillic users b) I have access to it including
>>>>> via being proficient in Russian.
>>>>>
>>>>> It is offered to all the respective locale maintainers as a 
>>>>> stopgap solution. Stopgap in the sense that it is better to 
>>>>> have some transliteration than not to have any at all and
>>>>> carry over the bug from 2006. That it may be a somewhat
>>>>> officially correct transliteration for ru_RU is a bonus. In
>>>>> that sense I would dub the discussion on the correctness for
>>>>> other languages "offtopic". Let me know if this is not OK.
>>>>>
>>>>> You are all correctly pointing out the deficiencies of this
>>>>> approach. However, I couldn't find a better straightforward
>>>>> approach as of yet. Happy to hear from you on how this could
>>>>> be handled.
>>>>>
>>>>> There is a danger of being caught in the web of 
>>>>> language/country differences. I propose just pruning the 
>>>>> locales that are not comfortable including this current table. 
>>>>> We can address possible solutions in the second wave of 
>>>>> patching.
>>>>>
>>>>> I am wary of getting into discussions on specific country
>>>>> variants just because of the sheer complexity of this topic.
>>>>> It is probably better addressed by the respective maintainers
>>>>> of their locales. I do not see a "one size fits all" solution
>>>>> as possible in this first wave.
>>>>>
>>>>> I would like to have this "three options plan of action"
>>>>> vetted first and then we could go into the specific details.
>>>>> (Like, for instance, what characters should be included in
>>>>> the table, and in which transliteration form.)
>>>>>
>>>>> I am looking forward to your reply, Egor Kobylkin
>>>>>
>>>>> P.S. specifically as to how to address languages other than Ru
>>>>> included in GOST_7.79_System_B: we can take the first option
>>>>> left to right from that table (Ru,By,Uk,Bg,Mk). Then it will
>>>>> technically work for all those locales/languages but with
>>>>> errors where Ru supersedes their own variants.
>>>>>
>>>>>
>>>>> On 05.10.2018 11:20, Rafal Luzynski wrote:
>>>>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>>>
>>>>>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Please note that transliteration of Cyrillic to Latin
>>>>>>>> is not universal. There are different schemes for, for
>>>>>>>> example, German, English and Danish, and there is also an
>>>>>>>> ISO standard for it.
>>>>>>>
>>>>>>> Thanks for your feedback, Keld!
>>>>>>>
>>>>>>> Could the locale maintainers that wouldn't like to include 
>>>>>>> this patch explicitly state so here?
>>>>>>
>>>>>> I think it is about me so I must reply.  I am sorry about 
>>>>>> that and the sole reason is my lack of time.  I'm just a 
>>>>>> volunteer here, that means it's not my regular job to work
>>>>>> on locale data nor anything in glibc nor in any other open 
>>>>>> source project.  I do these things only in my free time
>>>>>> which I don't have much.  Of course you will see my
>>>>>> contributions here and there but they are either trivial or
>>>>>> take me months to complete.  Your patches are on my radar but
>>>>>> I can't tell any ETA for them.  Of course, there are other
>>>>>> people around here and they are all welcome to come and
>>>>>> join.
>>>>>>
>>>>>>> That is:
>>>>>>> - In the case that there is a different preferred cyrillic
>>>>>>> transliteration table for any specific locale their
>>>>>>> maintainers may want to point me to it so I can supply a
>>>>>>> separate table/patch.
>>>>>>> - Or they could state explicitly that for some reason they
>>>>>>> would like to exclude their locale from the patch for a
>>>>>>> default cyrillic transliteration altogether.
>>>>>>
>>>>>> As Keld wrote, there are probably separate rules for every 
>>>>>> language so I don't think you should treat your rules as 
>>>>>> universal and include them in every locale.  At first sight, 
>>>>>> it seems to me they work only for English (as a destination 
>>>>>> locale).  Also, although it is called "transliteration from 
>>>>>> Cyrillic" it seems that it covers only Russian alphabet. What
>>>>>> about other languages which use Cyrillic alphabet but add
>>>>>> their own diacritic characters?  Think about Belarusian, 
>>>>>> Ukrainian, Serbian, Chechen, Chuvash, Mari, Ossetian, Yakut, 
>>>>>> Tatar, and more.  What about languages which use Cyrillic 
>>>>>> alphabet but transliterate their respective letters in a 
>>>>>> different way than Russian?  For example, Russian "Ъ" is (I 
>>>>>> think) usually skipped in transliteration, I think you 
>>>>>> propose "``", but when transliterating from Bulgarian they 
>>>>>> usually transliterate this as "ă".
>>>>>>
>>>>>> Few remarks:
>>>>>>
>>>>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be
>>>>>>   better?
>>>>>> * You transliterate "ц" as "cz", wouldn't "ts" be better?  By
>>>>>>   the way, in Polish language "cz" is a correct
>>>>>>   transliteration of "ч".
>>>>>> * You transliterate "й" as "j", this is fine in many
>>>>>>   languages but wouldn't "y" be better in English?
>>>>>> * In case of "е": how will you know if it is correct to
>>>>>>   transliterate it to "e" or "ie" or "je" or "ye"?
>>>>>>
>>>>>> These remarks are obviously incomplete, your patch deserves 
>>>>>> much more attention to review.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Rafal
>>>>>>


-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 22:04               ` Rafal Luzynski
  2018-10-08 22:52                 ` Egor Kobylkin
@ 2018-10-08 23:20                 ` Zack Weinberg
  2018-10-09 15:26                   ` Carlos O'Donell
  2018-10-09 16:10                 ` Marko Myllynen
  2 siblings, 1 reply; 111+ messages in thread
From: Zack Weinberg @ 2018-10-08 23:20 UTC (permalink / raw)
  To: Rafal Luzynski, GNU C Library

On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski
<digitalfreak@lingonborough.com> wrote:
> The problem is that we don't have a separate maintainer for each locale,
> we have only 2 maintainers for about 200 locales and we must represent
> them all.  Sometimes a locale may happen to be our own native locale or
> of someone in this list, or it may be a locale which we accidentally can
> speak as a foreign language, or we may have friends who can speak it.
> Or it may be totally unknown and we still must somehow handle it.

I just want to mention that this is also why most of the non-locale
maintainers tend to stay out of threads about locales.  We know we're
even less expert on these issues than you are, and I think as a
general rule you should be assuming that the community is OK with what
you're doing unless someone speaks up to object.

zw

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 22:23                         ` Rafal Luzynski
@ 2018-10-08 23:35                           ` Egor Kobylkin
  2018-10-09 13:18                             ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-08 23:35 UTC (permalink / raw)
  To: Rafal Luzynski, Marko Myllynen, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

On 09.10.2018 00:23, Rafal Luzynski wrote:
> 8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
>> Hi,
>>
>> Thanks for the update. I have a few mostly cosmetic comments below,
>> hopefully we'll hear from others whether they agree with this direction.
>>

Yeah, the earlier we get feedback, the more productive we are. I'd be
happy to get as much feedback on this as early as possible, so
everybody concerned, please chime in.

> 
>> - No duplicates:
>>
>> % CYRILLIC SMALL LETTER IE
>> <U0435> <U0065>; <U0065>
>>
>> should become:
>>
>> % CYRILLIC SMALL LETTER IE
>> <U0435> <U0065>
>>
>> - There are few issues with the definitions:
>>
>> % CYRILLIC CAPITAL LETTER U
>> <U0423> <U0055>; <U0055>
>> % CYRILLIC UNDEFINED
>> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>>
>> % CYRILLIC SMALL LETTER U
>> <U0443> <U0075>; <U0075>
>> % CYRILLIC UNDEFINED
>> <U0443><U0443> <U00FA>; "<U0075><U0060>"
> 
> Are the duplicates here because some Cyrillic letters may have multiple
> Latin transliterations depending on the context, for example Cyrillic IE
> must be transliterated sometimes as "e", sometimes as "ie", sometimes
> as "ye" or "je"?  Can we provide rules for groups of characters instead?

No, the duplicates are just an artifact of my line-generating logic. I
have removed them. As far as I understand, the varying transcription
between languages/locales cannot be handled in one file at all.

> 
>> I wonder whether it would be possible to automate generation of this
>> file so that issues like the above could be avoided? But perhaps that
>> could be the next step once this initial patch lands.

I am generating the content part of translit_cyrillic from a
LibreOffice spreadsheet. Not sure if you have had time to view it yet:
https://sourceware.org/bugzilla/attachment.cgi?id=11299

Anyway, I have just fixed the issues identified by Marko above in that
spreadsheet. I will make the changes requested below and then upload
the new translit_cyrillic file to the bugzilla.
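
For illustration only, here is a minimal sketch of how rule lines could be
generated mechanically so that duplicate alternatives and stray spaces after
";" cannot occur. It is not the spreadsheet logic from the attachment;
format_rule, ucode and the sample table are hypothetical names, and only
code points up to U+FFFF are assumed.

```python
import unicodedata


def ucode(text):
    """Render a string in the <Uxxxx> notation used by locale sources."""
    symbols = "".join("<U%04X>" % ord(ch) for ch in text)
    # A single symbol is written bare, anything else is quoted.
    return symbols if len(text) == 1 else '"%s"' % symbols


def format_rule(source, alternatives):
    """Return the comment line and the rule line for one source string."""
    unique = []
    for alt in alternatives:
        if alt not in unique:  # drop duplicates, keep the original order
            unique.append(alt)
    if len(source) == 1:
        name = unicodedata.name(source)
    else:
        name = "CYRILLIC UNDEFINED"  # placeholder used for sequences
    rule = "%s %s" % (ucode(source), ";".join(ucode(a) for a in unique))
    return "%% %s\n%s" % (name, rule)


if __name__ == "__main__":
    # Hypothetical excerpt: (source, [first option, second option]).
    table = [("\u0435", ["e", "e"]), ("\u0436", ["\u017e", "zh"])]
    for source, alternatives in table:
        print(format_rule(source, alternatives))
```

With this approach the duplicated "e" for CYRILLIC SMALL LETTER IE collapses
into a single alternative, which is the form Marko asked for.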


>> - Please add the standard glibc locale header (see the existing
>> translit_* files for reference)
>> - Consider wrapping the header lines at or around column 70-72
>> - Consider describing which characters, character ranges, or blocks are
>> supported (perhaps also describe why some of those are not included, see
>> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
>> - Please remove trailing whitespaces and spaces after ;
>
> Thanks for this, Marko.  While at this, in the ChangeLog and in the commit
> message these paths:
>
> 	* locales/aa_DJ: likewise
>
> 1. Should be a relative path starting in the root directory of the
>    glibc source, that is: "* localedata/locales/aa_DJ".
> 2. Should be "Likewise." (starting with an uppercase letter and
>    ending with a dot).

will do.

Bests,
Egor

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 23:35                           ` Egor Kobylkin
@ 2018-10-09 13:18                             ` Egor Kobylkin
  2018-10-09 18:34                               ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 13:18 UTC (permalink / raw)
  To: Rafal Luzynski, Marko Myllynen
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 3786 bytes --]

Hi,

I have now implemented all the changes requested for the
translit_cyrillic file but started hitting what seems like a bug:

- If the line <U0425> <U0048>;<U0058> is present in translit_cyrillic,
the locale compilation fails, i.e. the command
grep CYRILLIC < $testfile |
LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
iconv -f UTF-8 -t ASCII//TRANSLIT
hangs.

- If the line <U0425> <U0048>;<U0058> is absent from translit_cyrillic
everything works, just the transliteration of <U0425> fails as expected
(? is displayed)

- If translit_cyrillic contains <U0425> <U0048>;<U0058> as the _only_
line the transliteration of <U0425> works again (others as ?).

Would you have any idea in which direction I should look? The new
translit_cyrillic is attached.

(<U0425> is % CYRILLIC CAPITAL LETTER HA)
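
To make a hang like this reproducible in a script rather than by watching a
frozen terminal, the command can be run under a timeout. The following is
only a sketch: run_with_timeout is a hypothetical helper, and the child
process in the demo is a stand-in, not the real iconv call.

```python
import subprocess
import sys


def run_with_timeout(command, data, seconds, env=None):
    """Run command with data on stdin; return stdout, or None on a hang."""
    try:
        result = subprocess.run(command, input=data,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                timeout=seconds, env=env)
    except subprocess.TimeoutExpired:
        return None  # subprocess.run kills the child before raising
    return result.stdout


if __name__ == "__main__":
    # Stand-in for the hanging command: a child that sleeps for a minute.
    hanging = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_timeout(hanging, b"", 1))
```

For the case described above, the command would be
["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"], with LOCPATH and LC_ALL
passed in through env and the test text passed as data.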

Best regards,
Egor



[-- Attachment #2: translit_cyrillic --]
[-- Type: text/plain, Size: 12688 bytes --]

escape_char /
comment_char %

% This file is part of the GNU C Library and contains locale data.
% The Free Software Foundation does not claim any copyright interest
% in the locale data contained in this file.  The foregoing does not
% affect the license of the GNU C Library as a whole.  It does not
% exempt you from the conditions of the license if your use would
% otherwise be governed by that license.

% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
% Inspired by ISO 9:1995 / GOST 7.79-2000.
% Covers Unicode range https://www.unicode.org/charts/PDF/U0400.pdf
% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9:1995.
% It implements the GOST 7.79 System A (Latin script) as a first
% option and System B (ASCII) as a second option. Check
% https://en.wikipedia.org/wiki/ISO_9 for reference.
% The System B is extended from GOST 7.79 (Russian) using open sources
% of the transliteration mappings and the "h/`" diacritics logic.

% Usage examples:
% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
%   | iconv -f ISO-8859-15 -t UTF-8 # System A
% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.

% Contributions welcome for the rest of Cyrillic script in Unicode
% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
% Generated from UnicodeData.txt with 
% https://sourceware.org/bugzilla/attachment.cgi?id=11300.

LC_CTYPE

translit_start

% CYRILLIC CAPITAL LETTER IO
<U0401> <U00CB>;"<U0059><U004F>"
% CYRILLIC CAPITAL LETTER DJE
<U0402> <U0110>;"<U0044><U004A>"
% CYRILLIC CAPITAL LETTER GJE
<U0403> <U01F4>;"<U0047><U0060>"
% CYRILLIC CAPITAL LETTER UKRAINIAN IE
<U0404> <U00CA>;"<U0059><U0065>"
% CYRILLIC CAPITAL LETTER DZE
<U0405> <U1E90>;"<U005A><U0060>"
% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
<U0406> <U00CC>;<U0049>
% CYRILLIC CAPITAL LETTER YI
<U0407> <U00CF>;"<U0059><U0069>"
% CYRILLIC CAPITAL LETTER JE
<U0408> "<U004A><U030C>";<U004A>
% CYRILLIC CAPITAL LETTER LJE
<U0409> "<U004C><U0302>";"<U004C><U0060>"
% CYRILLIC CAPITAL LETTER NJE
<U040A> "<U004E><U0302>";"<U004E><U0060>"
% CYRILLIC CAPITAL LETTER TSHE
<U040B> <U0106>;"<U0054><U0053><U0048>"
% CYRILLIC CAPITAL LETTER KJE
<U040C> <U1E30>;"<U004B><U0060>"
% CYRILLIC CAPITAL LETTER SHORT U
<U040E> <U016C>;"<U0055><U0060>"
% CYRILLIC CAPITAL LETTER DZHE
<U040F> "<U0044><U0302>";"<U0044><U0068>"
% CYRILLIC CAPITAL LETTER A
<U0410> <U0041>
% CYRILLIC CAPITAL LETTER BE
<U0411> <U0042>
% CYRILLIC CAPITAL LETTER VE
<U0412> <U0056>
% CYRILLIC CAPITAL LETTER GHE
<U0413> <U0047>
% CYRILLIC CAPITAL LETTER DE
<U0414> <U0044>
% CYRILLIC CAPITAL LETTER IE
<U0415> <U0045>
% CYRILLIC CAPITAL LETTER ZHE
<U0416> <U017D>;"<U005A><U0048>"
% CYRILLIC CAPITAL LETTER ZE
<U0417> <U005A>
% CYRILLIC CAPITAL LETTER I
<U0418> <U0049>
% CYRILLIC CAPITAL LETTER SHORT I
<U0419> <U004A>
% CYRILLIC CAPITAL LETTER KA
<U041A> <U004B>
% CYRILLIC CAPITAL LETTER EL
<U041B> <U004C>
% CYRILLIC CAPITAL LETTER EM
<U041C> <U004D>
% CYRILLIC CAPITAL LETTER EN
<U041D> <U004E>
% CYRILLIC CAPITAL LETTER O
<U041E> <U004F>
% CYRILLIC CAPITAL LETTER PE
<U041F> <U0050>
% CYRILLIC CAPITAL LETTER ER
<U0420> <U0052>
% CYRILLIC CAPITAL LETTER ES
<U0421> <U0053>
% CYRILLIC CAPITAL LETTER TE
<U0422> <U0054>
% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>
% CYRILLIC UNDEFINED
"<U0423><U0301>" <U00DA>;"<U0055><U0060>"
% CYRILLIC CAPITAL LETTER EF
<U0424> <U0046>
% CYRILLIC CAPITAL LETTER HA
<U0425> <U0048>;<U0058>
% CYRILLIC CAPITAL LETTER TSE
<U0426> <U0043>;"<U0043><U005A>"
% CYRILLIC CAPITAL LETTER CHE
<U0427> <U010C>;"<U0043><U0048>"
% CYRILLIC CAPITAL LETTER SHA
<U0428> <U0160>;"<U0053><U0048>"
% CYRILLIC CAPITAL LETTER SHCHA
<U0429> <U015C>;"<U0053><U0048><U0048>"
% CYRILLIC CAPITAL LETTER HARD SIGN
<U042A> <U02BA>;"<U0060><U0060>"
% CYRILLIC CAPITAL LETTER YERU
<U042B> <U0059>;"<U0059><U0060>"
% CYRILLIC CAPITAL LETTER SOFT SIGN
<U042C> <U02B9>;<U0060>
% CYRILLIC CAPITAL LETTER E
<U042D> <U00C8>;"<U0045><U0060>"
% CYRILLIC CAPITAL LETTER YU
<U042E> <U00DB>;"<U0059><U0055>"
% CYRILLIC CAPITAL LETTER YA
<U042F> <U00C2>;"<U0059><U0041>"
% CYRILLIC SMALL LETTER A
<U0430> <U0061>
% CYRILLIC SMALL LETTER BE
<U0431> <U0062>
% CYRILLIC SMALL LETTER VE
<U0432> <U0076>
% CYRILLIC SMALL LETTER GHE
<U0433> <U0067>
% CYRILLIC SMALL LETTER DE
<U0434> <U0064>
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
% CYRILLIC SMALL LETTER ZHE
<U0436> <U017E>;"<U007A><U0068>"
% CYRILLIC SMALL LETTER ZE
<U0437> <U007A>
% CYRILLIC SMALL LETTER I
<U0438> <U0069>
% CYRILLIC SMALL LETTER SHORT I
<U0439> <U006A>
% CYRILLIC SMALL LETTER KA
<U043A> <U006B>
% CYRILLIC SMALL LETTER EL
<U043B> <U006C>
% CYRILLIC SMALL LETTER EM
<U043C> <U006D>
% CYRILLIC SMALL LETTER EN
<U043D> <U006E>
% CYRILLIC SMALL LETTER O
<U043E> <U006F>
% CYRILLIC SMALL LETTER PE
<U043F> <U0070>
% CYRILLIC SMALL LETTER ER
<U0440> <U0072>
% CYRILLIC SMALL LETTER ES
<U0441> <U0073>
% CYRILLIC SMALL LETTER TE
<U0442> <U0074>
% CYRILLIC SMALL LETTER U
<U0443> <U0075>
% CYRILLIC UNDEFINED
"<U0443><U0301>" <U00FA>;"<U0075><U0060>"
% CYRILLIC SMALL LETTER EF
<U0444> <U0066>
% CYRILLIC SMALL LETTER HA
<U0445> <U0068>;<U0078>
% CYRILLIC SMALL LETTER TSE
<U0446> <U0063>;"<U0063><U007A>"
% CYRILLIC SMALL LETTER CHE
<U0447> <U010D>;"<U0063><U0068>"
% CYRILLIC SMALL LETTER SHA
<U0448> <U0161>;"<U0073><U0068>"
% CYRILLIC SMALL LETTER SHCHA
<U0449> <U015D>;"<U0073><U0068><U0068>"
% CYRILLIC SMALL LETTER HARD SIGN
<U044A> <U02BA>;"<U0060><U0060>"
% CYRILLIC SMALL LETTER YERU
<U044B> <U0079>;"<U0079><U0060>"
% CYRILLIC SMALL LETTER SOFT SIGN
<U044C> <U02B9>;<U0060>
% CYRILLIC SMALL LETTER E
<U044D> <U00E8>;"<U0065><U0060>"
% CYRILLIC SMALL LETTER YU
<U044E> <U00FB>;"<U0079><U0075>"
% CYRILLIC SMALL LETTER YA
<U044F> <U00E2>;"<U0079><U0061>"
% CYRILLIC SMALL LETTER IO
<U0451> <U00EB>;"<U0079><U006F>"
% CYRILLIC SMALL LETTER DJE
<U0452> <U0111>;"<U0064><U006A>"
% CYRILLIC SMALL LETTER GJE
<U0453> <U01F5>;"<U0067><U0060>"
% CYRILLIC SMALL LETTER UKRAINIAN IE
<U0454> <U00EA>;"<U0079><U0065>"
% CYRILLIC SMALL LETTER DZE
<U0455> <U1E91>;"<U007A><U0060>"
% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
<U0456> <U00EC>;<U0069>
% CYRILLIC SMALL LETTER YI
<U0457> <U00EF>;"<U0079><U0069>"
% CYRILLIC SMALL LETTER JE
<U0458> <U01F0>;<U006A>
% CYRILLIC SMALL LETTER LJE
<U0459> "<U006C><U0302>";"<U006C><U0060>"
% CYRILLIC SMALL LETTER NJE
<U045A> "<U006E><U0302>";"<U006E><U0060>"
% CYRILLIC SMALL LETTER TSHE
<U045B> <U0107>;"<U0074><U0073><U0068>"
% CYRILLIC SMALL LETTER KJE
<U045C> <U1E31>;"<U006B><U0060>"
% CYRILLIC SMALL LETTER SHORT U
<U045E> <U016D>;"<U0075><U0060>"
% CYRILLIC SMALL LETTER DZHE
<U045F> "<U0064><U0302>";"<U0064><U0068>"
% CYRILLIC CAPITAL LETTER BIG YUS
<U046A> <U01CD>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER BIG YUS
<U046B> <U01CE>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER FITA
<U0472> "<U0046><U0300>";"<U0046><U0068>"
% CYRILLIC SMALL LETTER FITA
<U0473> "<U0066><U0300>";"<U0066><U0068>"
% CYRILLIC CAPITAL LETTER IZHITSA
<U0474> <U1EF2>;"<U0059><U0068>"
% CYRILLIC SMALL LETTER IZHITSA
<U0475> <U1EF3>;"<U0079><U0068>"
% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
<U048C> <U011A>;"<U0045><U0060>"
% CYRILLIC SMALL LETTER SEMISOFT SIGN
<U048D> <U011B>;"<U0065><U0060>"
% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
<U0490> "<U0047><U0300>";"<U0047><U0060>"
% CYRILLIC SMALL LETTER GHE WITH UPTURN
<U0491> "<U0067><U0300>";"<U0067><U0060>"
% CYRILLIC CAPITAL LETTER GHE WITH STROKE
<U0492> <U0120>;"<U0047><U0048>"
% CYRILLIC SMALL LETTER GHE WITH STROKE
<U0493> <U0121>;"<U0067><U0068>"
% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
<U0494> <U011E>;"<U0047><U0048>"
% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
<U0495> <U011F>;"<U0067><U0068>"
% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
<U049A> <U0136>;"<U004B><U0060>"
% CYRILLIC SMALL LETTER KA WITH DESCENDER
<U049B> <U0137>;"<U006B><U0060>"
% CYRILLIC CAPITAL LETTER KA WITH STROKE
<U049E> "<U004B><U0304>";"<U004B><U0060>"
% CYRILLIC SMALL LETTER KA WITH STROKE
<U049F> "<U006B><U0304>";"<U006B><U0060>"
% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
<U04A2> <U1E46>;"<U004E><U0060>"
% CYRILLIC SMALL LETTER EN WITH DESCENDER
<U04A3> <U1E47>;"<U006E><U0060>"
% CYRILLIC CAPITAL LIGATURE EN GHE
<U04A4> <U1E44>;"<U004E><U0047>"
% CYRILLIC SMALL LIGATURE EN GHE
<U04A5> <U1E45>;"<U006E><U0067>"
% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
<U04A6> <U1E54>;"<U0050><U0060>"
% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
<U04A7> <U1E55>;"<U0070><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN HA
<U04A8> <U00D2>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN HA
<U04A9> <U00F2>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
<U04AA> <U00C7>;"<U0043><U0060>"
% CYRILLIC SMALL LETTER ES WITH DESCENDER
<U04AB> <U00E7>;"<U0063><U0060>"
% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
<U04AC> <U0162>;"<U0054><U0060>"
% CYRILLIC SMALL LETTER TE WITH DESCENDER
<U04AD> <U0163>;"<U0074><U0060>"
% CYRILLIC CAPITAL LETTER STRAIGHT U
<U04AE> <U00D9>;<U0055>
% CYRILLIC SMALL LETTER STRAIGHT U
<U04AF> <U00F9>;<U0075>
% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
<U04B2> <U1E28>;"<U0048><U0060>"
% CYRILLIC SMALL LETTER HA WITH DESCENDER
<U04B3> <U1E29>;"<U0068><U0060>"
% CYRILLIC CAPITAL LIGATURE TE TSE
<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
% CYRILLIC SMALL LIGATURE TE TSE
<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
% CYRILLIC CAPITAL LETTER SHHA
<U04BA> <U1E24>;"<U0053><U0048><U0060>"
% CYRILLIC SMALL LETTER SHHA
<U04BB> <U1E25>;"<U0073><U0068><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN CHE
<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
% CYRILLIC LETTER PALOCHKA
<U04C0> <U2021>;<U0069>
% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
% CYRILLIC SMALL LETTER ZHE WITH BREVE
<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
<U04CB> <U00C7>;"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER KHAKASSIAN CHE
<U04CC> <U00E7>;"<U0063><U0068><U0060>"
% CYRILLIC CAPITAL LETTER A WITH BREVE
<U04D0> <U0102>;"<U0041><U0060>"
% CYRILLIC SMALL LETTER A WITH BREVE
<U04D1> <U0103>;"<U0061><U0060>"
% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
<U04D2> <U00C4>;"<U0041><U0060>"
% CYRILLIC SMALL LETTER A WITH DIAERESIS
<U04D3> <U00E4>;"<U0061><U0060>"
% CYRILLIC CAPITAL LETTER IE WITH BREVE
<U04D6> <U0114>;"<U0045><U0060>"
% CYRILLIC SMALL LETTER IE WITH BREVE
<U04D7> <U0115>;"<U0065><U0060>"
% CYRILLIC CAPITAL LETTER SCHWA
<U04D8> "<U0041><U030B>";"<U0041><U0060>"
% CYRILLIC SMALL LETTER SCHWA
<U04D9> "<U0061><U030B>";"<U0061><U0060>"
% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
<U04DE> "<U005A><U0308>";"<U005A><U0060>"
% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
<U04DF> "<U007A><U0308>";"<U007A><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
<U04E0> <U0179>;"<U005A><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN DZE
<U04E1> <U017A>;"<U007A><U0060>"
% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
<U04E4> <U00CE>;"<U0049><U0060>"
% CYRILLIC SMALL LETTER I WITH DIAERESIS
<U04E5> <U00EE>;"<U0069><U0060>"
% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
<U04E6> <U00D6>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER O WITH DIAERESIS
<U04E7> <U00F6>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER BARRED O
<U04E8> <U00D4>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER BARRED O
<U04E9> <U00F4>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
<U04F0> <U00DC>;"<U0055><U0060>"
% CYRILLIC SMALL LETTER U WITH DIAERESIS
<U04F1> <U00FC>;"<U0075><U0060>"
% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
<U04F2> <U0170>;"<U0055><U0060>"
% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
<U04F3> <U0171>;"<U0075><U0060>"
% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
<U04F8> <U0178>;"<U0059><U0060>"
% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
<U04F9> <U00FF>;"<U0079><U0060>"
% RIGHT SINGLE QUOTATION MARK
<U2019> <U2035>;<U0027>

translit_end

END LC_CTYPE
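
To read a table like the one above without decoding the <Uxxxx> symbols by
hand, a small parser can help. This is a rough sketch, not the glibc
implementation: parse_rules, decode and to_ascii are hypothetical names, and
the selection rule (take the first alternative that fits the target
character set) is my assumption about how //TRANSLIT picks a candidate.
Multi-character sources such as "<U0423><U0301>" are parsed but are not
matched by the character-by-character to_ascii.

```python
import re

RULE = re.compile(r"^(\S+)\s+(.+)$")
SYMBOL = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0436>' or '"<U007A><U0068>"' into a plain string."""
    return "".join(chr(int(h, 16)) for h in SYMBOL.findall(token))


def parse_rules(text):
    """Map each source string to its ordered list of alternatives."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(("<U", '"<U')):
            continue  # skip comments, keywords and blank lines
        match = RULE.match(line)
        if match:
            source, targets = match.groups()
            rules[decode(source)] = [decode(t) for t in targets.split(";")]
    return rules


def to_ascii(text, rules):
    """Pick, per character, the first alternative that is pure ASCII."""
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)  # already representable in the target
            continue
        out.append(next((a for a in rules.get(ch, []) if a.isascii()), "?"))
    return "".join(out)


if __name__ == "__main__":
    sample = """
% CYRILLIC SMALL LETTER ZHE
<U0436> <U017E>;"<U007A><U0068>"
% CYRILLIC SMALL LETTER U
<U0443> <U0075>
% CYRILLIC SMALL LETTER KA
<U043A> <U006B>
"""
    print(to_ascii("\u0436\u0443\u043a", parse_rules(sample)))  # zhuk
```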

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 23:20                 ` Zack Weinberg
@ 2018-10-09 15:26                   ` Carlos O'Donell
  2018-10-09 21:51                     ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2018-10-09 15:26 UTC (permalink / raw)
  To: Zack Weinberg, Rafal Luzynski, GNU C Library

On 10/8/18 7:20 PM, Zack Weinberg wrote:
> On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski
> <digitalfreak@lingonborough.com> wrote:
>> The problem is that we don't have a separate maintainer for each locale,
>> we have only 2 maintainers for about 200 locales and we must represent
>> them all.  Sometimes a locale may happen to be our own native locale or
>> of someone in this list, or it may be a locale which we accidentally can
>> speak as a foreign language, or we may have friends who can speak it.
>> Or it may be totally unknown and we still must somehow handle it.
> 
> I just want to mention that this is also why most of the non-locale
> maintainers tend to stay out of threads about locales.  We know we're
> even less expert on these issues than you are, and I think as a
> general rule you should be assuming that the community is OK with what
> you're doing unless someone speaks up to object.

I agree with Zack here.

Rafal and Mike are localedata subsystem maintainers, and your best efforts
are the best we have right now in the community.

I also agree that a conservative position is always a good place to start,
but it sounds like Egor has added enough coverage to perhaps make all of
these transliterations opt-in by default.

I don't have a good sense of this though, and so I defer to you as the
subsystem maintainer to review and formulate a position. If you have any
specific questions, I can certainly help review.

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 22:04               ` Rafal Luzynski
  2018-10-08 22:52                 ` Egor Kobylkin
  2018-10-08 23:20                 ` Zack Weinberg
@ 2018-10-09 16:10                 ` Marko Myllynen
  2018-10-09 16:22                   ` Egor Kobylkin
                                     ` (2 more replies)
  2 siblings, 3 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-09 16:10 UTC (permalink / raw)
  To: Rafal Luzynski, Egor Kobylkin, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi,

On 2018-10-09 01:04, Rafal Luzynski wrote:
> 
> Particularly, I think that those rules will not be helpful at all for
> the languages which use neither Latin nor Cyrillic alphabet.

This is certainly a very good point.

> If you refer to other languages than Russian which also use the Cyrillic
> alphabet but need different transliteration rules than Russian for
> the same characters then it is OK for me now.  I am afraid that the iconv
> algorithm does not handle such case.  Of course, we should add this missing
> feature eventually but I do not volunteer to do it now.

Yes, this would be needed for correct transliteration of different
languages, and this might be quite a bit of work. There's also the case
of transliteration and character sets, consider the transliteration
examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:

Russian:        Борис Николаевич Ельцин
Int'l:          Boris Nikolaevič Elʹcin
Finnish:        Boris Nikolajevitš Jeltsin
French:         Boris Nikolaïevitch Ieltsine
Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]

For French you'll get the correct transliteration with iconv by using -t
ISO-8859-1//TRANSLIT, for Finnish with -t ISO-8859-15//TRANSLIT but it's
not so obvious how to get the above kind of transliteration for ISO 9
international or especially for the phonetic case.

One thing that might be helpful here could be something like:

$ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
ž

That is, force transliteration of each character (if defined) even if
it's part of the target character set. AFAICS this is not currently
possible.
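
A small sketch of the idea, to make the difference concrete. This models
the behaviour in plain Python and is not how iconv is implemented;
translit, encodable and the force flag are hypothetical, and
TRANSLIT_FORCE itself is only a proposal in this message. The rule for "ж"
is taken from the translit_cyrillic attachment in this thread.

```python
def encodable(text, charset):
    """Return True if text can be represented in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, rules, charset, force=False):
    """Transliterate text for charset; force applies rules unconditionally."""
    out = []
    for ch in text:
        keep = encodable(ch, charset)
        if ch in rules and (force or not keep):
            # First alternative that fits the target character set wins.
            fallback = ch if keep else "?"
            out.append(next((a for a in rules[ch] if encodable(a, charset)),
                            fallback))
        else:
            out.append(ch if keep else "?")
    return "".join(out)


if __name__ == "__main__":
    rules = {"\u0436": ["\u017e", "zh"]}  # CYRILLIC SMALL LETTER ZHE
    for charset, force in [("utf-8", False), ("utf-8", True),
                           ("iso8859-15", False), ("latin-1", False),
                           ("ascii", False)]:
        print(charset, force, translit("\u0436", rules, charset, force))
```

Without force, a UTF-8 target leaves "ж" untouched because it is already
representable; with force the first alternative "ž" is used. ISO-8859-15
contains "ž" while ISO-8859-1 and ASCII do not, so those targets fall back
to "zh".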

> But, while at this, is there anything that stops us from adding transliteration
> rules for additional Cyrillic characters not used in Russian but used in
> other languages?

This would probably make sense.

FWIW, for Finnish the diff for Russian to be applied in the locale on
top of translit_cyrillic (ISO 9) rules would be something like below. I
still need to check whether rules are needed for languages other than
Russian that could be added (I hope to submit a proper patch against
fi_FI shortly after translit_cyrillic has landed):

<U0446> "<U0074><U0073>"
<U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
<U0448> "<U0161>";"<U0073><U0068>"
<U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
<U044A> ""
<U044C> ""
<U044D> "<U0065>"
<U044E> "<U006A><U0075>"
<U044F> "<U006A><U0061>"
<U0451> "<U006A><U006F>"

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 16:10                 ` Marko Myllynen
@ 2018-10-09 16:22                   ` Egor Kobylkin
  2018-10-09 16:49                     ` Marko Myllynen
  2018-10-09 22:08                   ` Rafal Luzynski
  2018-10-11 10:10                   ` Marko Myllynen
  2 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 16:22 UTC (permalink / raw)
  To: Marko Myllynen, Rafal Luzynski, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

In the hope of being helpful: what you describe below from
https://fi.wikipedia.org/wiki/Siirtokirjoitus is called _transcription_,
not transliteration.

Transliteration is what we have done with ISO 9 or GOST 7.79 System A
and it could be the same for all languages indeed.

The transcription can be phonetic or serve other purposes and depends on
the target language or use case. We have used the GOST 7.79 System B.

Egor

On 09.10.2018 18:10, Marko Myllynen wrote:
> Hi,
> 
> On 2018-10-09 01:04, Rafal Luzynski wrote:
>>
>> Particularly, I think that those rules will not be helpful at all for
>> the languages which use neither Latin nor Cyrillic alphabet.
> 
> This is certainly a very good point.
> 
>> If you refer to other languages than Russian which also use the Cyrillic
>> alphabet but need different transliteration rules than Russian for
>> the same characters then it is OK for me now.  I am afraid that the iconv
>> algorithm does not handle such case.  Of course, we should add this missing
>> feature eventually but I do not volunteer to do it now.
> 
> Yes, this would be needed for correct transliteration of different
> languages, and this might be quite a bit of work. There's also the case
> of transliteration and character sets, consider the transliteration
> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
> 
> Russian:        Борис Николаевич Ельцин
> Int'l:          Boris Nikolaevič Elʹcin
> Finnish:        Boris Nikolajevitš Jeltsin
> French:         Boris Nikolaïevitch Ieltsine
> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
> 
> For French you'll get the correct transliteration with iconv by using -t
> ISO-8859-1//TRANSLIT, for Finnish with -t ISO-8859-15//TRANSLIT but it's
> not so obvious how to get the above kind of transliteration for ISO 9
> international or especially for the phonetic case.
> 
> One thing that might be helpful here could be something like:
> 
> $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
> ž
> 
> That is, force transliteration of each character (if defined) even if
> it's part of the target character set. AFAICS this is not currently
> possible.
> 
>> But, while at this, is there anything that stops us from adding transliteration
>> rules for additional Cyrillic characters not used in Russian but used in
>> other languages?
> 
> This would probably make sense.
> 
> FWIW, for Finnish the diff for Russian to be applied in the locale on
> top of translit_cyrillic (ISO 9) rules would be something like below, I
> still need to check whether there are rules needed for other languages
> than Russian that could be added (I hope to submit a proper patch
> against fi_FI shortly after translit_cyrillic has landed):
> 
> <U0446> "<U0074><U0073>"
> <U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
> <U0448> "<U0161>";"<U0073><U0068>"
> <U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
> <U044A> ""
> <U044C> ""
> <U044D> "<U0065>"
> <U044E> "<U006A><U0075>"
> <U044F> "<U006A><U0061>"
> <U0451> "<U006A><U006F>"
> 
> Thanks,
> 


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 16:22                   ` Egor Kobylkin
@ 2018-10-09 16:49                     ` Marko Myllynen
  0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-09 16:49 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi,

To clarify, the page has a section explaining the differences between
transliteration and transcription and how the terminology is not
entirely unambiguous. It also explains that the national standard SFS
4900 overrides ISO 9, thus ISO 9 can't be used as-is in Finnish context.

Thanks,

On 2018-10-09 19:22, Egor Kobylkin wrote:
> In the hope to be helpful: what you describe below from
> https://fi.wikipedia.org/wiki/Siirtokirjoitus is called _transcription_,
> not transliteration.
> 
> Transliteration is what we have done with ISO 9 or GOST 7.79 System A
> and it could be the same for all languages indeed.
> 
> The transcription can be phonetic or serve other purposes and depends on
> the target language or use case. We have used the GOST 7.79 System B.
> 
> Egor
> 
> On 09.10.2018 18:10, Marko Myllynen wrote:
>> Hi,
>>
>> On 2018-10-09 01:04, Rafal Luzynski wrote:
>>>
>>> Particularly, I think that those rules will not be helpful at all for
>>> the languages which use neither Latin nor Cyrillic alphabet.
>>
>> This is certainly a very good point.
>>
>>> If you refer to other languages than Russian which also use the Cyrillic
>>> alphabet but need different transliteration rules than Russian for
>>> the same characters then it is OK for me now.  I am afraid that the iconv
>>> algorithm does not handle such case.  Of course, we should add this missing
>>> feature eventually but I do not volunteer to do it now.
>>
>> Yes, this would be needed for correct transliteration of different
>> languages, and this might be quite a bit of work. There's also the case
>> of transliteration and character sets, consider the transliteration
>> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>>
>> Russian:        Борис Николаевич Ельцин
>> Int'l:          Boris Nikolaevič Elʹcin
>> Finnish:        Boris Nikolajevitš Jeltsin
>> French:         Boris Nikolaïevitch Ieltsine
>> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
>>
>> For French you'll get the correct transliteration with iconv by using -t
>> ISO-8859-1//TRANSLIT, for Finnish with -t ISO-8859-15//TRANSLIT but it's
>> not so obvious how to get the above kind of transliteration for ISO 9
>> international or especially for the phonetic case.
>>
>> One thing that might be helpful here could be something like:
>>
>> $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
>> ž
>>
>> That is, force transliteration of each character (if defined) even if
>> it's part of the target character set. AFAICS this is not currently
>> possible.
>>
>>> But, while at this, is there anything that stops us from adding transliteration
>>> rules for additional Cyrillic characters not used in Russian but used in
>>> other languages?
>>
>> This would probably make sense.
>>
>> FWIW, for Finnish the diff for Russian to be applied in the locale on
>> top of translit_cyrillic (ISO 9) rules would be something like below, I
>> still need to check whether there are rules needed for other languages
>> than Russian that could be added (I hope to submit a proper patch
>> against fi_FI shortly after translit_cyrillic has landed):
>>
>> <U0446> "<U0074><U0073>"
>> <U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
>> <U0448> "<U0161>";"<U0073><U0068>"
>> <U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
>> <U044A> ""
>> <U044C> ""
>> <U044D> "<U0065>"
>> <U044E> "<U006A><U0075>"
>> <U044F> "<U006A><U0061>"
>> <U0451> "<U006A><U006F>"
>>
>> Thanks,
>>
> 


-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 13:18                             ` Egor Kobylkin
@ 2018-10-09 18:34                               ` Egor Kobylkin
  2018-10-09 22:17                                 ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 18:34 UTC (permalink / raw)
  To: Rafal Luzynski, Marko Myllynen
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo


The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
"<U0443><U0301>" (<U00FA>).
It works now with
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"

The <U0301> is "combining" and obviously it doesn't work if enclosed in
quotes with the letter codepoint. Please let me know if there is another
explanation.

I will now make those changes and generate the patch itself.
Egor

On 09.10.2018 15:18, Egor Kobylkin wrote:
> Hi,
> 
> I have now implemented all the changes requested for translit_cyrillic
> file but started hitting what seems like a bug:
> 
> - If the line <U0425> <U0048>;<U0058> is present in translit_cyrillic the
> locale compilation fails i.e. grep CYRILLIC < $testfile |
> LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
> iconv -f UTF-8 -t ASCII//TRANSLIT is hanging frozen.
> 
> - If the line <U0425> <U0048>;<U0058> is absent from translit_cyrillic
> everything works, just the transliteration of <U0425> fails as expected
> (? is displayed)
> 
> - If translit_cyrillic contains <U0425> <U0048>;<U0058> as the _only_
> line the transliteration of <U0425> works again (others as ?).
> 
> Would you have any idea into what direction should I look? The new
> translit_cyrillic is attached.
> 
> (<U0425> is % CYRILLIC CAPITAL LETTER HA)
> 
> Best regards,
> Egor
> 
> On 09.10.2018 01:35, Egor Kobylkin wrote:
>> On 09.10.2018 00:23, Rafal Luzynski wrote:
>>> 8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
>>>> Hi,
>>>>
>>>> Thanks for the update. I have few mostly cosmetic comments below,
>>>> hopefully we'll hear from others whether they agree with this direction.
>>>>
>>
>> Yeah, the earlier we have feedback the more productive we are. I'd be
>> happy to get much feedback on this as early as possible. So please
>> everybody concerned please chime in.
>>
>>>
>>>> - No duplicates:
>>>>
>>>> % CYRILLIC SMALL LETTER IE
>>>> <U0435> <U0065>; <U0065>
>>>>
>>>> should become:
>>>>
>>>> % CYRILLIC SMALL LETTER IE
>>>> <U0435> <U0065>
>>>>
>>>> - There are few issues with the definitions:
>>>>
>>>> % CYRILLIC CAPITAL LETTER U
>>>> <U0423> <U0055>; <U0055>
>>>> % CYRILLIC UNDEFINED
>>>> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>>>>
>>>> % CYRILLIC SMALL LETTER U
>>>> <U0443> <U0075>; <U0075>
>>>> % CYRILLIC UNDEFINED
>>>> <U0443><U0443> <U00FA>; "<U0075><U0060>"
>>>
>>> Are the duplicates here because some Cyrillic letters may have multiple
>>> Latin transliterations depending on the context, for example Cyrillic IE
>>> must be transliterated sometimes as "e", sometimes as "ie", sometimes
>>> as "ye" or "je"?  Can we provide rules for groups of characters instead?
>> No, the duplicates are just by design of my line generating logic. I
>> have fixed (removed) them. The varying transcription between
>> languages/locales can not be handled in one file at all as far as I
>> understood.
>>
>>>
>>>> I wonder would it be possible to automate generation of this file so
>>>> that issues like the above could be avoided? But perhaps that could be the
>>>> next step once this initial patch lands.
>>
>> I am generating the content part of the translit_cyrillic from the
>> LibreOffice Spreadsheet. Not sure if you had time to view it by now?
>> https://sourceware.org/bugzilla/attachment.cgi?id=11299
>>
>> Anyway I have just fixed the issues identified by Marko above in that
>> spreadsheet. I will do the changes for the below request and then upload
>> the new translit_cyrillic file to the bugzilla.
>>
>>
>>>> - Please add the standard glibc locale header (see the existing
>>>> translit_* files for reference)
>>>> - Consider wrapping the header lines at or around column 70-72
>>>> - Consider describing which characters, character ranges, or blocks are
>>>> supported (perhaps also describe why some of those are not included, see
>>>> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
>>>> - Please remove trailing whitespaces and spaces after ;
>>>
>>> Thanks for this, Marko.  While at this, in the ChangeLog and in the commit
>>> message these paths:
>>>
>>> 	* locales/aa_DJ: likewise
>>>
>>> 1. Should be a relative path starting in the root directory of glibc
>> source,
>>>    that is: "* localedata/locales/aa_DJ".
>>> 2. Should be "Likewise." (starting with an uppercase and ending with a
>> dot).
>>
>> will do.
>>
>> Bests,
>> Egor
>>
> 


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-08 22:52                 ` Egor Kobylkin
@ 2018-10-09 21:43                   ` Rafal Luzynski
  0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 21:43 UTC (permalink / raw)
  To: Egor Kobylkin, Marko Myllynen
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

9.10.2018 00:52 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Just to make sure we are not talking at cross purposes. Since your last
> email on this topic on the suggestion from Marko I have already
> implemented ISO 9 transliteration for all characters there are. This
> should cover most if not all Slavic Cyrillic. You seem to have just
> noticed and replied to this email of Marko as I write mine.

That's great.  I'm sorry about not noticing this before, as you can see
this only confirms that I'm unable to give proper attention to your bug.


9.10.2018 01:35 Egor Kobylkin <egor@kobylkin.com> wrote:
> On 09.10.2018 00:23, Rafal Luzynski wrote:
> > Are the duplicates here because some Cyrillic letters may have multiple
> > Latin transliterations depending on the context, for example Cyrillic IE
> > must be transliterated sometimes as "e", sometimes as "ie", sometimes
> > as "ye" or "je"? Can we provide rules for groups of characters instead?
>
> No, the duplicates are just by design of my line generating logic. I
> have fixed (removed) them. The varying transcription between
> languages/locales can not be handled in one file at all as far as I
> understood.

No, I did not mean here different languages but that some letters may need
to be transliterated in a different way depending on the context.  For
example, a letter "е" might be transliterated as "e" or "ie" or "je"
depending on whether it appears after "ж" or after another consonant
or after a vowel or a soft or hard sign etc.  All within Russian language.
(Sorry if I'm messing that, maybe what I wrote is wrong but may be correct
for another combination of letters.)

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 15:26                   ` Carlos O'Donell
@ 2018-10-09 21:51                     ` Rafal Luzynski
  0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 21:51 UTC (permalink / raw)
  To: Carlos O'Donell, GNU C Library

9.10.2018 17:26 Carlos O'Donell <carlos@redhat.com> wrote:
> [...]
> but it sounds like Egor has added enough coverage to perhaps make all of
> these transliterations opt-in by default.

I think that it is correct if this transliteration is meant to be "Russian
language as if it used a Latin alphabet (even if it does not actually
except in some computer systems which do not support Cyrillic)"
but not if it is meant to be "Russian language to make sure it is comfortable
for reading by English speakers (assuming that everyone else should be fine
with English if their native language is not supported)".

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 16:10                 ` Marko Myllynen
  2018-10-09 16:22                   ` Egor Kobylkin
@ 2018-10-09 22:08                   ` Rafal Luzynski
  2018-10-10 11:21                     ` Marko Myllynen
  2018-10-11 10:10                   ` Marko Myllynen
  2 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 22:08 UTC (permalink / raw)
  To: Marko Myllynen, Egor Kobylkin, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

9.10.2018 18:10 Marko Myllynen <myllynen@redhat.com> wrote:
> On 2018-10-09 01:04, Rafal Luzynski wrote:
> > If you refer to other languages than Russian which also use the Cyrillic
> > alphabet but need different transliteration rules than Russian for
> > the same characters then it is OK for me now. I am afraid that the iconv
> > algorithm does not handle such case. Of course, we should add this missing
> > feature eventually but I do not volunteer to do it now.
>
> Yes, this would be needed for correct transliteration of different
> languages, and this might be quite a bit of work. There's also the case
> of transliteration and character sets, consider the transliteration
> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>
> Russian: Борис Николаевич Ельцин
> Int'l: Boris Nikolaevič Elʹcin
> Finnish: Boris Nikolajevitš Jeltsin
> French: Boris Nikolaïevitch Ieltsine
> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]

No, I did not mean the transcription using the rules of the destination
locale using Latin but that the rules of transliteration may be different
depending on the language of the source text.  For example, consider
this Cyrillic string: "нъг" (I'm not saying that it is actually used
in any existing word but still must be handled).  By our transliteration
rules it will be transliterated as "n``g".  But this is fine for Russian;
if we knew that the source string is Ukrainian it would be transliterated
as "n``h"; if it was Bulgarian it would be transliterated as "năg".
Similarly, if you had to transliterate the Latin letters "sch" to Cyrillic
first you would have to ask what the source language was.

Unfortunately, I think that distinction of the source language is impossible
at the moment so let's assume that we fall back to Russian if there is
any ambiguity.
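
As a rough illustration of the point above (an editorial sketch, not glibc code), the Python snippet below keys the rules on the source language and falls back to the Russian table when the language is unknown. The tables contain only the three letters from the example, and the names RUSSIAN, OVERRIDES and transliterate are hypothetical.

```python
# Hypothetical per-language tables, limited to the letters in the example.
RUSSIAN = {"н": "n", "ъ": "``", "г": "g"}

# Only the letters that differ from the Russian rules are listed.
OVERRIDES = {
    "uk": {"г": "h"},   # Ukrainian
    "bg": {"ъ": "ă"},   # Bulgarian
}


def transliterate(text, source_lang=None):
    """Transliterate text, falling back to Russian rules if the
    source language is unknown or has no override table."""
    table = {**RUSSIAN, **OVERRIDES.get(source_lang, {})}
    # Characters without a rule become "?" in this sketch.
    return "".join(table.get(ch, "?") for ch in text)


if __name__ == "__main__":
    for lang in (None, "uk", "bg"):
        print(lang, transliterate("нъг", lang))
```

The fallback is the whole design here: without a language hint every string is treated as Russian.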

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 18:34                               ` Egor Kobylkin
@ 2018-10-09 22:17                                 ` Rafal Luzynski
  2018-10-09 22:40                                   ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 22:17 UTC (permalink / raw)
  To: Egor Kobylkin, Marko Myllynen
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
> "<U0443><U0301>" (<U00FA>).
> It works now with
> % CYRILLIC UNDEFINED
> <U0423><U0301> <U00DA>;"<U0055><U0060>"
> % CYRILLIC UNDEFINED
> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>
> [...]

I wonder why you need Cyrillic U with acute, and why you comment it
as "undefined" at all.  I know that any Cyrillic vowel may appear with
an acute accent but "the diacritic is used only in dictionaries, children's
books, resources for foreign-language learners (...)". [1]  So maybe
all vowels with an acute accent should be handled (which I think is fine)
rather than just U.

Regards,

Rafal


[1] https://en.wikipedia.org/wiki/Russian_alphabet#Diacritics

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 22:17                                 ` Rafal Luzynski
@ 2018-10-09 22:40                                   ` Egor Kobylkin
  2018-10-09 22:42                                     ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 22:40 UTC (permalink / raw)
  To: Rafal Luzynski, Marko Myllynen
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

On 10.10.2018 00:17, Rafal Luzynski wrote:
> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>> "<U0443><U0301>" (<U00FA>).
>> It works now with
>> % CYRILLIC UNDEFINED
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>> % CYRILLIC UNDEFINED
>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>
>> [...]
> 
> I wonder why you need Cyrillic U with acute, and why you comment it
> as "undefined" at all.  I know that any Cyrillic vowel may appear with
> an acute accent but "the diacritic is used only in dictionaries, children's
> books, resources for foreign-language learners (...)". [1]  So maybe
> all vowels with an acute accent should be handled (which I think is fine)
> rather than just U.

I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
implemented it on Marko's suggestion. Personally I have no opinion on
what letters should be included and under what name. These funny Us just
happened to be in the ISO9 table.

There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
in Unicode. That’s why it’s coming through that way from my worksheet as
it does a reverse lookup on the names based on the Unicode codepoints.

Manually we can change it to whatever you’d suggest in the
translit_cyrillic. I just don’t know the right name.

On my side I think I have all outstanding tasks complete for the patch
https://sourceware.org/bugzilla/attachment.cgi?id=11144. So please let
me know explicitly if you'd like anything changed there.

I was planning to rewrite just the commit message according to your
earlier feedback and resubmit sometime soon.

Bests,
Egor

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 22:40                                   ` Egor Kobylkin
@ 2018-10-09 22:42                                     ` Egor Kobylkin
  2018-10-10 11:22                                       ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 22:42 UTC (permalink / raw)
  To: Rafal Luzynski, Marko Myllynen
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

Ups, sorry, wrong link to the patch
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303

On 10.10.2018 00:40, Egor Kobylkin wrote:
> On 10.10.2018 00:17, Rafal Luzynski wrote:
>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>> "<U0443><U0301>" (<U00FA>).
>>> It works now with
>>> % CYRILLIC UNDEFINED
>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>> % CYRILLIC UNDEFINED
>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>
>>> [...]
>>
>> I wonder why you need Cyrillic U with acute, and why you comment it
>> as "undefined" at all.  I know that any Cyrillic vowel may appear with
>> an acute accent but "the diacritic is used only in dictionaries, children's
>> books, resources for foreign-language learners (...)". [1]  So maybe
>> all vowels with an acute accent should be handled (which I think is fine)
>> rather than just U.
> 
> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
> implemented it on Marko's suggestion. Personally I have no opinion on
> what letters should be included and under what name. These funny Us just
> happened to be in the ISO9 table.
> 
> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
> in Unicode. That’s why it’s coming through that way from my worksheet as
> it does a reverse lookup on the names based on the Unicode codepoints.
> 
> Manually we can change it to whatever you’d suggest in the
> translit_cyrillic. I just don’t know the right name.
> 
> On my side I think I have all outstanding tasks complete for the patch
> https://sourceware.org/bugzilla/attachment.cgi?id=11144. So please let
> me know explicitly if you'd like anything changed there.

correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303

> 
> I was planning to rewrite just the commit message according to your
> earlier feedback and resubmit sometime soon.
> 


Bests,
Egor


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 22:08                   ` Rafal Luzynski
@ 2018-10-10 11:21                     ` Marko Myllynen
  0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-10 11:21 UTC (permalink / raw)
  To: Rafal Luzynski, Egor Kobylkin, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi,

On 2018-10-10 01:08, Rafal Luzynski wrote:
> 9.10.2018 18:10 Marko Myllynen <myllynen@redhat.com> wrote:
>> On 2018-10-09 01:04, Rafal Luzynski wrote:
>>> If you refer to other languages than Russian which also use the Cyrillic
>>> alphabet but need different transliteration rules than Russian for
>>> the same characters then it is OK for me now. I am afraid that the iconv
>>> algorithm does not handle such case. Of course, we should add this missing
>>> feature eventually but I do not volunteer to do it now.
>>
>> Yes, this would be needed for correct transliteration of different
>> languages, and this might be quite a bit of work. There's also the case
>> of transliteration and character sets, consider the transliteration
>> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>>
>> Russian: Борис Николаевич Ельцин
>> Int'l: Boris Nikolaevič Elʹcin
>> Finnish: Boris Nikolajevitš Jeltsin
>> French: Boris Nikolaïevitch Ieltsine
>> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
> 
> No, I did not mean the transcription using the rules of the destination
> locale using Latin but that the rules of transliteration may be different
> depending on the language of the source text.

Yes, I mentioned this case in my earlier email:

https://sourceware.org/ml/libc-alpha/2018-10/msg00083.html

> this Cyrillic string: "нъг" (I'm not saying that it is actually used
> in any existing word but still must be handled).  By our transliteration
> rules it will be transliterated as "n``g".  But this is fine for Russian;
> if we knew that the source string is Ukrainian it would be transliterated
> as "n``h"; if it was Bulgarian it would be transliterated as "năg".

And according to SFS 4900, in fi_FI for this string we would see for
Russian ng, for Ukrainian nh, and for Bulgarian năg.

> Unfortunately, I think that distinction of the source language is impossible
> at the moment so let's assume that we fall back to Russian if there is
> any ambiguity.

Yeah, it's not optimal but probably the most decent compromise for now.

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 22:42                                     ` Egor Kobylkin
@ 2018-10-10 11:22                                       ` Marko Myllynen
  2018-10-10 12:19                                         ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-10 11:22 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

Hi,

On 2018-10-10 01:42, Egor Kobylkin wrote:
> Ups, sorry, wrong link to the patch
> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303

Although I haven't checked every rule this in general looks very good
(but see below). Not sure do we want to add the few missing characters
mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
least initially the more exotic characters, like the historic ones,
though.) Perhaps filing a bug or two for these cases for separate
consideration would be ok.

> On 10.10.2018 00:40, Egor Kobylkin wrote:
>> On 10.10.2018 00:17, Rafal Luzynski wrote:
>>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>>> "<U0443><U0301>" (<U00FA>).
>>>> It works now with
>>>> % CYRILLIC UNDEFINED
>>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>> % CYRILLIC UNDEFINED
>>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>>
>>>> [...]
>>>
>>> I wonder why you need Cyrillic U with acute, and why you comment it
>>> as "undefined" at all.  I know that any Cyrillic vowel may appear with
>>> an acute accent but "the diacritic is used only in dictionaries, children's
>>> books, resources for foreign-language learners (...)". [1]  So maybe
>>> all vowels with an acute accent should be handled (which I think is fine)
>>> rather than just U.
>>
>> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
>> implemented it on Marko's suggestion. Personally I have no opinion on
>> what letters should be included and under what name. These funny Us just
>> happened to be in the ISO9 table.
>>
>> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
>> in Unicode. That’s why it’s coming through that way from my worksheet as
>> it does a reverse lookup on the names based on the Unicode codepoints.
>>
>> Manually we can change it to whatever you’d suggest in the
>> translit_cyrillic. I just don’t know the right name.

I'm not sure this will work, no existing rule in translit_* files
contains two characters, I'd assume that the rule for U+0423 is applied
first and then the below rule is never used.

% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"

Perhaps this should be commented out or removed altogether if it's not
working as intended.
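
A minimal Python sketch of the concern above, assuming a matcher that looks up one code point at a time. This is not a description of how glibc's iconv actually matches rules, only an illustration of how a two-code-point rule could be shadowed by the single-character rule. RULES, per_codepoint and longest_match are hypothetical names, and unmatched code points become "?" in this sketch.

```python
# Two rules: one for U+0423 alone, one for U+0423 followed by U+0301.
RULES = {
    "\u0423": "U",
    "\u0423\u0301": "U`",
}


def per_codepoint(text):
    """Look up each code point on its own; multi-code-point rules
    are never reached."""
    return "".join(RULES.get(ch, "?") for ch in text)


def longest_match(text):
    """Try the longest rule first, then shorter ones."""
    max_len = max(len(key) for key in RULES)
    out = []
    i = 0
    while i < len(text):
        for n in range(max_len, 0, -1):
            chunk = text[i:i + n]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule matched at this position.
            out.append("?")
            i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = "\u0423\u0301"
    print(per_codepoint(sample))
    print(longest_match(sample))
```

With the per-code-point lookup the combined rule is dead weight, which is the argument for commenting it out.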

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-10 11:22                                       ` Marko Myllynen
@ 2018-10-10 12:19                                         ` Egor Kobylkin
  2018-10-10 12:34                                           ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-10 12:19 UTC (permalink / raw)
  To: Marko Myllynen, Rafal Luzynski
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

On 10.10.2018 13:22, Marko Myllynen wrote:
>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
> 
> Although I haven't checked every rule this in general looks very good
> (but see below). 


> Not sure do we want to add the few missing characters
> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
> least initially the more exotic characters, like the historic ones,
> though.) Perhaps filing a bug or two for these cases for separate
> consideration would be ok.

The question here is what should serve as their transliteration and
transcription?
They are covered neither by ISO9 nor by GOST 7.79. So maybe it would be
reasonable to assume there is no notable occurrence of those anywhere?

Anyway I am happy to include your specific suggestions for all and any
Unicode quartets in this form:
[Cyrillic Unicode
; ISO9 Latin Transliteration (System A) as Unicode
; Transcription (System B) as (multicharacter)ASCII
; name to put in %COMMENT
].
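
For illustration only, here is a small Python sketch of how such a quartet could be turned into a translit_cyrillic rule line. It is not the spreadsheet logic actually used for the patch. The helper names ucode, target and rule are hypothetical. The output format is copied from the rules quoted in this thread: single-character targets are bare, multi-character targets are quoted, alternatives are separated by ";" with no space, and duplicate alternatives are dropped.

```python
def ucode(text):
    """Render a string as <UXXXX> symbols, one per code point."""
    return "".join(f"<U{ord(ch):04X}>" for ch in text)


def target(text):
    """Quote a replacement unless it is exactly one code point."""
    code = ucode(text)
    return code if len(text) == 1 else f'"{code}"'


def rule(cyrillic, iso9, ascii_fallback, name):
    """Build a comment line plus a rule line from one quartet."""
    alternatives = []
    for alt in (iso9, ascii_fallback):
        rendered = target(alt)
        if rendered not in alternatives:  # no duplicates
            alternatives.append(rendered)
    return f"% {name}\n{ucode(cyrillic)} {';'.join(alternatives)}"


if __name__ == "__main__":
    print(rule("ж", "ž", "zh", "CYRILLIC SMALL LETTER ZHE"))
    print(rule("е", "e", "e", "CYRILLIC SMALL LETTER IE"))
```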


> 
>> On 10.10.2018 00:40, Egor Kobylkin wrote:
>>> On 10.10.2018 00:17, Rafal Luzynski wrote:
>>>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>
>>>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>>>> "<U0443><U0301>" (<U00FA>).
>>>>> It works now with
>>>>> % CYRILLIC UNDEFINED
>>>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>>> % CYRILLIC UNDEFINED
>>>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>>>
>>>>> [...]
>>>>
>>>> I wonder why you need Cyrillic U with acute, and why you comment it
>>>> as "undefined" at all.  I know that any Cyrillic vowel may appear with
>>>> an acute accent but "the diacritic is used only in dictionaries, children's
>>>> books, resources for foreign-language learners (...)". [1]  So maybe
>>>> all vowels with an acute accent should be handled (which I think is fine)
>>>> rather than just U.
>>>
>>> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
>>> implemented it on Marko's suggestion. Personally I have no opinion on
>>> what letters should be included and under what name. These funny Us just
>>> happened to be in the ISO9 table.
>>>
>>> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
>>> in Unicode. That’s why it’s coming through that way from my worksheet as
>>> it does a reverse lookup on the names based on the Unicode codepoints.
>>>
>>> Manually we can change it to whatever you’d suggest in the
>>> translit_cyrillic. I just don’t know the right name.
> 
> I'm not sure this will work, no existing rule in translit_* files
> contain two characters, I'd assume that the rule for U+0423 is applied
> first and then the below rule is never used.
> 
> % CYRILLIC UNDEFINED
> <U0423><U0301> <U00DA>;"<U0055><U0060>"
> 
> Perhaps this should be commented out or removed altogether if it's not
> working as intended.

here is a result of my test on
https://sourceware.org/bugzilla/attachment.cgi?id=11304

U0423 0301-У́  -> U0423 0301-U
U0443 0301-у́ -> U0443 0301-u

So yes, they are not processed. I would drop them so as not to have special
cases. But I am also fine with keeping them because all work is done
already.

Result:
CYRILLIC RUSSIAN S``esh` eshhyo e`tih myagkih francuzskih bulok, da
vypej zhe chayu. SA`ESH` ESHHYO E`TIH MYAGKIH FRANCUZSKIH BULOK? DA
VYPEJ ZHE CHAYU!
CYRILLIC COMPLETE U0401-YO U0402-DJ U0403-G` U0404-Ye U0405-Z` U0406-I
U0407-Yi U0408-J U0409-L` U040A-N` U040B-TSH U040C-K` U040E-U` U040F-Dh
U0410-A U0411-B U0412-V U0413-G U0414-D U0415-E U0416-ZH U0417-Z U0418-I
U0419-J U041A-K U041B-L U041C-M U041D-N U041E-O U041F-P U0420-R U0421-S
U0422-T U0423-U U0423 0301-U U0424-F U0425-H U0426-C U0427-CH U0428-SH
U0429-SHH U042A-`` U042B-Y U042C-` U042D-E` U042E-YU U042F-YA U0430-a
U0431-b U0432-v U0433-g U0434-d U0435-e U0436-zh U0437-z U0438-i U0439-j
U043A-k U043B-l U043C-m U043D-n U043E-o U043F-p U0440-r U0441-s U0442-t
U0443-u U0443 0301-u U0444-f U0445-h U0446-c U0447-ch U0448-sh U0449-shh
U044A-A` U044B-y U044C-` U044D-e` U044E-yu U044F-ya U0451-yo U0452-dj
U0453-g` U0454-ye U0455-z` U0456-i U0457-yi U0458-j U0459-l` U045A-n`
U045B-tsh U045C-k` U045E-u` U045F-dh U046A-O` U046B-o` U0472-Fh U0473-fh
U0474-Yh U0475-yh U048C-E` U048D-e`  U0490-G` U0491-g` U0492-GH U0493-gh
U0494-GH U0495-gh U0496-ZH` U0497-zh` U049A-K` U049B-k` U049E-K`
U049F-k` U04A2-N` U04A3-n` U04A4-NG U04A5-ng U04A6-P` U04A7-p` U04A8-O`
U04A9-o` U04AA-C` U04AB-C` U04AC-T` U04AD-t` U04AE-U U04AF-u U04B2-H`
U04B3-h` U04B4-TCZ U04B5-tcz U04BA-SH` U04BB-SH` U04BC-CH` U04BD-ch`
U04BE-CH` U04BF-ch` U04C0-i U04C1-ZH` U04C2-zh` U04CB-CH` U04CC-ch`
U04D0-A` U04D1-a` U04D2-A` U04D3-a` U04D6-E` U04D7-e` U04D8-A` U04D9-a`
U04DC-ZH` U04DD-zh` U04DE-Z` U04DF-z` U04E0-Z` U04E1-z` U04E4-I`
U04E5-i` U04E6-O` U04E7-o` U04E8-O` U04E9-o` U04F0-U` U04F1-u` U04F2-U`
U04F3-u` U04F4-CH` U04F5-ch` U04F8-Y` U04F9-y` U2019-'

Source:
CYRILLIC RUSSIAN Съешь ещё этих мягких французских булок, да выпей же
чаю. СЪЕШЬ ЕЩЁ ЭТИХ МЯГКИХ ФРАНЦУЗСКИХ БУЛОК? ДА ВЫПЕЙ ЖЕ ЧАЮ!
CYRILLIC COMPLETE U0401-Ё U0402-Ђ U0403-Ѓ U0404-Є U0405-Ѕ U0406-І
U0407-Ї U0408-Ј U0409-Љ U040A-Њ U040B-Ћ U040C-Ќ U040E-Ў U040F-Џ U0410-А
U0411-Б U0412-В U0413-Г U0414-Д U0415-Е U0416-Ж U0417-З U0418-И U0419-Й
U041A-К U041B-Л U041C-М U041D-Н U041E-О U041F-П U0420-Р U0421-С U0422-Т
U0423-У U0423 0301-У́ U0424-Ф U0425-Х U0426-Ц U0427-Ч U0428-Ш U0429-Щ
U042A-ъ U042B-Ы U042C-ь U042D-Э U042E-Ю U042F-Я U0430-а U0431-б U0432-в
U0433-г U0434-д U0435-е U0436-ж U0437-з U0438-и U0439-й U043A-к U043B-л
U043C-м U043D-н U043E-о U043F-п U0440-р U0441-с U0442-т U0443-у U0443
0301-у́ U0444-ф U0445-х U0446-ц U0447-ч U0448-ш U0449-щ U044A-Ъ U044B-ы
U044C-Ь U044D-э U044E-ю U044F-я U0451-ё U0452-ђ U0453-ѓ U0454-є U0455-ѕ
U0456-і U0457-ї U0458-ј U0459-љ U045A-њ U045B-ћ U045C-ќ U045E-ў U045F-џ
U046A-Ѫ U046B-ѫ U0472-Ѳ U0473-ѳ U0474-Ѵ U0475-ѵ U048C-Ҍ U048D-ҍ  U0490-Ґ
U0491-ґ U0492-Ғ U0493-ғ U0494-Ҕ U0495-ҕ U0496-Җ U0497-җ U049A-Қ U049B-қ
U049E-Ҟ U049F-ҟ U04A2-Ң U04A3-ң U04A4-Ҥ U04A5-ҥ U04A6-Ҧ U04A7-ҧ U04A8-Ҩ
U04A9-ҩ U04AA-Ҫ U04AB-ҫ U04AC-Ҭ U04AD-ҭ U04AE-Ү U04AF-ү U04B2-Ҳ U04B3-ҳ
U04B4-Ҵ U04B5-ҵ U04BA-Һ U04BB-һ U04BC-Ҽ U04BD-ҽ U04BE-Ҿ U04BF-ҿ U04C0-Ӏ
U04C1-Ӂ U04C2-ӂ U04CB-Ӌ U04CC-ӌ U04D0-Ӑ U04D1-ӑ U04D2-Ӓ U04D3-ӓ U04D6-Ӗ
U04D7-ӗ U04D8-Ә U04D9-ә U04DC-Ӝ U04DD-ӝ U04DE-Ӟ U04DF-ӟ U04E0-Ӡ U04E1-ӡ
U04E4-Ӥ U04E5-ӥ U04E6-Ӧ U04E7-ӧ U04E8-Ө U04E9-ө U04F0-Ӱ U04F1-ӱ U04F2-Ӳ
U04F3-ӳ U04F4-Ӵ U04F5-ӵ U04F8-Ӹ U04F9-ӹ U2019-’



* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-10 12:19                                         ` Egor Kobylkin
@ 2018-10-10 12:34                                           ` Marko Myllynen
  0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-10 12:34 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski
  Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
	Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo

Hi,

On 2018-10-10 15:19, Egor Kobylkin wrote:
> On 10.10.2018 13:22, Marko Myllynen wrote:
>>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>>
>> Although I haven't checked every rule, this in general looks very good
>> (but see below).
> 
>> Not sure whether we want to add the few missing characters
>> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
>> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
>> least initially the more exotic characters, like the historic ones,
>> though.) Perhaps filing a bug or two for these cases for separate
>> consideration would be ok.
> 
> The question here is what should serve as their transliteration and
> transcription?

Not sure, so filing a separate bug about this once your patch is merged
might be the most suitable action for now. I don't think we want to
postpone merging your work further due to these non-ISO 9 cases.

>> I'm not sure this will work: no existing rule in the translit_* files
>> contains two characters, so I'd assume that the rule for U+0423 is applied
>> first and the rule below is then never used.
>>
>> % CYRILLIC UNDEFINED
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>
>> Perhaps this should be commented out or removed altogether if it's not
>> working as intended.
> 
> So yes, they are not processed. I would drop them to avoid special
> cases, but I am also fine with keeping them because all the work is
> already done.

I'd probably drop them, but I don't feel strongly about this either way.

Thanks for your efforts. I don't have any further comments, so I'll leave
this now for Rafal and Mike to provide additional feedback and, hopefully,
merge soon.

Thanks,

-- 
Marko Myllynen


* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
       [not found] ` <20180412224352.GB2911@altlinux.org>
  2018-07-17 19:34   ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
  2018-08-06 19:00   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29 Egor Kobylkin
@ 2018-10-10 22:29   ` Egor Kobylkin
  2018-10-11  9:59     ` Marko Myllynen
  2018-10-11 11:04     ` Rafal Luzynski
  2018-10-11 15:44   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3 Egor Kobylkin
                     ` (10 subsequent siblings)
  13 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-10 22:29 UTC (permalink / raw)
  To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 66179 bytes --]

Dear locale maintainers,

Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails":

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]

to localedata/locales/ and including it in all your locales going forward.

The patch is included inline below.

From this patch I have excluded locales that already mention Cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
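
A quick way to check a locale for this failure is to look for the '?' fallback in iconv's output. The sketch below is only an illustration: translit_ok is a made-up helper, it assumes the input text itself contains no question marks, and its result depends on the installed locale data.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if iconv converts the text to ASCII with
# //TRANSLIT without reporting an error or falling back to '?'.
# Assumption: the input itself contains no literal '?'.
translit_ok() {
    out=$(printf '%s\n' "$1" | iconv -f UTF-8 -t ASCII//TRANSLIT 2>/dev/null) || return 1
    case $out in
        *\?*) return 1 ;;
    esac
    return 0
}

# Example: report whether the current locale can transliterate a Cyrillic word.
if translit_ok 'булок'; then
    echo "transliterated"
else
    echo "not transliterated"
fi
```

Setting LC_ALL (for example to ru_RU.UTF-8) before calling the helper selects the locale whose transliteration rules are being checked.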


Root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. Furthermore, it has to be included in the active locale
at compilation time to be used by iconv.
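
The per-locale change is mechanical: one include line placed directly before translit_end. A minimal sketch of how it could be scripted follows. add_cyrillic_include is a hypothetical helper, not the script used to generate this patch, and it skips only files that lack a translit section or already reference translit_cyrillic.

```shell
#!/bin/sh
# Hypothetical helper: add the translit_cyrillic include to one locale source
# file, directly before its translit_end line.
add_cyrillic_include() {
    file=$1
    [ -f "$file" ] || return 0
    # Nothing to do without a translit section.
    grep -q '^translit_start' "$file" || return 0
    # Nothing to do if the table is already referenced.
    if grep -q 'translit_cyrillic' "$file"; then
        return 0
    fi
    # Print the include line before translit_end, then every original line.
    awk '/^translit_end/ { print "include \"translit_cyrillic\";\"\"" }
         { print }' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example (run from the top of a glibc source tree):
#   for f in localedata/locales/*; do add_cyrillic_include "$f"; done
```

Locales that should be left to their maintainers, such as the ones listed above, would still have to be excluded by hand.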



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in a Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9:1995.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters yet
can still be read by a native speaker. Among other things, it is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, do not have the
corresponding fonts installed, or cannot handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its content (mapping) is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000, whose official source is the Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2].
Technically, an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line include "translit_cyrillic";"" to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.

Cyrillic transliteration of e.g. Russian text may already have worked to
some extent for the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales, which
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to the locale maintainers' email addresses listed in the files.
Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA)
and Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (only
available as a low-quality GIF)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11302
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9:1995, GOST 7.79
	System A transliteration and System B transcription table from
	Cyrillic to Latin/ASCII.
	* localedata/locales/C: Add include "translit_cyrillic";"" to the
	LC_CTYPE translit section.
	* localedata/locales/aa_DJ: Likewise.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/sd_PK: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/C	2018-10-09 19:02:45.000000000 +0000
@@ -2293,6 +2293,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-09 19:02:45.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-09 19:02:45.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-09 19:02:45.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-09 19:02:45.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-09 19:02:45.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-09 19:02:45.000000000 +0000
@@ -165,6 +165,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-09 19:02:45.000000000 +0000
@@ -85,6 +85,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-09 19:02:45.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-09 19:02:45.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-09 19:02:46.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-09 19:02:46.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-09 19:02:46.000000000 +0000
@@ -71,6 +71,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-09 19:02:46.000000000 +0000
@@ -38,6 +38,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-09 19:02:46.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-09 19:02:46.000000000 +0000
@@ -108,6 +108,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-09 19:02:46.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-09 19:02:46.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-09 19:02:46.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-09 19:02:46.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""


+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-09 19:02:46.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-09 19:02:46.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-09 19:02:46.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-09 19:02:46.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-09 19:02:47.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-09 19:02:47.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-09 19:02:47.000000000 +0000
@@ -112,6 +112,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-09 19:02:47.000000000 +0000
@@ -78,6 +78,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-09 19:02:47.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-09 19:02:47.000000000 +0000
@@ -136,6 +136,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-09 19:02:47.000000000 +0000
@@ -53,6 +53,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-09 19:02:47.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-09 19:02:47.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-09 19:02:47.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-09 19:02:47.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-09 19:02:48.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-09 19:02:48.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-09 19:02:48.000000000 +0000
@@ -75,6 +75,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-09 19:02:48.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-09 19:02:48.000000000 +0000
@@ -149,6 +149,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-09 19:02:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-09 19:02:48.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-09 19:02:48.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-09 19:02:48.000000000 +0000
@@ -157,6 +157,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-09 19:02:48.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-09 19:02:49.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-09 19:02:49.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-09 19:02:49.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-09 19:02:49.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-09 19:02:49.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-09 19:02:49.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-09 19:02:49.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-09 19:02:49.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-09 19:02:49.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-09 19:02:49.000000000 +0000
@@ -163,6 +163,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-09 19:02:50.000000000 +0000
@@ -110,6 +110,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-09 19:02:50.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-09 19:02:50.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-09 19:02:50.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-09 19:02:50.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-09 19:02:50.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-09 19:02:50.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-09 19:02:50.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-09 19:02:50.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-09 19:02:50.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-09 19:02:50.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-09 19:02:50.000000000 +0000
@@ -138,6 +138,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-09 19:02:51.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-09 19:02:51.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-09 19:02:51.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-09 19:02:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-09 19:02:51.000000000 +0000
@@ -116,6 +116,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-09 19:02:51.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-09 19:02:51.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-09 19:02:51.000000000 +0000
@@ -73,6 +73,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-09 19:02:51.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-09 19:02:51.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-09 19:02:51.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-09 19:02:51.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-09 19:02:52.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-09 19:02:52.000000000 +0000
@@ -90,6 +90,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-09 19:02:52.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-09 19:02:52.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-09 19:02:52.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-09 19:02:52.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-09 19:02:52.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-09 19:02:52.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-09 19:02:52.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-09 19:02:52.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-09 19:02:52.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>

 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-09 19:02:53.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-09 19:02:53.000000000 +0000
@@ -36,6 +36,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-09 19:02:53.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-09 19:02:53.000000000 +0000
@@ -2423,6 +2423,7 @@

 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-09 19:02:54.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements the GOST 7.79 System A (Latin script) as a first
+% option and System B (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U + COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-09 19:02:53.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-09 19:02:53.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-09 19:02:53.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-09 19:02:53.000000000 +0000
@@ -65,6 +65,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-09 19:02:53.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-09 19:02:53.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-09 19:02:54.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-09 19:02:54.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-09 19:02:54.000000000 +0000
@@ -40,6 +40,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-09 19:02:54.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-09 19:02:54.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE






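Each translit_cyrillic entry in the patch above lists alternatives separated by semicolons: first a Latin form with diacritics, then a plain ASCII form. The sketch below is only an illustration, not glibc code. It decodes a few entries copied from the table and picks the first alternative that is pure ASCII, which approximates what one would expect from a conversion to ASCII//TRANSLIT. The helper names (decode, parse_line, ascii_fallback) are hypothetical, and the parser assumes every character is written as a <Uxxxx> symbol, as in this table.

```python
import re

# A few entries copied from the translit_cyrillic table in this patch.
SAMPLE = '''
% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
<U049A> <U0136>;"<U004B><U0060>"
% CYRILLIC CAPITAL LETTER STRAIGHT U
<U04AE> <U00D9>;<U0055>
'''

# Matches one symbolic code point such as <U0496>.
UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn a run of <Uxxxx> symbols (quoted or not) into a Python string."""
    return "".join(chr(int(h, 16)) for h in UCS.findall(token))


def parse_line(line):
    """Return (source_char, [alternatives]), or None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("%"):
        return None
    source, rest = line.split(None, 1)
    return decode(source), [decode(alt) for alt in rest.split(";")]


def ascii_fallback(alternatives):
    """Pick the first alternative made only of ASCII characters (assumption:
    this mirrors trying the alternatives in order for an ASCII target)."""
    for alt in alternatives:
        if alt.isascii():
            return alt
    return None


# Build a lookup table: Cyrillic character -> list of alternatives.
table = dict(filter(None, map(parse_line, SAMPLE.splitlines())))

for src, alts in table.items():
    print(f"U+{ord(src):04X} -> {ascii_fallback(alts)}")
```

A check like this can help spot entries whose last alternative is not ASCII or whose letter case does not match the source character, before building the locales and running iconv.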
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56412 bytes --]

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/C	2018-10-09 19:02:45.000000000 +0000
@@ -2293,6 +2293,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-09 19:02:45.000000000 +0000
@@ -70,6 +70,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-09 19:02:45.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-09 19:02:45.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-09 19:02:45.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-09 19:02:45.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-09 19:02:45.000000000 +0000
@@ -165,6 +165,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-09 19:02:45.000000000 +0000
@@ -85,6 +85,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-09 19:02:45.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-09 19:02:45.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-09 19:02:46.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-09 19:02:46.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-09 19:02:46.000000000 +0000
@@ -71,6 +71,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-09 19:02:46.000000000 +0000
@@ -38,6 +38,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-09 19:02:46.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-09 19:02:46.000000000 +0000
@@ -108,6 +108,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-09 19:02:46.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-09 19:02:46.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-09 19:02:46.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-09 19:02:46.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-09 19:02:46.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-09 19:02:46.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-09 19:02:46.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-09 19:02:46.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-09 19:02:47.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-09 19:02:47.000000000 +0000
@@ -72,6 +72,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-09 19:02:47.000000000 +0000
@@ -112,6 +112,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-09 19:02:47.000000000 +0000
@@ -78,6 +78,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-09 19:02:47.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-09 19:02:47.000000000 +0000
@@ -136,6 +136,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-09 19:02:47.000000000 +0000
@@ -53,6 +53,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-09 19:02:47.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-09 19:02:47.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-09 19:02:47.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-09 19:02:47.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-09 19:02:48.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-09 19:02:48.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-09 19:02:48.000000000 +0000
@@ -75,6 +75,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-09 19:02:48.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-09 19:02:48.000000000 +0000
@@ -149,6 +149,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-09 19:02:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-09 19:02:48.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-09 19:02:48.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-09 19:02:48.000000000 +0000
@@ -157,6 +157,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-09 19:02:48.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-09 19:02:49.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-09 19:02:49.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-09 19:02:49.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-09 19:02:49.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-09 19:02:49.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-09 19:02:49.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-09 19:02:49.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-09 19:02:49.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-09 19:02:49.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-09 19:02:49.000000000 +0000
@@ -163,6 +163,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-09 19:02:50.000000000 +0000
@@ -110,6 +110,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-09 19:02:50.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-09 19:02:50.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-09 19:02:50.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-09 19:02:50.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-09 19:02:50.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-09 19:02:50.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-09 19:02:50.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-09 19:02:50.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-09 19:02:50.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-09 19:02:50.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-09 19:02:50.000000000 +0000
@@ -138,6 +138,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-09 19:02:51.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-09 19:02:51.000000000 +0000
@@ -69,6 +69,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-09 19:02:51.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-09 19:02:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-09 19:02:51.000000000 +0000
@@ -116,6 +116,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-09 19:02:51.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-09 19:02:51.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-09 19:02:51.000000000 +0000
@@ -73,6 +73,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-09 19:02:51.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-09 19:02:51.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-09 19:02:51.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-09 19:02:51.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-09 19:02:52.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-09 19:02:52.000000000 +0000
@@ -90,6 +90,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-09 19:02:52.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-09 19:02:52.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-09 19:02:52.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-09 19:02:52.000000000 +0000
@@ -66,6 +66,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-09 19:02:52.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-09 19:02:52.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-09 19:02:52.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-09 19:02:52.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-09 19:02:52.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-09 19:02:53.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-09 19:02:53.000000000 +0000
@@ -36,6 +36,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-09 19:02:53.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-09 19:02:53.000000000 +0000
@@ -2423,6 +2423,7 @@
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-09 19:02:54.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliteration of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option.  See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond the Russian part of GOST 7.79 using open
+% sources of transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with 
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-09 19:02:53.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-09 19:02:53.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-09 19:02:53.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-09 19:02:53.000000000 +0000
@@ -65,6 +65,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-09 19:02:53.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-09 19:02:53.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-09 19:02:54.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-09 19:02:54.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-09 19:02:54.000000000 +0000
@@ -40,6 +40,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-09 19:02:54.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-09 19:02:54.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
  2018-10-10 22:29   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
@ 2018-10-11  9:59     ` Marko Myllynen
  2018-10-11 11:04     ` Rafal Luzynski
  1 sibling, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-11  9:59 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Rafal Luzynski
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

Hi,

It looks like there is one rule after all that might be debatable; I'll
just highlight it and let others comment and decide what to do with it.

On 2018-10-11 01:29, Egor Kobylkin wrote:
> 
> +% RIGHT SINGLE QUOTATION MARK
> +<U2019> <U2035>;<U0027>
translit_neutral (which is included by i18n) has:

% RIGHT SINGLE QUOTATION MARK
<U2019> <U0027> % not <U00B4> because it's often used as an apostrophe

In practice the end result might well be the same (since if U+2019 is
not available then U+2035 probably is not either, and both rules produce
U+0027). However, given that translit_cyrillic would be included in
every locale, I'm not sure whether this kind of minor discrepancy is ok.
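
As a rough illustration of the point above, the following Python sketch
models the ";"-separated alternatives as a list that is tried in order,
picking the first one the target charset can represent. This is a
simplified model assumed for illustration, not glibc's actual code, and
first_representable is a hypothetical helper name.

```python
def first_representable(candidates, encoding):
    """Return the first candidate that `encoding` can represent, else None."""
    for cand in candidates:
        try:
            cand.encode(encoding)
        except UnicodeEncodeError:
            continue
        return cand
    return None

# Alternatives for U+2019 as quoted in this message.
translit_cyrillic_rule = ["\u2035", "\u0027"]  # <U2019> <U2035>;<U0027>
translit_neutral_rule = ["\u0027"]             # <U2019> <U0027>

ascii_cyrillic = first_representable(translit_cyrillic_rule, "ascii")
ascii_neutral = first_representable(translit_neutral_rule, "ascii")
latin1_cyrillic = first_representable(translit_cyrillic_rule, "latin-1")

print(repr(ascii_cyrillic), repr(ascii_neutral), repr(latin1_cyrillic))
```

Under this model both rules end at U+0027 for ASCII and ISO-8859-1
targets; they would only differ for a charset that has U+2035 but not
U+2019.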

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  2018-10-09 16:10                 ` Marko Myllynen
  2018-10-09 16:22                   ` Egor Kobylkin
  2018-10-09 22:08                   ` Rafal Luzynski
@ 2018-10-11 10:10                   ` Marko Myllynen
  2 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-11 10:10 UTC (permalink / raw)
  To: Rafal Luzynski, Egor Kobylkin, Keld Simonsen
  Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
	Carlos O'Donell, Max Kutny, danilo

Hi,

On 2018-10-09 19:10, Marko Myllynen wrote:
> 
> One thing that might be helpful here could be something like:
> 
> $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
> ž
> 
> That is, force transliteration of each character (if defined) even if
> it's part of the target character set. AFAICS this is not currently
> possible.
FWIW, this is currently not possible with iconv(1) but uconv(1) supports
this with -x (AFAICS it's using ICU not glibc locale data):

https://en.wikipedia.org/wiki/uconv
https://linux.die.net/man/1/uconv
https://github.com/unicode-org/icu/tree/master/icu4c/source/extra/uconv
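
To make the difference concrete, here is a minimal Python sketch of
ordinary versus "forced" transliteration. TRANSLIT_FORCE is only the
idea floated above and does not exist in iconv(1); the one-entry RULES
table and the "zh" fallback are assumptions made for illustration.

```python
# Hypothetical one-entry rule table: first alternative Latin, second ASCII.
RULES = {"\u0436": ["\u017e", "zh"]}  # CYRILLIC SMALL LETTER ZHE

def can_encode(text, encoding):
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

def translit(text, encoding, force=False):
    """Transliterate per character; with force=True apply rules even to
    characters that the target encoding could represent as-is."""
    out = []
    for ch in text:
        if can_encode(ch, encoding) and not (force and ch in RULES):
            out.append(ch)
            continue
        for cand in RULES.get(ch, []):
            if can_encode(cand, encoding):
                out.append(cand)
                break
        else:
            out.append("?")  # no usable alternative
    return "".join(out)

print(translit("\u0436", "utf-8"))              # left unchanged
print(translit("\u0436", "utf-8", force=True))  # rule applied anyway
print(translit("\u0436", "ascii"))              # falls back to ASCII
```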

Cheers,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
  2018-10-10 22:29   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
  2018-10-11  9:59     ` Marko Myllynen
@ 2018-10-11 11:04     ` Rafal Luzynski
  2018-10-11 13:10       ` Marko Myllynen
                         ` (3 more replies)
  1 sibling, 4 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-11 11:04 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Marko Myllynen
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

Thank you, Egor.  I am looking at your patch and although I have
not yet finished, here are some remarks:

First of all, I think that such a large patch should also include
the tests.  Please see how automatic tests are performed in locale
data and write your own.

11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
> [...]

I think that eventually we would like to include your translit_cyrillic
in these locales as well, because I assume that your rules should work well
for them too and should cover more characters than the individual
language contributors took into account.  This is similar to Mike's work on
collation: common rules were created and all locales include them, adding
their own language-specific modifications.

> [...]
> COMMIT MESSAGE:
> [...]
> I am excluding these locales from this proposed patch. I have written
> directly to locale maintainer emails listed in the files. Volodymyr
> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the

I am not sure if we want Cyrillic text in the commit message.  Shouldn't
it be, uhm, transliterated? :-)

"sr_CS" - I guess you meant "sr_RS".

"sr_YU" has been dropped, do we want to mention it?

> [...]
> [BZ #2872]
> * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79

Please start "Add" with an uppercase.  BTW, shouldn't it be "New file"
instead?

> System A transliteration System B transcription table from Cyrillic to
> Latin/ASCII.
> * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
> translit section.

Same, "Add" here.

> * localedata/locales/aa_DJ: Likewise.

Good (here and everywhere below).

> [...]
> diff -uNr a/localedata/locales/translit_cyrillic
> b/localedata/locales/translit_cyrillic
> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
> +0000
> +++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
> +0000
> @@ -0,0 +1,383 @@
> +escape_char /
> +comment_char %
> +
> +% This file is part of the GNU C Library and contains locale data.
> +% The Free Software Foundation does not claim any copyright interest
> +% in the locale data contained in this file. The foregoing does not
> +% affect the license of the GNU C Library as a whole. It does not
> +% exempt you from the conditions of the license if your use would
> +% otherwise be governed by that license.
> +
> +% Transliterations of cyrillic letters to latin and/or ascii symbols.

"cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".

> +% Inspired by ISO 9.1995 / GOST 7.79-2000.
> +% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
> +% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995

Typos:

"i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
"U4001" - I guess you meant "U0401"
"U4F9" -> "U04F9".  I think that "U4F9" is not definitely bad but
let's be consistent.

Also I can see some gaps in the range.  Are you going to fill them
or maybe for now just mention that they exist?

> +% It implements the GOST_7.79 System A (Latin Script) as a first
> +% option and System B Cyrillic (ASCII) as a second option. Check
> +% https://en.wikipedia.org/wiki/ISO_9 for reference.
> +% The System B is extended from GOST_7.79-Russian using open sources
> +% of the transliteration mappings and the "h/`" diacritics logic.

What is "h/`" diacritics logic?

> +
> +% Usage examples:
> +% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
> +% | iconv -f ISO-8859-15 -t UTF-8 # System A
> +% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
> +
> +% Contributions welcome for the rest of Cyrillic script in Unicode

Sure, I'm not going to stop you from pushing these changes just because
there are missing characters.  I will consider adding them later.

> +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
> +% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
> +% Generated from UnicodeData.txt with
> +% https://sourceware.org/bugzilla/attachment.cgi?id=11301.

1. Is the file really generated with a script and not modified later?
If yes then maybe you should contribute the script instead?  In that case,
you should also not post this file to libc-locale, maintainers and
developers should be able to regenerate it.
2. The link leads to a LibreOffice spreadsheet.

> +LC_CTYPE
> +
> +translit_start
> +

<U0400> is missing here.  Are you going to leave it for now?

> +% CYRILLIC CAPITAL LETTER IO
> +<U0401> <U00CB>;"<U0059><U004F>"
> [...]
> +% CYRILLIC CAPITAL LETTER KJE
> +<U040C> <U1E30>;"<U004B><U0060>"

<U040D> is missing here.  Can we add it already?

> +% CYRILLIC CAPITAL LETTER SHORT U
> +<U040E> <U016C>;"<U0055><U0060>"
> [...]
> +% CYRILLIC CAPITAL LETTER U
> +<U0423> <U0055>
> +% CYRILLIC UNDEFINED
> +<U0423><U0301> <U00DA>;"<U0055><U0060>"

This still makes me wonder.

Does it work at all?
What if we remove this rule, won't it be transliterated as
<U0423> => "U", <U0301> - left unchanged, so "U" + <U0301>
will eventually produce "Ú"?
Why is it called "UNDEFINED"?
Do we need similar rules for other characters?
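
The composition asked about here can be checked at the Unicode level
with a short Python sketch. It only shows what NFC normalization does;
whether iconv performs any such composition is not established in this
thread and is not assumed here.

```python
import unicodedata

# Latin "U" followed by COMBINING ACUTE ACCENT composes under NFC.
latin_decomposed = "U\u0301"
latin_composed = unicodedata.normalize("NFC", latin_decomposed)

# Cyrillic U (U+0423) plus the same accent has no precomposed form,
# so NFC leaves it as two code points.
cyrillic_decomposed = "\u0423\u0301"
cyrillic_composed = unicodedata.normalize("NFC", cyrillic_decomposed)

print([f"U+{ord(c):04X}" for c in latin_composed])
print([f"U+{ord(c):04X}" for c in cyrillic_composed])
```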

> [...]
> +% CYRILLIC SMALL LETTER U
> +<U0443> <U0075>
> +% CYRILLIC UNDEFINED
> +<U0443><U0301> <U00FA>;"<U0075><U0060>"

Same here.

> [...]
> +% CYRILLIC SMALL LETTER YA
> +<U044F> <U00E2>;"<U0079><U0061>"

Again <U0450> missing (because it is lowercase variant of <U0400>).

> +% CYRILLIC SMALL LETTER IO
> +<U0451> <U00EB>;"<U0079><U006F>"
> [...]
> +% CYRILLIC SMALL LETTER KJE
> +<U045C> <U1E31>;"<U006B><U0060>"

<U045D> missing (same reason as <U040D>).
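
For reference, the four code points flagged as missing so far can be
named with Python's unicodedata module. The sketch below is only a
lookup aid; the list of code points is taken from the remarks above.

```python
import unicodedata

# Code points noted as missing in the review comments above.
missing = [0x0400, 0x040D, 0x0450, 0x045D]

names = {cp: unicodedata.name(chr(cp)) for cp in missing}
for cp, name in names.items():
    print(f"<U{cp:04X}> {name}")

# The small letters are the lowercase forms of the capital ones.
lower_of = {cp: ord(chr(cp).lower()) for cp in (0x0400, 0x040D)}
```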

> +% CYRILLIC SMALL LETTER SHORT U
> +<U045E> <U016D>;"<U0075><U0060>"
> +% CYRILLIC SMALL LETTER DZHE
> +<U045F> "<U0064><U0302>";"<U0064><U0068>"

More letters missing here.  Is this because they are historic so we
don't want to include them now?  Well, but "YUS" is also historic.
(Please, do not remove YUS for consistency).

> +% CYRILLIC CAPITAL LETTER BIG YUS
> +<U046A> <U01CD>;"<U004F><U0060>"
> +% CYRILLIC SMALL LETTER BIG YUS
> +<U046B> <U01CE>;"<U006F><U0060>"
> [...]

I will continue but, again, I don't give any ETA so other reviewers
are welcome here.

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
  2018-10-11 11:04     ` Rafal Luzynski
@ 2018-10-11 13:10       ` Marko Myllynen
  2018-10-11 13:50       ` Volodymyr Lisivka
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-11 13:10 UTC (permalink / raw)
  To: Rafal Luzynski, Egor Kobylkin, libc-alpha, libc-locales, mfabian
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

Hi,

On 2018-10-11 14:04, Rafal Luzynski wrote:
> 
> First of all, I think that such a large patch should also include
> the tests.  Please see how automatic tests are performed in locale
> data and write your own.
> 
> 11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> Also I can see some gaps in the range.  Are you going to fill them
> or maybe for now just mention that they exist?
>
> <U040D> is missing here.  Can we add it already?
>
> Sure, I'm not going to stop you from pushing these changes just because
> there are missing characters.  I will consider adding them later.
> 
> <U0400> is missing here.  Are you going to leave it for now?

See https://sourceware.org/ml/libc-alpha/2018-10/msg00160.html.

>> +% CYRILLIC CAPITAL LETTER U
>> +<U0423> <U0055>
>> +% CYRILLIC UNDEFINED
>> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
> 
> This still makes me wonder.
> 
> Does it work at all?

No, see the above link.

More importantly, I realized that ICU uconv(1) I mentioned earlier
should make a great reference for this data; output of the currently
included transliteration rules should match uconv(1) output. If that is
not the case, the patch or uconv(1) might have an issue. If the outputs
match, then we should be able to safely assume the patch is ok.

It could also be considered to use uconv(1) output as a reference for how
to handle the currently missing characters.

(uconv(1) is part of the icu package on Fedora/CentOS/RHEL/openSUSE.)
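
A comparison of this kind could be sketched in Python as below. The
helper name mismatches is hypothetical, and the two small tables hold
placeholder values only; they are not real iconv(1) or uconv(1) output
and would have to be filled in by actually running both tools.

```python
def mismatches(candidate, reference):
    """Return sorted (char, candidate_out, reference_out) triples that differ."""
    diffs = []
    for ch in sorted(set(candidate) | set(reference)):
        got = candidate.get(ch)
        want = reference.get(ch)
        if got != want:
            diffs.append((ch, got, want))
    return diffs

# Placeholder data, NOT real tool output.
patch_output = {"\u0434": "d", "\u0430": "a", "\u0436": "zh"}
reference_output = {"\u0434": "d", "\u0430": "a", "\u0436": "z"}

for ch, got, want in mismatches(patch_output, reference_output):
    print(f"U+{ord(ch):04X}: patch={got!r} reference={want!r}")
```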

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
  2018-10-11 11:04     ` Rafal Luzynski
  2018-10-11 13:10       ` Marko Myllynen
@ 2018-10-11 13:50       ` Volodymyr Lisivka
  2018-10-11 14:59       ` Egor Kobylkin
  2018-10-11 15:05       ` Egor Kobylkin
  3 siblings, 0 replies; 111+ messages in thread
From: Volodymyr Lisivka @ 2018-10-11 13:50 UTC (permalink / raw)
  To: digitalfreak
  Cc: Egor Kobylkin, libc-alpha, libc-locales, mfabian, myllynen, ldv,
	Max Kutny, danilo

Thu, 11 Oct 2018 at 14:05, Rafal Luzynski <digitalfreak@lingonborough.com> wrote:
>
> Thank you, Egor.  I am looking at your patch and although I have
> not yet finished, here are some remarks:
>
> First of all, I think that such a large patch should also include
> the tests.  Please see how automatic tests are performed in locale
> data and write your own.
>
> 11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
> > [...]
> > From this patch I have excluded locales that already mention cyrillic or
> > have a transliteration table for it:
> > az_AZ
> > iso14651_t1_common
> > ky_KG
> > mn_MN
> > sr_RS
> > tg_TJ
> > tk_TM
> > tt_RU
> > uk_UA
> > uz_UZ
> > uz_UZ@cyrillic
> > [...]
>
> I think that eventually we would like to include your translit_cyrillic
> also in these locales because I assume that your rules should work good
> for them as well, also should include more characters than the individual
> language contributors took into account.

It's a very good idea. Transliteration in the Ukrainian locale predates this
work by about a decade. It is well tested. I also have automatic test
cases, which I can adapt to the current standard. Let's drop the Russian
transliteration rules and replace them with the Ukrainian transliteration
rules. I assume that the Ukrainian rules should work well for them as
well.

The Ukrainian language is the oldest and most developed language in the Slavic
family - the last king of all Slavs, named Madzhak/Muzhik (Brave), leader
of the Volyniana union, lived in Western Ukraine in the Volyn` region.
After the capture of Madzhak, the kingdom was split into multiple
western parts and an eastern part, where 9 Slavic tribes were united by
the Rus` tribe, which abandoned their city, now known as Old Russa,
because of an epidemic. IMHO, it would be fair to use the rules of the
oldest Slavic union.

> Similarly to Mike's work on
> collation: a common rules were created and all locales include them adding
> their own language specific modifications.

It's a good idea too. In our own locale we prefer that words in our
language be at the top of a sorted list. Currently, in the Ukrainian
locale it works as intended, but the Russian locale has the inverted order.
IMHO, the Russian locale should use the Ukrainian rules.

$ echo 'один два three four'| tr ' ' '\n' | LANG=uk_UA.utf8 sort
два
один
four
three
$ echo 'один два three four'| tr ' ' '\n' | LANG=ru_RU.utf8 sort
four
three
два
один
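
The two orderings shown above can be imitated in Python without any
locale installed. This is only a sketch: it uses plain code point order
and a hypothetical "Cyrillic first" sort key, not the collation tables
that sort(1) actually reads from the locale.

```python
words = ["один", "два", "three", "four"]

def is_cyrillic(ch):
    """True for code points in the basic Cyrillic block U+0400..U+04FF."""
    return "\u0400" <= ch <= "\u04ff"

# Plain code point order: Latin letters sort before Cyrillic ones.
codepoint_order = sorted(words)

# Hypothetical key that ranks Cyrillic-initial words ahead of the rest.
cyrillic_first = sorted(words, key=lambda w: (not is_cyrillic(w[0]), w))

print(codepoint_order)
print(cyrillic_first)
```

The first list has the same order as the ru_RU.utf8 run above and the
second the same order as the uk_UA.utf8 run.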


>
> > [...]
> > COMMIT MESSAGE:
> > [...]
> > I am excluding these locales from this proposed patch. I have written
> > directly to locale maintainer emails listed in the files. Volodymyr
> > Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> > Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>
> I am not sure if we want Cyrillic text in the commit message.  Shouldn't
> it be, uhm, tranlisterated? :-)
>
> "sr_CS" - I guess you meant "sr_RS".
>
> "sr_YU" has been dropped, do we want to mention it?
>
> > [...]
> > [BZ #2872]
> > * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
>
> Please start "Add" with an uppercase.  BTW, shouldn't it be "New file"
> instead?
>
> > System A transliteration System B transcription table from Cyrillic to
> > Latin/ASCII.
> > * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
> > translit section.
>
> Same, "Add" here.
>
> > * localedata/locales/aa_DJ: Likewise.
>
> Good (here and everywhere below).
>
> > [...]
> > diff -uNr a/localedata/locales/translit_cyrillic
> > b/localedata/locales/translit_cyrillic
> > --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
> > +0000
> > +++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
> > +0000
> > @@ -0,0 +1,383 @@
> > +escape_char /
> > +comment_char %
> > +
> > +% This file is part of the GNU C Library and contains locale data.
> > +% The Free Software Foundation does not claim any copyright interest
> > +% in the locale data contained in this file. The foregoing does not
> > +% affect the license of the GNU C Library as a whole. It does not
> > +% exempt you from the conditions of the license if your use would
> > +% otherwise be governed by that license.
> > +
> > +% Transliterations of cyrillic letters to latin and/or ascii symbols.
>
> "cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".
>
> > +% Inspired by ISO 9.1995 / GOST 7.79-2000.
> > +% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
> > +% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
>
> Typos:
>
> "i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
> "U4001" - I guess you meant "U0401"
> "U4F9" -> "U04F9".  I think that "U4F9" is not definitely bad but
> let's be consistent.
>
> Also I can see some gaps in the range.  Are you going to fill them
> or maybe for now just mention that they exist?
>
> > +% It implements the GOST_7.79 System A (Latin Script) as a first
> > +% option and System B Cyrillic (ASCII) as a second option. Check
> > +% https://en.wikipedia.org/wiki/ISO_9 for reference.
> > +% The System B is extended from GOST_7.79-Russian using open sources
> > +% of the transliteration mappings and the "h/`" diacritics logic.
>
> What is "h/`" diacritics logic?
>
> > +
> > +% Usage examples:
> > +% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
> > +% | iconv -f ISO-8859-15 -t UTF-8 # System A
> > +% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
> > +
> > +% Contributions welcome for the rest of Cyrillic script in Unicode
>
> Sure, I'm not going to stop you from pushing these changes just because
> there are missing characters.  I will consider adding them later.
>
> > +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
> > +% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
> > +% Generated from UnicodeData.txt with
> > +% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
>
> 1. Is the file really generated with a script and not modified later?
> If yes then maybe you should contribute the script instead?  In that case,
> you should also not post this file to libc-locale, maintainers and
> developers should be able to regenerate it.
> 2. The link leads to a LibreOffice spreadsheet.
>
> > +LC_CTYPE
> > +
> > +translit_start
> > +
>
> <U0400> is missing here.  Are you going to leave it for now?
>
> > +% CYRILLIC CAPITAL LETTER IO
> > +<U0401> <U00CB>;"<U0059><U004F>"
> > [...]
> > +% CYRILLIC CAPITAL LETTER KJE
> > +<U040C> <U1E30>;"<U004B><U0060>"
>
> <U040D> is missing here.  Can we add it already?
>
> > +% CYRILLIC CAPITAL LETTER SHORT U
> > +<U040E> <U016C>;"<U0055><U0060>"
> > [...]
> > +% CYRILLIC CAPITAL LETTER U
> > +<U0423> <U0055>
> > +% CYRILLIC UNDEFINED
> > +<U0423><U0301> <U00DA>;"<U0055><U0060>"
>
> This still makes me wonder.
>
> Does it work at all?
> What if we remove this rule, won't it be transliterated as
> <U0423> => "U", <U0301> - left unchanged, so "U" + <U0301>"
> will eventually produce "Ú"?
> Why is it called "UNDEFINED"?
> Do we need similar rules for other characters?
>
> > [...]
> > +% CYRILLIC SMALL LETTER U
> > +<U0443> <U0075>
> > +% CYRILLIC UNDEFINED
> > +<U0443><U0301> <U00FA>;"<U0075><U0060>"
>
> Same here.
>
> > [...]
> > +% CYRILLIC SMALL LETTER YA
> > +<U044F> <U00E2>;"<U0079><U0061>"
>
> Again <U0450> missing (because it is lowercase variant of <U0400>).
>
> > +% CYRILLIC SMALL LETTER IO
> > +<U0451> <U00EB>;"<U0079><U006F>"
> > [...]
> > +% CYRILLIC SMALL LETTER KJE
> > +<U045C> <U1E31>;"<U006B><U0060>"
>
> <U045D> missing (same reason as <U040D>).
>
> > +% CYRILLIC SMALL LETTER SHORT U
> > +<U045E> <U016D>;"<U0075><U0060>"
> > +% CYRILLIC SMALL LETTER DZHE
> > +<U045F> "<U0064><U0302>";"<U0064><U0068>"
>
> More letters missing here.  Is this because they are historic so we
> don't want to include them now?  Well, but "YUS" is also historic.
> (Please, do not remove YUS for consistency).
>
> > +% CYRILLIC CAPITAL LETTER BIG YUS
> > +<U046A> <U01CD>;"<U004F><U0060>"
> > +% CYRILLIC SMALL LETTER BIG YUS
> > +<U046B> <U01CE>;"<U006F><U0060>"
> > [...]
>
> I will continue but, again, I don't give any ETA so other reviewers
> are welcome here.
>
> Regards,
>
> Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
  2018-10-11 11:04     ` Rafal Luzynski
  2018-10-11 13:10       ` Marko Myllynen
  2018-10-11 13:50       ` Volodymyr Lisivka
@ 2018-10-11 14:59       ` Egor Kobylkin
  2018-10-11 21:30         ` Egor Kobylkin
  2018-10-11 15:05       ` Egor Kobylkin
  3 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 14:59 UTC (permalink / raw)
  To: Rafal Luzynski, libc-alpha, libc-locales, mfabian, Marko Myllynen
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 10321 bytes --]

Hi Rafal
On 11.10.2018 13:04, Rafal Luzynski wrote:
> Thank you, Egor.  I am looking at your patch and although I have
> not yet finished, here are some remarks:
> 
> First of all, I think that such a large patch should also include
> the tests.  Please see how automatic tests are performed in locale
> data and write your own.
Could you please point me to the existing automatic tests?
Locally I am using the test suggested in glibc locales wiki.
From my commit message:
"The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
"
I am visually checking whether any iconv run fails for all those locales,
but you must be referring to some automated unit test with a boolean outcome,
right?
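
A check with a boolean outcome could look like the Python sketch below.
It is not one of the existing glibc tests; translit_ok is a hypothetical
helper that treats any newly introduced "?" or any non-ASCII character
in the output as a failure. The sample strings are a shortened form of
the test line from the commit message.

```python
def translit_ok(source, output):
    """True if `output` is pure ASCII and has no '?' that was not
    already present in `source`."""
    is_ascii = all(ord(c) < 128 for c in output)
    no_new_question_marks = output.count("?") <= source.count("?")
    return is_ascii and no_new_question_marks

source = "CYRILLIC \u0434\u0430"   # "CYRILLIC да"
before_patch = "CYRILLIC ??"        # what an untransliterated run looks like
after_patch = "CYRILLIC da"

print(translit_ok(source, before_patch))
print(translit_ok(source, after_patch))
```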

> 
> 11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
>> [...]
>> From this patch I have excluded locales that already mention cyrillic or
>> have a transliteration table for it:
>> az_AZ
>> iso14651_t1_common
>> ky_KG
>> mn_MN
>> sr_RS
>> tg_TJ
>> tk_TM
>> tt_RU
>> uk_UA
>> uz_UZ
>> uz_UZ@cyrillic
>> [...]
> 
> I think that eventually we would like to include your translit_cyrillic
> also in these locales because I assume that your rules should work good
> for them as well, also should include more characters than the individual
> language contributors took into account.  Similarly to Mike's work on
> collation: a common rules were created and all locales include them adding
> their own language specific modifications.

This is fine with me. Should anybody supply translit_xxxxxxxxxxxx for
any of the mentioned locales we can include them as well. Wouldn't it be
easier to coordinate those as separate patches though?

> 
>> [...]
>> COMMIT MESSAGE:
>> [...]
>> I am excluding these locales from this proposed patch. I have written
>> directly to locale maintainer emails listed in the files. Volodymyr
>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
> 
> I am not sure if we want Cyrillic text in the commit message.  Shouldn't
> it be, uhm, tranlisterated? :-)

I do not see any Cyrillic text in the commit message.
The ?????? you see are the actual "?" symbols coming out of iconv now.

> 
> "sr_CS" - I guess you meant "sr_RS".
> 
> "sr_YU" has been dropped, do we want to mention it?

The list of locales and the patch itself is generated from the actual
locales - I do not hand pick them, only exclude the ones in the
exclusion list above.

> 
>> [...]
>> [BZ #2872]
>> * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
> 
> Please start "Add" with an uppercase.  BTW, shouldn't it be "New file"
> instead?
> 
>> System A transliteration System B transcription table from Cyrillic to
>> Latin/ASCII.
>> * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
>> translit section.
> 
> Same, "Add" here.
> 
>> * localedata/locales/aa_DJ: Likewise.
> 
> Good (here and everywhere below).
> 
>> [...]
>> diff -uNr a/localedata/locales/translit_cyrillic
>> b/localedata/locales/translit_cyrillic
>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
>> +0000
>> +++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
>> +0000
>> @@ -0,0 +1,383 @@
>> +escape_char /
>> +comment_char %
>> +
>> +% This file is part of the GNU C Library and contains locale data.
>> +% The Free Software Foundation does not claim any copyright interest
>> +% in the locale data contained in this file. The foregoing does not
>> +% affect the license of the GNU C Library as a whole. It does not
>> +% exempt you from the conditions of the license if your use would
>> +% otherwise be governed by that license.
>> +
>> +% Transliterations of cyrillic letters to latin and/or ascii symbols.
> 
> "cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".
> 
>> +% Inspired by ISO 9.1995 / GOST 7.79-2000.
>> +% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
>> +% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
> 
> Typos:
> 
> "i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
> "U4001" - I guess you meant "U0401"
> "U4F9" -> "U04F9".  I think that "U4F9" is not definitely bad but
> let's be consistent.

These are all good catches. I will fix them and resubmit.

> 
> Also I can see some gaps in the range.  Are you going to fill them
> or maybe for now just mention that they exist?
> 
No, we are not going to fill them; please see this:
On 10.10.2018 14:34, Marko Myllynen wrote:
> On 2018-10-10 15:19, Egor Kobylkin wrote:
>> On 10.10.2018 13:22, Marko Myllynen wrote:
>>>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>>> Although I haven't checked every rule this in general looks very good
>>> (but see below).
>>> Not sure do we want to add the few missing characters
>>> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
>>> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
>>> least initially the more exotic characters, like the historic ones,
>>> though.) Perhaps filing a bug or two for these cases for separate
>>> consideration would be ok.
>> The question here is what should serve as their transliteration and
>> transcription?
> Not sure, so filing a separate bug about this once your patch is merged
> might be the most suitable action for now, I don't think we want to
> postpone merging your work further due to these non-ISO 9 cases.
>


>> +% It implements the GOST_7.79 System A (Latin Script) as a first
>> +% option and System B Cyrillic (ASCII) as a second option. Check
>> +% https://en.wikipedia.org/wiki/ISO_9 for reference.
>> +% The System B is extended from GOST_7.79-Russian using open sources
>> +% of the transliteration mappings and the "h/`" diacritics logic.
> 
> What is "h/`" diacritics logic?
Basically, some linguist mentioned that they had chosen "h" and "`" to
represent the diacritics for the transcription (i.e. GOST 7.79 System
B). This way there is some resemblance to the watertight transliteration
as per ISO 9 (System A) but it is still all in ASCII. We have decided
to extend GOST 7.79 to all the ISO 9 characters, and so I have extended
it following that linguist's logic.
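
To see the "h" and "`" markers in the second (System B) alternative, the
rule lines quoted in this thread can be decoded with a small Python
sketch. decode and parse_rule are hypothetical helper names, and the
parser only handles the simple "<Uxxxx> alt;alt" shape seen in the
excerpts.

```python
import re

def decode(token):
    """Turn '<U0064><U0068>' (optionally double-quoted) into 'dh'."""
    return "".join(chr(int(h, 16))
                   for h in re.findall(r"<U([0-9A-Fa-f]{4,8})>", token))

def parse_rule(line):
    """Split a rule line into (source, [alternatives in order])."""
    source, alternatives = line.split(None, 1)
    return decode(source), [decode(alt) for alt in alternatives.split(";")]

# Rule lines copied from the patch excerpts quoted in this thread.
rules = [
    '<U040C> <U1E30>;"<U004B><U0060>"',
    '<U040E> <U016C>;"<U0055><U0060>"',
    '<U045F> "<U0064><U0302>";"<U0064><U0068>"',
]

for line in rules:
    src, (system_a, system_b) = parse_rule(line)
    print(f"U+{ord(src):04X}  System A: {system_a}  System B: {system_b}")
```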


>> +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
>> +% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
>> +% Generated from UnicodeData.txt with
>> +% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
> 
> 1. Is the file really generated with a script and not modified later?
> If yes then maybe you should contribute the script instead?  In that case,
> you should also not post this file to libc-locale, maintainers and
> developers should be able to regenerate it.
> 2. The link leads to a LibreOffice spreadsheet.
No, I do not have a script. "Generated" means it is the result of
formulas in that spreadsheet. People are welcome to write a script, which
should be a straightforward implementation of the rules in those formulas.
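
Such a script might start from something like the Python sketch below.
It is not the spreadsheet's logic, only an assumed equivalent: given a
Cyrillic character and its alternatives it emits a comment line with the
Unicode name followed by the rule line. utoken and rule are hypothetical
helper names.

```python
import unicodedata

def utoken(text):
    """Format text as <Uxxxx> symbols; quote it when it is longer than one."""
    body = "".join(f"<U{ord(c):04X}>" for c in text)
    return body if len(text) == 1 else f'"{body}"'

def rule(source, *alternatives):
    """Return the comment line and rule line for one source character."""
    name = unicodedata.name(source)
    targets = ";".join(utoken(a) for a in alternatives)
    return f"% {name}\n{utoken(source)} {targets}"

# Two entries whose expected form appears in the patch excerpts above.
print(rule("\u0401", "\u00cb", "YO"))
print(rule("\u0423", "U"))
```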

> 
>> +LC_CTYPE
>> +
>> +translit_start
>> +
> 
> <U0400> is missing here.  Are you going to leave it for now?
Yes, it is to be left out, not in ISO 9. See the exchange with Marko above.

> 
>> +% CYRILLIC CAPITAL LETTER IO
>> +<U0401> <U00CB>;"<U0059><U004F>"
>> [...]
>> +% CYRILLIC CAPITAL LETTER KJE
>> +<U040C> <U1E30>;"<U004B><U0060>"
> 
> <U040D> is missing here.  Can we add it already?
Yes, it is to be left out, not in ISO 9. See the exchange with Marko above.

> 
>> +% CYRILLIC CAPITAL LETTER SHORT U
>> +<U040E> <U016C>;"<U0055><U0060>"
>> [...]
>> +% CYRILLIC CAPITAL LETTER U
>> +<U0423> <U0055>
>> +% CYRILLIC UNDEFINED
>> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
> 
> This still makes me wonder.
> 
> Does it work at all?
> What if we remove this rule, won't it be transliterated as
> <U0423> => "U", <U0301> - left unchanged, so "U" + <U0301>"
> will eventually produce "Ú"?
> Why is it called "UNDEFINED"?
On 10.10.2018 14:34, Marko Myllynen wrote:
> On 2018-10-10 15:19, Egor Kobylkin wrote:
>> On 10.10.2018 13:22, Marko Myllynen wrote:
...
>>> I'm not sure this will work, no existing rule in translit_* files
>>> contain two characters, I'd assume that the rule for U+0423 is applied
>>> first and then the below rule is never used.
>>>
>>> % CYRILLIC UNDEFINED
>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>
>>> Perhaps this should be commented out or removed altogether if it's not
>>> working as intended.
>>
>> So yes, they are not processed. I would drop them to not to have special
>> cases. But I am also fine with keeping them because all work is done
>> already.
> I'd probably drop them but I don't feel strongly about this either way.
>
> Thanks for your efforts, I don't have any further comments, I'll leave
> this now for Rafal and Mike to provide additional feedback and hopefully
> merge soon.

Could you also please check the discussion with Marko on UNDEFINED and
other related topics? You were on To: or CC: for those emails.
The same for the other characters below.

> Do we need similar rules for other characters?
> 
>> [...]
>> +% CYRILLIC SMALL LETTER U
>> +<U0443> <U0075>
>> +% CYRILLIC UNDEFINED
>> +<U0443><U0301> <U00FA>;"<U0075><U0060>"
> 
> Same here.
> 
>> [...]
>> +% CYRILLIC SMALL LETTER YA
>> +<U044F> <U00E2>;"<U0079><U0061>"
> 
> Again <U0450> missing (because it is lowercase variant of <U0400>).
> 
>> +% CYRILLIC SMALL LETTER IO
>> +<U0451> <U00EB>;"<U0079><U006F>"
>> [...]
>> +% CYRILLIC SMALL LETTER KJE
>> +<U045C> <U1E31>;"<U006B><U0060>"
> 
> <U045D> missing (same reason as <U040D>).
> 
>> +% CYRILLIC SMALL LETTER SHORT U
>> +<U045E> <U016D>;"<U0075><U0060>"
>> +% CYRILLIC SMALL LETTER DZHE
>> +<U045F> "<U0064><U0302>";"<U0064><U0068>"
> 
> More letters missing here.  Is this because they are historic so we
> don't want to include them now?  Well, but "YUS" is also historic.
> (Please, do not remove YUS for consistency).
> 
>> +% CYRILLIC CAPITAL LETTER BIG YUS
>> +<U046A> <U01CD>;"<U004F><U0060>"
>> +% CYRILLIC SMALL LETTER BIG YUS
>> +<U046B> <U01CE>;"<U006F><U0060>"
>> [...]
> 
> I will continue but, again, I don't give any ETA so other reviewers
> are welcome here.
> 
> Regards,
> 
> Rafal
> 

Bests,
Egor

[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 7307 bytes --]

From: Marko Myllynen <myllynen@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, Rafal Luzynski <digitalfreak@lingonborough.com>
Cc: Keld Simonsen <keld@keldix.com>, libc-alpha@sourceware.org, libc-locales@sourceware.org, "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>, Carlos O'Donell <carlos@redhat.com>, Max Kutny <mkutny@gmail.com>, danilo@gnome.org
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
Date: Wed, 10 Oct 2018 15:34:26 +0300
Message-ID: <286bc20c-db97-5244-8c26-a3a95e989361@redhat.com>

Hi,

On 2018-10-10 15:19, Egor Kobylkin wrote:
> On 10.10.2018 13:22, Marko Myllynen wrote:
>>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>>
>> Although I haven't checked every rule this in general looks very good
>> (but see below). 
> 
>> Not sure whether we want to add the few missing characters
>> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
>> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
>> least initially the more exotic characters, like the historic ones,
>> though.) Perhaps filing a bug or two for these cases for separate
>> consideration would be ok.
> 
> The question here is what should serve as their transliteration and
> transcription?

Not sure, so filing a separate bug about this once your patch is merged
might be the most suitable action for now, I don't think we want to
postpone merging your work further due to these non-ISO 9 cases.

>> I'm not sure this will work: no existing rule in the translit_* files
>> contains two characters, so I'd assume that the rule for U+0423 is applied
>> first and the rule below is never used.
>>
>> % CYRILLIC UNDEFINED
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>
>> Perhaps this should be commented out or removed altogether if it's not
>> working as intended.
> 
> So yes, they are not processed. I would drop them so as not to have special
> cases. But I am also fine with keeping them because all the work is done
> already.

I'd probably drop them but I don't feel strongly about this either way.
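
As a rough illustration of the concern above, here is a toy model in Python. It assumes that the table is consulted one code point at a time; this is an assumption made for the example, not a description of glibc's internal lookup code. The rule set and the function name are hypothetical. Under that assumption a rule keyed on a two-code-point sequence can never match:

```python
# Toy model: the table is consulted one code point at a time. This is an
# assumption for illustration only, not glibc's actual implementation.
RULES = {
    "\u0423": "U",           # CYRILLIC CAPITAL LETTER U
    "\u0423\u0301": "U`",    # two-code-point key, like the disputed rule
}

def translit_per_codepoint(text):
    """Replace each code point via RULES; unknown code points become '?'."""
    out = []
    for ch in text:          # every lookup key is exactly one code point
        out.append(RULES.get(ch, "?"))
    return "".join(out)

# The single-character rule consumes U+0423 first; the combining accent is
# then looked up on its own, so the two-code-point rule is never reached.
print(translit_per_codepoint("\u0423\u0301"))  # prints "U?"
```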

Thanks for your efforts, I don't have any further comments, I'll leave
this now for Rafal and Mike to provide additional feedback and hopefully
merge soon.

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
  2018-10-11 11:04     ` Rafal Luzynski
                         ` (2 preceding siblings ...)
  2018-10-11 14:59       ` Egor Kobylkin
@ 2018-10-11 15:05       ` Egor Kobylkin
  3 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 15:05 UTC (permalink / raw)
  To: Rafal Luzynski, libc-alpha, libc-locales, mfabian, Marko Myllynen
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

On 11.10.2018 13:04, Rafal Luzynski wrote:
> Thank you, Egor.  I am looking at your patch and although I have
> not yet finished, here are some remarks:
...
>> [...]
>> [BZ #2872]
>> * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
> 
> Please start "Add" with an uppercase.  BTW, shouldn't it be "New file"
> instead?

"New file or Add" -  I don't know. You tell me.
> 
>> System A transliteration System B transcription table from Cyrillic to
>> Latin/ASCII.
>> * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
>> translit section.
> 
> Same, "Add" here.
> 

Same, please advise.
Bests,
Egor


^ permalink raw reply	[flat|nested] 111+ messages in thread

* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (2 preceding siblings ...)
  2018-10-10 22:29   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
@ 2018-10-11 15:44   ` Egor Kobylkin
  2018-10-11 21:33   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4 Egor Kobylkin
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 15:44 UTC (permalink / raw)
  To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 66184 bytes --]

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]

to localedata/locales/ and include it in all your locales going forward.

Patch included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and what it does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


Root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore, the table has to be included in the active
locale when that locale is compiled, so that iconv can use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.
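
The two outputs described above follow from each rule listing several candidates in order of preference. Below is a minimal Python sketch of that fallback, assuming the first candidate that the target charset can represent is the one used. The three rules are taken from the table in this thread; the function names and the '?' fallback are illustrative, and this is not iconv's real code.

```python
# Each source character maps to an ordered list of candidates, mirroring
# table rules such as:  <U044F> <U00E2>;"<U0079><U0061>"
RULES = {
    "\u044f": ["\u00e2", "ya"],   # CYRILLIC SMALL LETTER YA
    "\u0451": ["\u00eb", "yo"],   # CYRILLIC SMALL LETTER IO
    "\u0443": ["u"],              # CYRILLIC SMALL LETTER U
}

def encodable(s, encoding):
    """True if every character of s exists in the target encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

def translit(text, encoding):
    """Pick, per character, the first candidate the target can represent."""
    out = []
    for ch in text:
        if encodable(ch, encoding):
            out.append(ch)        # already representable, keep as is
            continue
        for candidate in RULES.get(ch, []):
            if encodable(candidate, encoding):
                out.append(candidate)
                break
        else:
            out.append("?")       # no rule or no representable candidate
    return "".join(out)

print(translit("\u044f", "ascii"))        # prints "ya" (System B style)
print(translit("\u044f", "iso8859_15"))   # prints "â"  (System A style)
```

With an ASCII target the accented first candidate cannot be encoded, so the ASCII digraph is chosen; with ISO-8859-15 the first candidate already fits.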

While a UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters but can
still be read by a native speaker. Among other things, it is useful for
processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, lack the
corresponding fonts, or cannot handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (the mapping) is based on the ISO 9.1995 standard [10] and
on the official source of its derivative GOST 7.79-2000 (Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2]). In
practice, an independent but mostly identical source [3] was used, and the
table was prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
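
The selection and insertion described above can be sketched roughly as follows. This is a hypothetical Python helper, not the script actually used to generate the patch, which is not shown in this message. It assumes that the include goes directly before translit_end and that any file already mentioning "cyrillic" is skipped.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'

def add_cyrillic_include(locale_source):
    """Return the locale source with the include added, or unchanged."""
    lines = locale_source.splitlines()
    stripped = [line.strip() for line in lines]
    if "translit_start" not in stripped or "translit_end" not in stripped:
        return locale_source      # no translit stanza: leave the file alone
    if "cyrillic" in locale_source.lower():
        return locale_source      # already mentions Cyrillic: excluded
    # Insert directly before the line that closes the stanza.
    lines.insert(stripped.index("translit_end"), INCLUDE_LINE)
    return "\n".join(lines) + "\n"

example = (
    'copy "i18n"\n'
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(example))
```

Running the helper a second time on its own output changes nothing, because the result then mentions "cyrillic" and falls under the exclusion rule.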

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to the maintainer email addresses listed in those files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA), and
Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (only
available as low-quality GIF images)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
	System A transliteration and System B transcription table from
	Cyrillic to Latin/ASCII.
	* localedata/locales/C: Add include "translit_cyrillic";"" to the
	LC_CTYPE translit section.
	* localedata/locales/aa_DJ: Likewise.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/sd_PK: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C	2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""


+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>

 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@

 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script with diacritics) as
+% the first option and System B (plain ASCII) as the second option; see
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 Russian using openly available
+% transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE


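For reviewers who want to sanity-check individual table entries without rebuilding the locales, here is a minimal sketch in Python. It is not part of the patch and is not how glibc implements anything. It assumes that the semicolon-separated alternatives of an entry are tried left to right and that the first one representable in the target charset is used. The helper names (decode, parse_entry, first_ascii) are hypothetical, and only the <Uxxxx> notation is handled, not literal characters inside quotes.

```python
import re

# Matches one <Uxxxx> code point token as written in glibc locale sources.
UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0054><U0043>' (optionally double-quoted) into a Python string."""
    return "".join(chr(int(h, 16)) for h in UCS.findall(token))


def parse_entry(line):
    """Parse '<U04AE> <U00D9>;<U0055>' into (source, [candidates])."""
    line = line.lstrip("+").strip()  # tolerate the leading '+' of a diff line
    if not line.startswith("<U"):
        return None  # comment, blank line, or a directive such as 'include'
    source, _, rest = line.partition(" ")
    return decode(source), [decode(c) for c in rest.strip().split(";")]


def first_ascii(candidates):
    """Return the first candidate that is pure ASCII, or None if there is none.

    Assumption: this mirrors the left-to-right fallback order of the table.
    """
    for cand in candidates:
        if cand.isascii():
            return cand
    return None


if __name__ == "__main__":
    # Two entries copied from the translit_cyrillic hunk above.
    for raw in (
        '+<U04AE> <U00D9>;<U0055>',
        '+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"',
    ):
        source, candidates = parse_entry(raw)
        print(f"U+{ord(source):04X} -> {first_ascii(candidates)!r}")
```

Feeding every '+<U' line of translit_cyrillic through parse_entry and checking that first_ascii never returns None is a quick way to confirm that each entry has an ASCII fallback.
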
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56414 bytes --]

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C	2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option.  See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond the GOST 7.79 Russian table using openly
+% available transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with 
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
  2018-10-11 14:59       ` Egor Kobylkin
@ 2018-10-11 21:30         ` Egor Kobylkin
  0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 21:30 UTC (permalink / raw)
  To: Rafal Luzynski, libc-alpha, libc-locales
  Cc: mfabian, Marko Myllynen, Dmitry V. Levin, Volodymyr Lisivka,
	Max Kutny, danilo

On 11.10.2018 16:59, Egor Kobylkin wrote:
> On 11.10.2018 13:04, Rafal Luzynski wrote:
>>> COMMIT MESSAGE:
>>> [...]
>>> I am excluding these locales from this proposed patch. I have written
>>> directly to locale maintainer emails listed in the files. Volodymyr
>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>>
>> I am not sure if we want Cyrillic text in the commit message.  Shouldn't
>> it be, uhm, transliterated? :-)
> 
> I do not see any Cyrillic text in the commit message.
> the ?????? you see are the actual "?" symbols coming out of iconv now.
> 
>>
>> "sr_CS" - I guess you meant "sr_RS".
>>
>> "sr_YU" has been dropped, do we want to mention it?
> 
> The list of locales and the patch itself are generated from the actual
> locales - I do not hand-pick them, only exclude the ones in the
> exclusion list above.

Ah, yes, that message above should read sr_RS. Will fix.
There is no sr_YU anymore indeed, so I will drop it. No changes to the
patch, just the commit message.

Bests,
Egor

^ permalink raw reply	[flat|nested] 111+ messages in thread

* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (3 preceding siblings ...)
  2018-10-11 15:44   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3 Egor Kobylkin
@ 2018-10-11 21:33   ` Egor Kobylkin
  2018-10-12 14:05   ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 21:33 UTC (permalink / raw)
  To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen
  Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 66178 bytes --]

Dear locale maintainers,

Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails",

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]

to localedata/locales/ and including it in all your locales going forward.

Patch included inline below.

I have excluded from this patch the locales that already mention
Cyrillic or have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
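
A minimal shell sketch of the check above, for anyone who wants to
repeat it across several locales. It assumes the locale is installed and
that translit-test-input.txt [9] is in the current directory.
count_fallbacks and check_locale are hypothetical helper names, not part
of glibc. It counts every "?", so a literal question mark in the input
would be counted as well.

```shell
#!/bin/sh
# count_fallbacks STRING: print how many '?' characters STRING contains.
# iconv //TRANSLIT emits '?' for characters it cannot transliterate.
count_fallbacks() {
    printf '%s' "$1" | tr -cd '?' | wc -c | tr -d ' '
}

# check_locale NAME: run the wiki test for one locale (e.g. ru_RU) and
# report the number of '?' characters on the CYRILLIC line.
# Assumes NAME.UTF-8 is installed and the test input file is present.
check_locale() {
    out=$(LC_ALL="$1.UTF-8" iconv -f UTF-8 -t ASCII//TRANSLIT \
              < translit-test-input.txt | grep CYRILLIC)
    printf '%s: %s unmapped characters\n' "$1" "$(count_fallbacks "$out")"
}
```

Usage would be along the lines of "check_locale ru_RU"; if the expected
output shown above is produced, the count should come out as zero.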


Root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. The table must also be included in the active locale
when that locale is compiled, so that iconv can use it.
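
As a rough illustration of that inclusion step, the sketch below prints
a locale source file with the include line placed immediately before the
first translit_end, which is where the hunks in this patch put it.
add_include is a hypothetical helper name, and this is not necessarily
how the patch itself was generated.

```shell
#!/bin/sh
# add_include FILE: print FILE with the translit_cyrillic include line
# inserted before the first line that starts with "translit_end".
# FILE itself is not modified.
add_include() {
    awk '
        /^translit_end/ && !inserted {
            print "include \"translit_cyrillic\";\"\""
            inserted = 1
        }
        { print }
    ' "$1"
}
```

The modified locale source still has to be recompiled (for example with
localedef) before iconv picks up the table.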



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in a Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9:1995.

UTF-8 encoded Cyrillic text requires Cyrillic fonts to display, whereas
the result of a transliteration/transcription contains only Latin/ASCII
characters yet can still be read by a native speaker. Among other
things, this is useful for processing Cyrillic texts and filenames with
programs or on systems that are not prepared to work with Cyrillic, lack
the corresponding fonts, or cannot handle UTF-8.

The transliteration table itself is attached as the file
translit_cyrillic [7]. Its mapping is based on the ISO 9:1995 standard
[10] and its derivative GOST 7.79-2000, whose official source is the
Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]. In practice, an independent but mostly identical source
[3] was used, and the table was prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
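
The script used for that search is not part of this thread. A minimal
sketch of one way to do it, assuming "already mentions Cyrillic" is
approximated by a case-insensitive match on the word; list_candidates is
a hypothetical helper name:

```shell
#!/bin/sh
# list_candidates DIR: print the files in DIR that contain a
# translit_start line and do not mention "cyrillic" (any case).
# This is an approximation of the selection rule described above.
list_candidates() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        grep -q '^translit_start' "$f" || continue
        grep -qi 'cyrillic' "$f" && continue
        printf '%s\n' "$f"
    done
}
```

Run as "list_candidates localedata/locales" from the top of a glibc
source tree; the resulting list would still need a manual look before a
patch is generated from it.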

Cyrillic transliteration of e.g. Russian text may already have worked to
some extent for the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales, which
have their own transliteration tables included inline.

I am excluding these locales from this proposed patch and have written
directly to the maintainer email addresses listed in the files.
Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com>
(uk_UA), and Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9:1995, GOST 7.79
	System A transliteration and System B transcription table from
	Cyrillic to Latin/ASCII.
	* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
	translit section.
	* localedata/locales/aa_DJ: Likewise.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/sd_PK: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C	2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""


+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>

 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@

 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliteration of Cyrillic letters to Latin and/or ASCII characters.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option.  See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 Russian using openly available
+% transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
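
Note on the table above: each entry lists its alternatives in order, and
the intent is that the first one representable in the target charset is
used.  The following is only a rough Python sketch of that selection
logic, not the glibc implementation.  TABLE is a hypothetical five-entry
excerpt, and the names encodable() and translit() are made up for
illustration.

```python
# Hypothetical excerpt of the table: alternatives are listed in order,
# System A (Latin with diacritics) first, System B (ASCII) second.
TABLE = {
    "\u0421": ["S"],             # CYRILLIC CAPITAL LETTER ES
    "\u044a": ["\u02ba", "``"],  # CYRILLIC SMALL LETTER HARD SIGN
    "\u0435": ["e"],             # CYRILLIC SMALL LETTER IE
    "\u0448": ["\u0161", "sh"],  # CYRILLIC SMALL LETTER SHA
    "\u044c": ["\u02b9", "`"],   # CYRILLIC SMALL LETTER SOFT SIGN
}


def encodable(s, charset):
    """Return True if every character of s exists in the target charset."""
    try:
        s.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Pick, per character, the first alternative the charset can hold."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)          # already representable, keep as is
            continue
        for alt in TABLE.get(ch, []):
            if encodable(alt, charset):
                out.append(alt)     # first alternative that fits wins
                break
        else:
            out.append("?")         # no usable alternative
    return "".join(out)


print(translit("\u0421\u044a\u0435\u0448\u044c", "ascii"))  # S``esh`
```

With an ISO-8859-15 target the same sketch keeps the System A letter for
SHA, because that charset contains it, and falls back to System B only
for the hard and soft signs.
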
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

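For readers who want to check individual table entries by hand, here is a
minimal Python sketch of how a translit line such as
<U04F1> <U00FC>;"<U0075><U0060>" can be read. It assumes that the
alternatives after the source character are tried from left to right and
that the first one representable in the target charset wins. The helper
names (decode, parse_table, to_ascii) are hypothetical and not part of
glibc, and the sketch handles neither "include" lines, nor literal
characters inside quotes, nor re-transliteration of an intermediate result.
It only illustrates the fallback order for ASCII output.

```python
import re

# Sample entries copied from the translit_cyrillic hunk above.
SAMPLE = '''\
% CYRILLIC SMALL LETTER U WITH DIAERESIS
<U04F1> <U00FC>;"<U0075><U0060>"
% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
% RIGHT SINGLE QUOTATION MARK
<U2019> <U2035>;<U0027>
'''

UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0075><U0060>' (quoted or not) into the string 'u`'."""
    return "".join(chr(int(h, 16)) for h in UCS.findall(token))


def parse_table(text):
    """Map each source character to its ordered list of alternatives."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        source, alternatives = line.split(None, 1)
        table[decode(source)] = [decode(a) for a in alternatives.split(";")]
    return table


def to_ascii(text, table):
    """Assumed rule: take the first pure-ASCII alternative, else '?'."""
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)
            continue
        out.append(next((a for a in table.get(ch, []) if a.isascii()), "?"))
    return "".join(out)


TABLE = parse_table(SAMPLE)
print(to_ascii("\u04f1\u04f4\u2019", TABLE))
```

In each sample entry the first alternative is non-ASCII (U+00FC, C plus
combining diaeresis, U+2035), so this sketch falls through to the second one.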
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56415 bytes --]

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C	2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with 
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U WITH COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U WITH COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 



* [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (4 preceding siblings ...)
  2018-10-11 21:33   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4 Egor Kobylkin
@ 2018-10-12 14:05   ` Egor Kobylkin
  2018-10-13  0:59     ` Rafal Luzynski
  2018-10-17 14:16   ` [PATCH v6] " Egor Kobylkin
                     ` (7 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-12 14:05 UTC (permalink / raw)
  To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen,
	Dmitry V. Levin
  Cc: Volodymyr Lisivka, Max Kutny, danilo


Dear locale maintainers,

please fix glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]

to localedata/locales/ and including it in all your locales going forward.

The patch is included inline below.

I have excluded from this patch the locales that already mention Cyrillic
or have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to decide explicitly whether and how to
include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
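
A quick way to tell the failing output from the fixed one is to count the
"?" placeholders that iconv leaves behind. The following is only a sketch;
the function name count_unmapped is hypothetical, and it assumes the test
input itself contains no literal question marks.

```shell
# count_unmapped: read iconv output on stdin and print the number of "?"
# characters in it. A result of 0 suggests that every character was
# transliterated (assuming the input had no literal "?" of its own).
count_unmapped() {
    # Keep only the "?" bytes, then count them.
    n=$(tr -cd '?' | wc -c)
    # Arithmetic expansion strips the padding some wc versions add.
    printf '%s\n' "$((n))"
}
```

It could be appended to the pipeline above, for example
`... | grep CYRILLIC | count_unmapped`, and run once per locale.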


The root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. In addition, the table has to be included in the active
locale when that locale is compiled; otherwise iconv will not use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in a Cyrillic alphabet to ASCII via //TRANSLIT.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an
ASCII-compatible transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce a Latin transliteration as per
ISO 9:1995.
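
The two results differ because each table entry lists its alternatives
separated by ";", and they are tried from left to right until one fits the
target charset. A minimal sketch of that selection for an ASCII target
follows; it is a model of the idea, not glibc's implementation, and the
function name first_ascii is hypothetical.

```shell
# first_ascii CANDIDATE...: print the first argument made up of ASCII bytes
# only, modelling how the alternatives of one table entry are tried in
# order when the target charset is ASCII. Returns 1 if no candidate fits.
first_ascii() {
    for cand in "$@"; do
        # Delete every ASCII byte; whatever remains is non-ASCII.
        rest=$(printf '%s' "$cand" | LC_ALL=C tr -d '\001-\177')
        if [ -z "$rest" ]; then
            printf '%s\n' "$cand"
            return 0
        fi
    done
    return 1
}
```

For the entry `<U0416> <U017D>;"<U005A><U0048>"` (CYRILLIC CAPITAL LETTER
ZHE), `first_ascii 'Ž' 'ZH'` skips the System A letter and selects `ZH`.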

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
the transliteration/transcription contains only Latin/ASCII characters
and can still be read by a native speaker. Among other things, this is
useful for processing Cyrillic texts and filenames with programs or on
systems that are not specifically prepared to work with Cyrillic, do not
have the corresponding fonts installed, or cannot handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its mapping is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000 (official source: Federal Agency on Technical
Regulating and Metrology of the Russian Federation [2]). In practice, an
independent but mostly identical source [3] was used, and the table was
prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
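
That search can be approximated with a short shell function. This is only
a sketch, not the script actually used to generate the patch; the name
list_missing and the line-anchored patterns are assumptions made for
illustration.

```shell
# list_missing DIR: print the names of locale source files in DIR that have
# a translit_start section but no include line for translit_cyrillic yet.
list_missing() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Skip files without a transliteration section.
        grep -q '^translit_start' "$f" || continue
        # Skip files that already include the Cyrillic table.
        grep -q '^include[[:space:]]*"translit_cyrillic"' "$f" && continue
        printf '%s\n' "${f##*/}"
    done
}
```

It would be run as `list_missing localedata/locales`. Locales that carry
their own inline Cyrillic tables would still be listed and would have to
be excluded by hand.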

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to the maintainer email addresses listed in the locale files.
Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com>
(uk_UA) and Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (only
available as a low-quality GIF)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9:1995, GOST 7.79
	System A transliteration and System B transcription table from
	Cyrillic to Latin/ASCII.
	* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
	translit section.
	* localedata/locales/aa_DJ: Likewise.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/sd_PK: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C	2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""


+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@

 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>

 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@

 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 for Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""

+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@

 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end

 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@

 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE

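The two-option entries in translit_cyrillic above (System A first, System B
second) can be illustrated with a small Python sketch. This is only a model
of the intended fallback order, assuming the converter takes the first
candidate that is representable in the target charset; it is not glibc's
implementation. The names TABLE, encodable and translit are hypothetical,
and the table holds just four entries copied from the patch.

```python
# Four entries copied from translit_cyrillic; each source character maps to
# an ordered list of candidates (System A first, System B second).
TABLE = {
    "\u0435": ["e"],              # CYRILLIC SMALL LETTER IE
    "\u0436": ["\u017e", "zh"],   # CYRILLIC SMALL LETTER ZHE
    "\u0449": ["\u015d", "shh"],  # CYRILLIC SMALL LETTER SHCHA
    "\u0451": ["\u00eb", "yo"],   # CYRILLIC SMALL LETTER IO
}


def encodable(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Assumed fallback model: keep characters the charset can represent,
    otherwise take the first table candidate that fits, otherwise '?'."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)
            continue
        for candidate in TABLE.get(ch, []):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            # No candidate fits (or the character is not in the table).
            out.append("?")
    return "".join(out)


if __name__ == "__main__":
    word = "\u0435\u0449\u0451"          # the word "eshhyo" in Cyrillic
    print(translit(word, "ascii"))       # System B only
    print(translit(word, "iso8859_15"))  # System A where the charset allows
```

With an ASCII target every letter falls through to its System B form; with
ISO-8859-15 the System A letter is used only where that charset contains it,
which is why the table lists both options per letter.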

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56416 bytes --]

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C	2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option.  See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with 
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0041><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 




* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-10-12 14:05   ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-10-13  0:59     ` Rafal Luzynski
  2018-10-13 16:58       ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-13  0:59 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Marko Myllynen,
	Dmitry V. Levin
  Cc: Volodymyr Lisivka, Max Kutny, danilo

Egor,

Thank you for the update.  This time I took a closer look at your patch, so
my review is more thorough than before, although it is still not complete.

As far as I understand, ISO-9 and its GOST variants are meant to be
universal rather than Russian-specific.  Therefore it is correct to place
them in a separate file, like translit_cyrillic, and then include this
file in other locales, adding locale-specific modifications if required.
For example, if there are any Russian-specific rules not included in this
file, they should go to ru_RU.

The text of the ISO 9 standard is not publicly available.  Do we have
any better reference than a Wikipedia article?

Regarding the format of your commit message, I hesitate to say anything
more because there are more experienced maintainers around here.  Please
take a look at the Contribution Checklist. [1]

While we are at it, what is your legal relationship with the glibc project?
Have you signed the FSF Copyright Assignment?  It is not necessary for the
locale data but it might be necessary if you are going to contribute test code.

Regarding the tests, I think there is no complete transliteration test
suite at the moment.  Probably the only test is localedata/bug-iconv-trans.c.
You can also see the collation tests placed in the same directory, they
use those multiple *.UTF-8.in files.

You can skip the tests for now.

Technical issue:  Please either attach your patch to the email message or
paste it inline, not both.  The patch as it is now is not applicable.
I had to edit it manually to apply.


12.10.2018 16:05 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic

I confirm that these locales are excluded and there are no other missing
locales.

> [...]
>
> diff -uNr a/localedata/locales/C b/localedata/locales/C
> --- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
> +++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000

There is no such file.  Where have you got the source code from?  Are you
sure this is glibc? :-)

> [...]
> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
> --- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
> +++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
> @@ -1394,6 +1394,7 @@
> <U137A> <U0060><U0039><U0030>
> <U137B> <U0060><U0031><U0030><U0030>
> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
> +include "translit_cyrillic";""
> translit_end
> %
> END LC_CTYPE

Shouldn't “include "translit_cyrillic";""” be placed before the custom rules,
together with other includes?  The same in more files, I will not mention
them all.
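
To find every file where the include ended up after the custom rules, a
small script may help.  The following is only a sketch in Python, not part
of glibc; the function name include_position is hypothetical, and it
assumes the locale file uses "%" as its comment character.

```python
def include_position(locale_text, name="translit_cyrillic"):
    """Return 'before-rules', 'after-rules', or None if the include is absent.

    Only the translit_start ... translit_end section is inspected.
    A "rule" here is any non-comment line that starts with '<U'.
    """
    in_translit = False
    seen_rule = False
    for raw in locale_text.splitlines():
        line = raw.strip()
        if line == "translit_start":
            in_translit = True
            continue
        if line == "translit_end":
            break
        if not in_translit or not line or line.startswith("%"):
            continue
        if line.startswith("include") and f'"{name}"' in line:
            return "after-rules" if seen_rule else "before-rules"
        if line.startswith("<U"):
            seen_rule = True
    return None


# Shape of the am_ET / ti_ET hunks: the include follows a custom rule.
sample = """translit_start
include  "translit_combining";""
<U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_cyrillic";""
translit_end
"""
print(include_position(sample))  # after-rules
```

Running it over localedata/locales/* would list the files that need the
include moved up.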

> [...]
> diff -uNr a/localedata/locales/sd_IN@devanagari
> b/localedata/locales/sd_IN@devanagari
> --- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000
> +0000
> +++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000
> +0000

These 3 lines were wrapped by the email client, so the patch does not apply.
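
This kind of damage can be detected before sending.  A heuristic sketch in
Python (the helper name find_wrapped_headers is hypothetical, and the
check only covers the two wrap patterns seen in this message):

```python
import re

TZ_ONLY = re.compile(r"^[+-]\d{4}$")


def find_wrapped_headers(patch_text):
    """Return 1-based line numbers that look like wrapped continuations
    of a 'diff', '---' or '+++' header line."""
    lines = patch_text.splitlines()
    hits = []
    for i in range(1, len(lines)):
        prev, cur = lines[i - 1], lines[i]
        # A file header whose timezone was pushed onto its own line.
        if prev.startswith(("--- ", "+++ ")) and TZ_ONLY.match(cur):
            hits.append(i + 1)
        # A 'diff' command line whose second path was pushed down.
        elif prev.startswith("diff ") and cur.startswith("b/"):
            hits.append(i + 1)
    return hits


wrapped = "\n".join([
    "diff -uNr a/localedata/locales/sd_IN@devanagari",
    "b/localedata/locales/sd_IN@devanagari",
    "--- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000",
    "+0000",
    "+++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000",
    "+0000",
])
print(find_wrapped_headers(wrapped))  # [2, 4, 6]
```

An empty result does not prove the patch is intact; it only means that
neither of these two patterns was found.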

> [...]
> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
> --- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
> +++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000

There is no such file in glibc.

> [...]
> diff -uNr a/localedata/locales/translit_cyrillic
> b/localedata/locales/translit_cyrillic
> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
> +0000
> +++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000
> +0000

Again, 3 lines are wrapped, so the patch does not apply.

> [...]
> +% Contributions welcome for the rest of Cyrillic script in Unicode
> +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.

I am still tempted to add more Cyrillic characters but I understand
that the transliteration rules which come from ISO 9 must be clearly
separated from those which are our own invention.  But that's not for now.

> [...]
> +translit_start
> +
> +% CYRILLIC CAPITAL LETTER IO
> +<U0401> <U00CB>;"<U0059><U004F>"

This says that for ASCII (GOST 7.79 System B) you would like to transliterate
"Ё" as "YO", but the table in Wikipedia says "Yo".  I understand that either
may be correct depending on the context, but we should be consistent,
and it is better to stick with the standard.
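
The same question applies to several entries further down ("ZH", "CZ",
"CH", ...).  They can be listed mechanically.  A sketch in Python, where
the names decode and all_caps_fallbacks are hypothetical and the last
alternative on each line is assumed to be the ASCII one:

```python
import re

UCODE = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(field):
    """Turn '<U0059><U004F>' (optionally quoted) into 'YO'."""
    return "".join(chr(int(h, 16)) for h in UCODE.findall(field))


def all_caps_fallbacks(table_text):
    """Yield (source, fallback, suggestion) for multi-letter ALL-CAPS fallbacks."""
    for line in table_text.splitlines():
        line = line.lstrip("+").strip()  # tolerate diff '+' prefixes
        if not line.startswith("<U"):
            continue
        key, _, rest = line.partition(" ")
        fallback = decode(rest.split(";")[-1])
        letters = [c for c in fallback if c.isalpha()]
        if len(letters) > 1 and all(c.isupper() for c in letters):
            yield decode(key), fallback, fallback[0] + fallback[1:].lower()


entries = """
+<U0401> <U00CB>;"<U0059><U004F>"
+<U0404> <U00CA>;"<U0059><U0065>"
+<U0416> <U017D>;"<U005A><U0048>"
"""
flagged = list(all_caps_fallbacks(entries))
for source, fallback, suggestion in flagged:
    print(source, fallback, "->", suggestion)
```

For the three sample lines this reports "YO" and "ZH" and leaves "Ye"
alone.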

> +% CYRILLIC CAPITAL LETTER DJE
> +<U0402> <U0110>;"<U0044><U004A>"

This says "DJ" but System B does not mention it.  Where does it come from?
Also, I think it should be "Dj" rather than "DJ".

> +% CYRILLIC CAPITAL LETTER GJE
> +<U0403> <U01F4>;"<U0047><U0060>"

Correct, according to both systems.

> +% CYRILLIC CAPITAL LETTER UKRAINIAN IE
> +<U0404> <U00CA>;"<U0059><U0065>"

"Ye" - correct.

> +% CYRILLIC CAPITAL LETTER DZE
> +<U0405> <U1E90>;"<U005A><U0060>"

Correct.

> +% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
> +<U0406> <U00CC>;<U0049>

Correct.  The table mentions an alternative transliteration "I`" but
says that it is "only before vowels for Old Russian and Old Bulgarian".
I think we can skip this other variant.

> +% CYRILLIC CAPITAL LETTER YI
> +<U0407> <U00CF>;"<U0059><U0069>"

"Yi" - correct.

> +% CYRILLIC CAPITAL LETTER JE
> +<U0408> "<U004A><U030C>";<U004A>

Correct.

> +% CYRILLIC CAPITAL LETTER LJE
> +<U0409> "<U004C><U0302>";"<U004C><U0060>"

Correct, according to the standard.  If Serbian language requires "Lj"
then overrides should go to sr_RS file.

> +% CYRILLIC CAPITAL LETTER NJE
> +<U040A> "<U004E><U0302>";"<U004E><U0060>"

Correct, the same comment.

> +% CYRILLIC CAPITAL LETTER TSHE
> +<U040B> <U0106>;"<U0054><U0053><U0048>"

Where does "TSH" come from?  It is not mentioned by the System B table.
Also I am afraid this is not correct.

> +% CYRILLIC CAPITAL LETTER KJE
> +<U040C> <U1E30>;"<U004B><U0060>"

Correct.

> +% CYRILLIC CAPITAL LETTER SHORT U
> +<U040E> <U016C>;"<U0055><U0060>"

"U`" - correct.

> +% CYRILLIC CAPITAL LETTER DZHE
> +<U040F> "<U0044><U0302>";"<U0044><U0068>"

"Dh" - correct.

> [...]
> +% CYRILLIC CAPITAL LETTER ZHE
> +<U0416> <U017D>;"<U005A><U0048>"

"ZH" - shouldn't it be "Zh"?

> [...]
> +% CYRILLIC UNDEFINED
> +<U0423><U0301> <U00DA>;"<U0055><U0060>"

1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
2. OK, the System A table mentions this letter but System B does not.
   Somehow we should handle it.  I think that "U`" is the best we can
   do for now.
3. It must be tested whether this actually works.
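
For context on points 1 and 3: the key of this entry is a sequence of two
code points, because Unicode has no precomposed "Cyrillic U with acute".
The Python check below only illustrates that fact with the standard
unicodedata module; it says nothing about whether glibc matches
multi-character keys, which still has to be tested with iconv itself.

```python
import unicodedata

# CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
seq = "\u0423\u0301"
composed = unicodedata.normalize("NFC", seq)
names = [unicodedata.name(c) for c in composed]
print(len(composed), names)

# Contrast: I + COMBINING BREVE does have a precomposed form (SHORT I).
short_i = unicodedata.normalize("NFC", "\u0418\u0306")
print(len(short_i), unicodedata.name(short_i))
```

NFC leaves the first sequence as two code points, so input text will
always carry it as a base letter plus a combining mark.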

> [...]
> +% CYRILLIC CAPITAL LETTER HA
> +<U0425> <U0048>;<U0058>

I don't think there is any encoding in which "H" is unavailable, so this
will always be transliterated as "H" and never as "X".  We can't help it
and I don't think it is bad.
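
To illustrate the point, here is a simplified model in Python.  It assumes
that the alternatives of an entry are tried from left to right and that
the first one representable in the target charset wins; the function pick
is hypothetical and Python's codecs merely stand in for iconv.

```python
def pick(alternatives, target):
    """Return the first alternative that `target` can encode, else '?'."""
    for alt in alternatives:
        try:
            alt.encode(target)
            return alt
        except UnicodeEncodeError:
            continue
    return "?"


HA = ["H", "X"]         # <U0425> <U0048>;<U0058>
ZHE = ["\u017D", "ZH"]  # <U0416> <U017D>;"<U005A><U0048>"

print(pick(HA, "ascii"))        # H  (the second alternative is unreachable)
print(pick(ZHE, "ascii"))       # ZH
print(pick(ZHE, "iso8859_15"))  # Z with caron, which ISO-8859-15 contains
```

Under this model a second alternative is useful only when the first one
contains a non-ASCII character.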

> +% CYRILLIC CAPITAL LETTER TSE
> +<U0426> <U0043>;"<U0043><U005A>"

1. "CZ" - maybe should be "Cz"?
2. Are we able to implement the rule: "c before i, e, y, j"?
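
For reference, this is what the rule in point 2 would mean in practice.
The Python sketch below is a post-processing illustration only; it makes
no claim about what the locale file syntax can express.  The table BASE
is a tiny hypothetical subset for lower case letters, and
translit_tse_aware is a made-up name.

```python
# Hypothetical subset of System B for a few lower case letters.
BASE = {"и": "i", "е": "e", "ы": "y`", "й": "j",
        "а": "a", "р": "r", "к": "k", "н": "n", "т": "t"}


def translit_tse_aware(word):
    """Transliterate `word`, writing TSE as 'c' before i, e, y, j and 'cz' otherwise."""
    out = []
    for i, ch in enumerate(word):
        if ch == "ц":
            # Look at the Latin form of the following letter, if any.
            nxt = BASE.get(word[i + 1]) if i + 1 < len(word) else None
            out.append("c" if nxt and nxt[0] in "ieyj" else "cz")
        else:
            out.append(BASE.get(ch, "?"))
    return "".join(out)


print(translit_tse_aware("цирк"))    # cirk
print(translit_tse_aware("центр"))   # centr
print(translit_tse_aware("царица"))  # czaricza
```

The choice depends on the following letter, so a single per-character
entry cannot express it on its own.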

> +% CYRILLIC CAPITAL LETTER CHE
> +<U0427> <U010C>;"<U0043><U0048>"

"CH" -> "Ch"?

> +% CYRILLIC CAPITAL LETTER SHA
> +<U0428> <U0160>;"<U0053><U0048>"

"SH" -> "Sh"?

> +% CYRILLIC CAPITAL LETTER SHCHA
> +<U0429> <U015C>;"<U0053><U0048><U0048>"

"SHH" -> "Shh"?

> +% CYRILLIC CAPITAL LETTER HARD SIGN
> +<U042A> <U02BA>;"<U0041><U0060>"

"A`" is only for Bulgarian and should go to bg_BG.  How should
we transliterate an upper case hard sign to plain ASCII?  I think
it should be just "``", the same as for lower case.

> +% CYRILLIC CAPITAL LETTER YERU
> +<U042B> <U0059>;"<U0059><U0060>"

Again, as "Y" is always available it will never be transliterated
as "Y`".

> +% CYRILLIC CAPITAL LETTER SOFT SIGN
> +<U042C> <U02B9>;<U0060>

OK, I like it to be transliterated to plain ASCII as "`".

> +% CYRILLIC CAPITAL LETTER E
> +<U042D> <U00C8>;"<U0045><U0060>"

OK

> +% CYRILLIC CAPITAL LETTER YU
> +<U042E> <U00DB>;"<U0059><U0055>"

"YU" -> "Yu"?

> +% CYRILLIC CAPITAL LETTER YA
> +<U042F> <U00C2>;"<U0059><U0041>"

"YA" -> "Ya"?

> [...]

I am sorry, this is of course incomplete but that's enough for tonight.

Regards,

Rafal


[1] https://sourceware.org/glibc/wiki/Contribution%20checklist


* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-10-13  0:59     ` Rafal Luzynski
@ 2018-10-13 16:58       ` Egor Kobylkin
  2018-10-15 11:04         ` Marko Myllynen
  2018-10-23 23:08         ` Rafal Luzynski
  0 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-13 16:58 UTC (permalink / raw)
  To: Rafal Luzynski, libc-alpha, libc-locales
  Cc: mfabian, Marko Myllynen, Dmitry V. Levin, Volodymyr Lisivka,
	Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 7317 bytes --]

Hi Rafal,

Thanks for the thorough checking, it really helps.

On 13.10.2018 02:59, Rafal Luzynski wrote:
> Technical issue:  Please either attach your patch to the email 
> message or paste it inline, not both.  The patch as it is now is not 
> applicable. I had to edit it manually to apply.
>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>> --- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
>> +++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
> 
> There is no such file.  Where have you got the source code from?
> Are you sure this is glibc? :-)

I was running my patch process against the Ubuntu 18.04 version of
localedata/locales. Now I have checked out the GitHub glibc source v2.28
and done the same. Please find the new patch attached. I am not
submitting it as a patch request because we have not yet addressed the
rest of your comments below. But at least this should be working as a
patch for you. Please let me know if there is any problem there still.

>> [...]
>> From this patch I have excluded locales that already mention
>> cyrillic or have a transliteration table for it:
>> az_AZ
>> iso14651_t1_common
>> ky_KG
>> mn_MN
>> sr_RS
>> tg_TJ
>> tk_TM
>> tt_RU
>> uk_UA
>> uz_UZ
>> uz_UZ@cyrillic
> 
> I confirm that these locales are excluded and there are no other 
> missing locales.

Because the set of locales differs surprisingly between Ubuntu and
glibc, the list of excluded ones is now different as well:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

az_AZ and ky_KG are now included because they have no Cyrillic
transliteration in glibc. iso14651_t1_common is still implicitly
excluded because it does not contain the 'translit_end' string.

Somehow az_AZ and tr_TR from glibc fail to transliterate Cyrillic even
after the patch is applied (az_AZ explicitly includes tr_TR). I do not
see the reason; maybe you could check?


> Regarding the tests, I think there is no complete transliteration 
> test suite at the moment.  Probably the only test is 
> localedata/bug-iconv-trans.c. You can also see the collation tests 
> placed in the same directory, they use those multiple *.UTF-8.in 
> files.
> 
> You can skip the tests for now.

In a copy of localedata/bug-iconv-trans.c we could just change lines
10-11, which hold the list of symbols being transliterated, from

  const char str[] = "ÄäÖöÜüß";
  const char expected[] = "AEaeOEoeUEuess";

like this

  const char str[] =
    "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩъЫьЭЮЯабвгдежзийклмнопрстуу́фхцчшщЪыЬэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ"
    "ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’";
  const char expected[] =
    "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh"
    "shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`"
    "T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`"
    "Y`y`'";

At first I thought they could just be added, but not all locales
transliterate umlauts, so just extending the current test won't do: it
would fail for those locales.
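
A per-locale check outside the glibc test suite could look roughly like
the sketch below. It wraps the iconv command line from the wiki test
example; translit_ascii() and failing_locales() are hypothetical names,
the locales must be installed for a real run, and the run parameter
exists only so the logic can be exercised without calling iconv.

```python
import os
import subprocess

def translit_ascii(text, locale, run=subprocess.run):
    """Pipe text through iconv under the given locale; return its output."""
    env = dict(os.environ, LC_ALL=locale + ".UTF-8")
    proc = run(["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
               input=text.encode("utf-8"), stdout=subprocess.PIPE, env=env)
    return proc.stdout.decode("ascii", errors="replace")

def failing_locales(locales, text, expected, run=subprocess.run):
    """Return the locales whose transliteration differs from expected."""
    return [loc for loc in locales
            if translit_ascii(text, loc, run) != expected]

# Example call (needs iconv and the listed locales to be installed):
# print(failing_locales(["ru_RU", "az_AZ", "tr_TR"], "абв", "abv"))
```

Keeping the Cyrillic check separate from the umlaut check in this way
avoids failures in locales that do not transliterate umlauts.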


>> [...]
>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>> --- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
>> +++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
>> @@ -1394,6 +1394,7 @@
>>  <U137A> <U0060><U0039><U0030>
>>  <U137B> <U0060><U0031><U0030><U0030>
>>  <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>> +include "translit_cyrillic";""
>>  translit_end
>>  %
>>  END LC_CTYPE
> 
> Shouldn't “include "translit_cyrillic";""” be placed before the 
> custom rules, together with other includes?  The same in more files, 
> I will not mention them all.

If I recall correctly it is because of the
"translit_end
END LC_CTYPE"
part at the end of the translit_cyrillic. This way it works for any
locale, regardless whether it has translit itself or not. And being at
the end it does not supersede any previous transliteration that may be
there for a reason.
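
The placement rule described above (put the include directly before
"translit_end", and skip files that have no such line) can be sketched
like this. It is a hypothetical helper for illustration, not the script
that produced the attached patch.

```python
INCLUDE = 'include "translit_cyrillic";""'

def add_include(text):
    """Insert INCLUDE before the first 'translit_end' line.

    Returns the text unchanged if it is already patched, and None if
    the file has no 'translit_end' line to anchor on.
    """
    lines = text.split("\n")
    if INCLUDE in lines:
        return text
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            return "\n".join(lines[:i] + [INCLUDE] + lines[i:])
    return None

sample = ('translit_start\n'
          'include "translit_combining";""\n'
          'translit_end\n'
          'END LC_CTYPE\n')
print(add_include(sample))
```

A file without "translit_end", such as iso14651_t1_common, yields None
here, which matches the implicit exclusion mentioned earlier.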

As with some other comments, I am not very familiar with the formats of
glibc files. So if you have a definitive suggestion, please formulate it
as an imperative, not a question.


>> [...]
>> +translit_start
>> +
>> +% CYRILLIC CAPITAL LETTER IO
>> +<U0401> <U00CB>;"<U0059><U004F>"
> 
> This says that for ASCII (GOST 7.79 System B) you would like to 
> transliterate "Ё" as "YO" but the table in Wikipedia says "Yo".  I 
> understand that one or another may be correct depending on the 
> context but we should be consistent and also better let's stick with 
> the standard.

The choice of YO, SH, YA, ZH etc. is to avoid collisions, for example
between "Сх" and "Ш", which would otherwise both transliterate to "Sh":
With SH: "Схема" -> "Shema" but "Шема" -> "SHema"
With Sh: "Схема" -> "Shema" and "Шема" -> "Shema". Collision!
This is important e.g. for renaming files or grouping with uniq.
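
The collision can be reproduced with two tiny tables. They are made up
for this illustration and contain only the letters of the two example
words; the real mappings are the ones in the translit_cyrillic patch.

```python
# Letters shared by both variants (illustrative subset only).
BASE = {"С": "S", "с": "s", "х": "h", "е": "e", "м": "m", "а": "a"}
UPPER_DIGRAPH = dict(BASE, **{"Ш": "SH"})   # variant used in the patch
TITLE_DIGRAPH = dict(BASE, **{"Ш": "Sh"})   # variant suggested in review

def translit(word, table):
    """Transliterate letter by letter; unknown letters become '?'."""
    return "".join(table.get(ch, "?") for ch in word)

for name, table in (("SH", UPPER_DIGRAPH), ("Sh", TITLE_DIGRAPH)):
    a, b = translit("Схема", table), translit("Шема", table)
    print(name, a, b, "collision" if a == b else "distinct")
```

With the upper-case digraph the two words stay distinct; with the
title-case digraph both come out as "Shema".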

> 
>> +% CYRILLIC CAPITAL LETTER DJE
>> +<U0402> <U0110>;"<U0044><U004A>"
> 
> This says "DJ" but System B does not mention it.  Where does it come 
> from? Also, I think it should be "Dj" rather than "DJ".
I took the first two letters from its name.


>> [...]
>> +% CYRILLIC UNDEFINED
>> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
> 
> 1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
> 2. OK, the System A table mentions this letter but System B does not.
>    Somehow we should handle it.  I think that "U`" is the best we can
>    do for now.
> 3. It must be tested whether this actually works.
1. Could we do this just before you are ready to commit the patch? It
breaks the formulas in my worksheet, so I will have to do it manually.
3. I have tested it and it doesn't work: the rule gets ignored. But if
you were to handle COMBINING, it would work, wouldn't it?


>> [...]
>> +% CYRILLIC CAPITAL LETTER HA
>> +<U0425> <U0048>;<U0058>
> 
> I don't think that "H" is unavailable in any encoding therefore it 
> will always be transliterated as "H" and never as "X".  We can't
> help it and I don't think it is bad.
> 
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.

>> +% CYRILLIC CAPITAL LETTER TSE
>> +<U0426> <U0043>;"<U0043><U005A>"
> 
> 1. "CZ" - maybe should be "Cz"?
> 2. Are we able to implement the rule: "c before i, e, y, j"?
> 
1. See my answer for CYRILLIC CAPITAL LETTER IO above.
2. I am not sure what you mean here, but I believe it's not possible,
as per Marko's email.


>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>> +<U042A> <U02BA>;"<U0041><U0060>"
> 
> "A`" is only for Bulgarian and should go to bg_BG.  How should we 
> transliterate an upper case hard sign to plain ASCII?  I think that 
> just "``", same as lower case.
This is to avoid collision. Besides AFAIK e.g. in Russian there is no
capital hard sign because there are no words starting with it.

> 
>> +% CYRILLIC CAPITAL LETTER YERU
>> +<U042B> <U0059>;"<U0059><U0060>"
> 
> Again, as "Y" is always available it will never be transliterated as 
> "Y`".
> 
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.


Bests,
Diego

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56408 bytes --]

diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/aa_DJ	2018-10-13 16:52:32.666374687 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/af_ZA	2018-10-13 16:52:32.442373810 +0000
@@ -70,6 +70,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/ak_GH	2018-10-13 16:52:32.774375109 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/am_ET	2018-10-13 16:52:32.466373904 +0000
@@ -893,6 +893,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/ar_EG	2018-10-13 16:52:32.806375234 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/az_AZ b/localedata/locales/az_AZ
--- a/localedata/locales/az_AZ	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/az_AZ	2018-10-13 16:52:32.494374014 +0000
@@ -136,6 +136,7 @@
 <U0259> "<U00E4>"
 <U018F> "<U00C4>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/be_BY	2018-10-13 16:52:32.518374107 +0000
@@ -91,6 +91,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/bem_ZM	2018-10-13 16:52:32.674374718 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/ber_DZ	2018-10-13 16:52:32.878375516 +0000
@@ -136,6 +136,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/ber_MA	2018-10-13 16:52:32.858375438 +0000
@@ -83,6 +83,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/bg_BG	2018-10-13 16:52:32.446373826 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bi_VU	2018-10-13 16:52:32.786375156 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bn_BD	2018-10-13 16:52:32.766375078 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bo_CN	2018-10-13 16:52:32.930375719 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/ca_ES	2018-10-13 16:52:32.930375719 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/ce_RU	2018-10-13 16:52:32.490373998 +0000
@@ -38,6 +38,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/cmn_TW	2018-10-13 16:52:32.670374702 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-13 16:52:32.238373012 +0000
+++ b/localedata/locales/cs_CZ	2018-10-13 16:52:32.874375500 +0000
@@ -215,6 +215,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-13 16:52:32.238373012 +0000
+++ b/localedata/locales/cv_RU	2018-10-13 16:52:32.610374468 +0000
@@ -103,6 +103,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/cy_GB	2018-10-13 16:52:32.434373779 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/da_DK	2018-10-13 16:52:32.894375579 +0000
@@ -169,6 +169,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/de_DE	2018-10-13 16:52:32.898375594 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/dv_MV	2018-10-13 16:52:32.842375375 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/dz_BT	2018-10-13 16:52:32.838375360 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/el_GR	2018-10-13 16:52:32.862375454 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_GB	2018-10-13 16:52:32.794375187 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_NG	2018-10-13 16:52:32.626374530 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_ZM	2018-10-13 16:52:32.454373857 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/es_CU	2018-10-13 16:52:32.886375547 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/es_ES	2018-10-13 16:52:32.426373748 +0000
@@ -107,6 +107,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/et_EE	2018-10-13 16:52:32.758375046 +0000
@@ -113,6 +113,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fa_IR	2018-10-13 16:52:32.446373826 +0000
@@ -78,6 +78,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ff_SN	2018-10-13 16:52:32.466373904 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fi_FI	2018-10-13 16:52:32.846375391 +0000
@@ -177,6 +177,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fr_FR	2018-10-13 16:52:32.522374123 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ga_IE	2018-10-13 16:52:32.906375626 +0000
@@ -53,6 +53,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gd_GB	2018-10-13 16:52:32.894375579 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gu_IN	2018-10-13 16:52:32.802375218 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gv_GB	2018-10-13 16:52:32.626374530 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/he_IL	2018-10-13 16:52:32.926375704 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hi_IN	2018-10-13 16:52:32.634374561 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hif_FJ	2018-10-13 16:52:32.642374593 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hr_HR	2018-10-13 16:52:32.870375485 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ht_HT	2018-10-13 16:52:32.798375203 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hu_HU	2018-10-13 16:52:32.518374107 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hy_AM	2018-10-13 16:52:32.766375078 +0000
@@ -75,6 +75,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/id_ID	2018-10-13 16:52:32.522374123 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/is_IS	2018-10-13 16:52:32.606374452 +0000
@@ -149,6 +149,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/it_IT	2018-10-13 16:52:32.770375093 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ja_JP	2018-10-13 16:52:32.754375031 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kab_DZ	2018-10-13 16:52:32.922375688 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kk_KZ	2018-10-13 16:52:32.866375469 +0000
@@ -99,6 +99,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/km_KH	2018-10-13 16:52:32.598374421 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kn_IN	2018-10-13 16:52:32.762375062 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ko_KR	2018-10-13 16:52:32.582374358 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ks_IN	2018-10-13 16:52:32.510374076 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kw_GB	2018-10-13 16:52:32.790375171 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ky_KG b/localedata/locales/ky_KG
--- a/localedata/locales/ky_KG	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ky_KG	2018-10-13 16:52:32.410373685 +0000
@@ -82,6 +82,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lb_LU	2018-10-13 16:52:32.874375500 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lg_UG	2018-10-13 16:52:32.430373763 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lij_IT	2018-10-13 16:52:32.782375140 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ln_CD	2018-10-13 16:52:32.438373795 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lo_LA	2018-10-13 16:52:32.530374154 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lt_LT	2018-10-13 16:52:32.602374436 +0000
@@ -163,6 +163,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lv_LV	2018-10-13 16:52:32.794375187 +0000
@@ -125,6 +125,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mg_MG	2018-10-13 16:52:32.486373982 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mhr_RU	2018-10-13 16:52:32.866375469 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mk_MK	2018-10-13 16:52:32.598374421 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ml_IN	2018-10-13 16:52:32.610374468 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ms_MY	2018-10-13 16:52:32.638374577 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mt_MT	2018-10-13 16:52:32.890375563 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-13 16:52:32.530374154 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/nb_NO	2018-10-13 16:52:32.778375125 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ne_NP	2018-10-13 16:52:32.842375375 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nhn_MX	2018-10-13 16:52:32.766375078 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/niu_NU	2018-10-13 16:52:32.802375218 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/niu_NZ	2018-10-13 16:52:32.850375407 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nl_NL	2018-10-13 16:52:32.602374436 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nr_ZA	2018-10-13 16:52:32.918375673 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/oc_FR	2018-10-13 16:52:32.818375281 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/om_KE	2018-10-13 16:52:32.918375673 +0000
@@ -156,6 +156,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/or_IN	2018-10-13 16:52:32.926375704 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/os_RU	2018-10-13 16:52:32.910375641 +0000
@@ -71,6 +71,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pa_IN	2018-10-13 16:52:32.638374577 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pa_PK	2018-10-13 16:52:32.422373732 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pl_PL	2018-10-13 16:52:32.502374045 +0000
@@ -130,6 +130,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pt_PT	2018-10-13 16:52:32.910375641 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/quz_PE	2018-10-13 16:52:32.470373920 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ro_RO	2018-10-13 16:52:32.646374608 +0000
@@ -142,6 +142,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ru_RU	2018-10-13 16:52:32.534374170 +0000
@@ -69,6 +69,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/rw_RW	2018-10-13 16:52:32.814375265 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sa_IN	2018-10-13 16:52:32.790375171 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sd_IN	2018-10-13 16:52:32.770375093 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-13 16:52:32.818375281 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/se_NO	2018-10-13 16:52:32.634374561 +0000
@@ -221,6 +221,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sgs_LT	2018-10-13 16:52:32.810375250 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/shn_MM	2018-10-13 16:52:32.506374060 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/si_LK	2018-10-13 16:52:32.814375265 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sk_SK	2018-10-13 16:52:32.418373716 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sl_SI	2018-10-13 16:52:32.486373982 +0000
@@ -2120,6 +2120,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sm_WS	2018-10-13 16:52:32.498374029 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/so_SO	2018-10-13 16:52:32.414373701 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sq_AL	2018-10-13 16:52:32.798375203 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ss_ZA	2018-10-13 16:52:32.846375391 +0000
@@ -66,6 +66,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/st_ZA	2018-10-13 16:52:32.906375626 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sv_SE	2018-10-13 16:52:32.630374546 +0000
@@ -173,6 +173,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sw_KE	2018-10-13 16:52:32.590374389 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ta_IN	2018-10-13 16:52:32.586374374 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/te_IN	2018-10-13 16:52:32.642374593 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/th_TH	2018-10-13 16:52:32.902375610 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ti_ET	2018-10-13 16:52:32.618374499 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/tn_ZA	2018-10-13 16:52:32.882375532 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/to_TO	2018-10-13 16:52:32.822375297 +0000
@@ -36,6 +36,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/tpi_PG	2018-10-13 16:52:32.454373857 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/tr_TR	2018-10-13 16:52:32.662374671 +0000
@@ -2538,6 +2538,7 @@
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-13 16:52:32.942375766 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option.  See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 for Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced 
+% in that bug's doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ts_ZA	2018-10-13 16:52:32.806375234 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/unm_US	2018-10-13 16:52:32.782375140 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ur_IN	2018-10-13 16:52:32.762375062 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ur_PK	2018-10-13 16:52:32.510374076 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ve_ZA	2018-10-13 16:52:32.854375422 +0000
@@ -65,6 +65,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/vi_VN	2018-10-13 16:52:32.826375313 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/wa_BE	2018-10-13 16:52:32.850375407 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/wo_SN	2018-10-13 16:52:32.886375547 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/xh_ZA	2018-10-13 16:52:32.858375438 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/yi_US	2018-10-13 16:52:32.506374060 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/yuw_PG	2018-10-13 16:52:32.494374014 +0000
@@ -40,6 +40,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-13 16:52:32.278373168 +0000
+++ b/localedata/locales/zh_CN	2018-10-13 16:52:32.862375454 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-13 16:52:32.278373168 +0000
+++ b/localedata/locales/zu_ZA	2018-10-13 16:52:32.886375547 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 

* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-10-13 16:58       ` Egor Kobylkin
@ 2018-10-15 11:04         ` Marko Myllynen
  2018-10-15 11:54           ` Egor Kobylkin
  2018-10-23 23:08         ` Rafal Luzynski
  1 sibling, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-15 11:04 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales
  Cc: mfabian, Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

Hi,

On 2018-10-13 19:58, Egor Kobylkin wrote:
> On 13.10.2018 02:59, Rafal Luzynski wrote:
> 
>> Regarding the tests, I think there is no complete transliteration 
>> test suite at the moment.  Probably the only test is 
>> localedata/bug-iconv-trans.c. You can also see the collation tests 
>> placed in the same directory, they use those multiple *.UTF-8.in 
>> files.
>>
>> You can skip the tests for now.
> 
> At first I thought they could just be added, but not all locales
> transliterate umlauts, so simply extending the current test won't do: it
> would fail for those locales.

I still think a one-time check against uconv(1) (part of Unicode's ICU
project) for discrepancies would be worthwhile.
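
A rough sketch of such a one-time comparison, assuming uconv from ICU is installed and that its `Cyrillic-Latin; Latin-ASCII` transliterator chain is an acceptable reference. The helper names, the file suffixes and the choice of transliterator are illustrative assumptions, not something agreed in this thread. ICU's rules are not GOST 7.79 System B, so some differences are expected and would need manual review.

```shell
# Run both transliterators over the same UTF-8 input file ($1) and write
# their results next to it. Assumes one sample per line, no tab characters
# in the input, and that the caller has set LC_ALL to the locale under test.
run_both() {
    iconv -f UTF-8 -t ASCII//TRANSLIT < "$1" > "$1.glibc"
    uconv -f UTF-8 -t UTF-8 -x 'Cyrillic-Latin; Latin-ASCII' < "$1" > "$1.icu"
}

# Print every line on which the two results differ.
# $1 = input file, $2 = glibc result, $3 = ICU result
report_discrepancies() {
    paste "$1" "$2" "$3" |
        awk -F '\t' '($2 "") != ($3 "") {
            print NR ": " $1 " | glibc: " $2 " | ICU: " $3
        }'
}
```

For example: `run_both samples.txt; report_discrepancies samples.txt samples.txt.glibc samples.txt.icu`.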

>>> [...]
>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>> --- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
>>> +++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
>>> @@ -1394,6 +1394,7 @@
>>>  <U137A> <U0060><U0039><U0030>
>>>  <U137B> <U0060><U0031><U0030><U0030>
>>>  <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>> +include "translit_cyrillic";""
>>>  translit_end
>>>  %
>>>  END LC_CTYPE
>>
>> Shouldn't “include "translit_cyrillic";""” be placed before the 
>> custom rules, together with other includes?  The same in more files, 
>> I will not mention them all.
> 
> If I recall correctly it is because of the
> "translit_end
> END LC_CTYPE"
> part at the end of the translit_cyrillic. This way it works for any
> locale, regardless whether it has translit itself or not. And being at
> the end it does not supersede any previous transliteration that may be
> there for a reason.

I suspect one problem would be that the later rule wins, so if there
are locale-specific rules, then any translit_* inclusions would
override them unless they are included before the locale-specific rules.
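
To see how many locale files this ordering question touches, something along the following lines could list every file in which an include line follows custom rules inside the translit section. This is only a sketch: it assumes `%` is the comment character and treats every line that is neither blank, a comment, nor an include as a rule, and `include_after_rules` is a hypothetical helper name. It reports placement only and says nothing about which rule glibc actually prefers.

```shell
# Exit 0 if, between translit_start and translit_end, an include line
# appears after at least one custom rule line; exit 1 otherwise.
# $1 = locale source file
include_after_rules() {
    awk '
        /^translit_start/ { in_t = 1; seen_rule = 0; next }
        /^translit_end/   { in_t = 0; next }
        !in_t             { next }
        /^[ \t]*$/        { next }    # blank line
        /^%/              { next }    # comment (assumed comment_char %)
        /^include[ \t]/   { if (seen_rule) found = 1; next }
                          { seen_rule = 1 }
        END               { exit (found ? 0 : 1) }
    ' "$1"
}
```

A loop such as `for f in localedata/locales/*; do include_after_rules "$f" && echo "$f"; done` would then print the affected files.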

Cheers,

-- 
Marko Myllynen

* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-10-15 11:04         ` Marko Myllynen
@ 2018-10-15 11:54           ` Egor Kobylkin
  0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-15 11:54 UTC (permalink / raw)
  To: Marko Myllynen, Rafal Luzynski, libc-alpha, libc-locales
  Cc: mfabian, Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 2336 bytes --]

On 15.10.2018 13:04, Marko Myllynen wrote:
> Hi,
> 
> On 2018-10-13 19:58, Egor Kobylkin wrote:
>> On 13.10.2018 02:59, Rafal Luzynski wrote:
>>
>>> Regarding the tests, I think there is no complete transliteration 
>>> test suite at the moment.  Probably the only test is 
>>> localedata/bug-iconv-trans.c. You can also see the collation tests 
>>> placed in the same directory, they use those multiple *.UTF-8.in 
>>> files.
>>>
>>> You can skip the tests for now.
>>
>> At first I thought they could just be added, but not all locales
>> transliterate umlauts, so simply extending the current test won't do: it
>> would fail for those locales.
> 
> I still think a one-time check against uconv(1) (part of Unicode's ICU
> project) for discrepancies would be worthwhile.

Just an addition. I have changed a few constants to see whether
localedata/bug-iconv-trans.c could be made to test Cyrillic. Attached is
bug-iconv-trans-cyr.c, which passes in this form. I had to save it as
UTF-8, whereas localedata/bug-iconv-trans.c is ISO-8859-15.

>>>> [...]
>>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>>> --- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
>>>> +++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
>>>> @@ -1394,6 +1394,7 @@
>>>>  <U137A> <U0060><U0039><U0030>
>>>>  <U137B> <U0060><U0031><U0030><U0030>
>>>>  <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>>> +include "translit_cyrillic";""
>>>>  translit_end
>>>>  %
>>>>  END LC_CTYPE
>>>
>>> Shouldn't “include "translit_cyrillic";""” be placed before the 
>>> custom rules, together with other includes?  The same in more files, 
>>> I will not mention them all.
>>
>> If I recall correctly it is because of the
>> "translit_end
>> END LC_CTYPE"
>> part at the end of the translit_cyrillic. This way it works for any
>> locale, regardless whether it has translit itself or not. And being at
>> the end it does not supersede any previous transliteration that may be
>> there for a reason.
> 
> I suspect one problem would be that the later rule wins, so if there
> are locale-specific rules, then any translit_* inclusions would
> override them unless they are included before the locale-specific rules.

What is the best way forward here? Can somebody make an explicit
suggestion on how to change the current approach if needed?

Bests,
Egor


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: bug-iconv-trans-cyr.c --]
[-- Type: text/x-csrc; name="bug-iconv-trans-cyr.c", Size: 2207 bytes --]

#include <iconv.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

int
main (void)
{
  iconv_t cd;
  const char str[] = "CyrillicLetters_ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’";
  const char expected[] = "CyrillicLetters_YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUUFHCCHSHSHHA`Y`E`YUYAabvgdezhzijklmnoprstuufhcchshshh``y`e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'";
  char *inptr = (char *) str;
  size_t inlen = strlen (str) + 1;
  char outbuf[500];
  char *outptr = outbuf;
  size_t outlen = sizeof (outbuf);
  int result = 0;
  size_t n;

  if (setlocale (LC_ALL, "de_DE.UTF-8") == NULL)
    {
      puts ("setlocale failed");
      return 1;
    }

  cd = iconv_open ("ANSI_X3.4-1968//TRANSLIT", "UTF-8");
  if (cd == (iconv_t) -1)
    {
      puts ("iconv_open failed");
      return 1;
    }

  n = iconv (cd, &inptr, &inlen, &outptr, &outlen);
  if (n != 174)
    {
      if (n == (size_t) -1)
	printf ("iconv() returned error: %m\n");
      else
	printf ("iconv() returned %zu, expected 174\n", n);
      result = 1;
    }
  if (inlen != 0)
    {
      puts ("not all input consumed");
      result = 1;
    }
  else if (inptr - str != strlen (str) + 1)
    {
      printf ("inptr wrong, advanced by %td\n", inptr - str);
      result = 1;
    }
  if (memcmp (outbuf, expected, sizeof (expected)) != 0)
    {
      printf ("result wrong: \"%.*s\", expected: \"%s\"\n",
	      (int) (sizeof (outbuf) - outlen), outbuf, expected);
      result = 1;
    }
  else if (outlen != sizeof (outbuf) - sizeof (expected))
    {
      printf ("outlen wrong: %zu, expected %zu\n", outlen,
	      sizeof (outbuf) - sizeof (expected));
      result = 1;
    }
  else
    printf ("output is \"%s\" which is OK\n", outbuf);

  return result;
}

* [PATCH v6] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (5 preceding siblings ...)
  2018-10-12 14:05   ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-10-17 14:16   ` Egor Kobylkin
  2018-11-01 22:51   ` [PATCH v7] " Egor Kobylkin
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-17 14:16 UTC (permalink / raw)
  To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen,
	Dmitry V. Levin
  Cc: Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 10007 bytes --]

Dear locale maintainers,

Please fix glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]

to localedata/locales/ and including it in all your locales going forward.

The patch is included below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

That is, it produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. In addition, the table has to be included in a locale's
source so that it is compiled into the active locale and iconv can use it.
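
One way to check this without installing anything system-wide is to compile the patched locale into a private directory and point iconv at it. The following is a sketch under the assumption that a glibc source tree with the patch applied is available; `check_translit` is a hypothetical helper, and the paths in the usage line are examples.

```shell
# Compile one locale from a glibc source tree into a private directory and
# run the wiki's transliteration test against it.
# $1 = localedata directory (contains locales/ and charmaps/)
# $2 = locale name, e.g. ru_RU
# $3 = UTF-8 test input, e.g. translit-test-input.txt
check_translit() {
    out=$(mktemp -d) || return 1
    # I18NPATH lets localedef resolve include lines such as translit_cyrillic;
    # -c writes the locale even if localedef only emits warnings.
    I18NPATH="$1" localedef -c -i "$1/locales/$2" -f UTF-8 "$out/$2.UTF-8"
    if [ ! -f "$out/$2.UTF-8/LC_CTYPE" ]; then
        rm -rf "$out"
        return 1
    fi
    # LOCPATH makes iconv load the freshly compiled locale, not the system one.
    LOCPATH="$out" LC_ALL="$2.UTF-8" iconv -f UTF-8 -t ASCII//TRANSLIT < "$3"
    status=$?
    rm -rf "$out"
    return "$status"
}
```

For example: `check_translit ./localedata ru_RU translit-test-input.txt | grep CYRILLIC`.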



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in a Cyrillic alphabet into ASCII text using
ASCII//TRANSLIT.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9:1995.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters yet
can still be read by a native speaker. Among other things, this is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not prepared to work with Cyrillic, lack the corresponding
fonts, or cannot handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its content (the mapping) is based on the ISO 9:1995 standard [10] and
its derivative GOST 7.79-2000, whose official source is the Federal Agency
on Technical Regulating and Metrology of the Russian Federation [2].
In practice, an independent but mostly identical source [3] was used, and
the table was prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
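
The generation step could look roughly like the following. This is not the script that produced the attached patch, only a sketch under the assumption that the include line goes directly before the first translit_end, as in the hunks of this patch; `add_cyrillic_include` is a hypothetical helper. Locales that are deliberately excluded have to be skipped by the caller.

```shell
# Insert the include line immediately before the first translit_end in $1.
add_cyrillic_include() {
    # Skip files that already reference the table or have no translit section.
    grep -q 'translit_cyrillic' "$1" && return 0
    grep -q '^translit_end' "$1" || return 0
    tmp=$(mktemp) || return 1
    awk '
        /^translit_end/ && !inserted {
            print "include \"translit_cyrillic\";\"\""
            inserted = 1
        }
        { print }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"    # cat keeps the file's permissions
    rm -f "$tmp"
}
```

Running it twice on the same file is harmless, because the first check returns early once the include is present.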

Cyrillic transliteration of, e.g., Russian text may already have worked
to some extent in the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales, which
have their own transliteration tables inline.

I am excluding these locales from the proposed patch. I have written
directly to the maintainer email addresses listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA) and
Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-17  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9:1995 / GOST 7.79
	System A transliteration and System B transcription table from
	Cyrillic to Latin/ASCII.
	* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
	LC_CTYPE translit section.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/az_AZ: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/ky_KG: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56408 bytes --]

diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-17 13:52:01.871309540 +0000
+++ b/localedata/locales/aa_DJ	2018-10-17 13:52:02.415310947 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-17 13:52:01.875309551 +0000
+++ b/localedata/locales/af_ZA	2018-10-17 13:52:02.211310419 +0000
@@ -70,6 +70,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-17 13:52:01.875309551 +0000
+++ b/localedata/locales/ak_GH	2018-10-17 13:52:02.519311216 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-17 13:52:01.875309551 +0000
+++ b/localedata/locales/am_ET	2018-10-17 13:52:02.235310481 +0000
@@ -893,6 +893,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-17 13:52:01.879309562 +0000
+++ b/localedata/locales/ar_EG	2018-10-17 13:52:02.551311298 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/az_AZ b/localedata/locales/az_AZ
--- a/localedata/locales/az_AZ	2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/az_AZ	2018-10-17 13:52:02.259310543 +0000
@@ -136,6 +136,7 @@
 <U0259> "<U00E4>"
 <U018F> "<U00C4>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/be_BY	2018-10-17 13:52:02.283310605 +0000
@@ -91,6 +91,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/bem_ZM	2018-10-17 13:52:02.423310967 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/ber_DZ	2018-10-17 13:52:02.623311484 +0000
@@ -136,6 +136,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/ber_MA	2018-10-17 13:52:02.603311433 +0000
@@ -83,6 +83,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-17 13:52:01.895309603 +0000
+++ b/localedata/locales/bg_BG	2018-10-17 13:52:02.215310430 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-17 13:52:01.895309603 +0000
+++ b/localedata/locales/bi_VU	2018-10-17 13:52:02.531311247 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-17 13:52:01.895309603 +0000
+++ b/localedata/locales/bn_BD	2018-10-17 13:52:02.511311195 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-17 13:52:01.899309613 +0000
+++ b/localedata/locales/bo_CN	2018-10-17 13:52:02.675311619 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-17 13:52:01.903309623 +0000
+++ b/localedata/locales/ca_ES	2018-10-17 13:52:02.675311619 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-17 13:52:01.903309623 +0000
+++ b/localedata/locales/ce_RU	2018-10-17 13:52:02.255310533 +0000
@@ -38,6 +38,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-17 13:52:01.903309623 +0000
+++ b/localedata/locales/cmn_TW	2018-10-17 13:52:02.419310957 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-17 13:52:01.939309717 +0000
+++ b/localedata/locales/cs_CZ	2018-10-17 13:52:02.619311474 +0000
@@ -215,6 +215,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-17 13:52:01.939309717 +0000
+++ b/localedata/locales/cv_RU	2018-10-17 13:52:02.359310802 +0000
@@ -103,6 +103,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-17 13:52:01.943309726 +0000
+++ b/localedata/locales/cy_GB	2018-10-17 13:52:02.207310409 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-17 13:52:01.943309726 +0000
+++ b/localedata/locales/da_DK	2018-10-17 13:52:02.635311515 +0000
@@ -169,6 +169,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-17 13:52:01.943309726 +0000
+++ b/localedata/locales/de_DE	2018-10-17 13:52:02.639311526 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/dv_MV	2018-10-17 13:52:02.587311391 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/dz_BT	2018-10-17 13:52:02.583311382 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/el_GR	2018-10-17 13:52:02.607311443 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/en_GB	2018-10-17 13:52:02.539311268 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-17 13:52:01.951309747 +0000
+++ b/localedata/locales/en_NG	2018-10-17 13:52:02.379310854 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-17 13:52:01.951309747 +0000
+++ b/localedata/locales/en_ZM	2018-10-17 13:52:02.227310461 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-17 13:52:01.955309758 +0000
+++ b/localedata/locales/es_CU	2018-10-17 13:52:02.631311506 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-17 13:52:01.955309758 +0000
+++ b/localedata/locales/es_ES	2018-10-17 13:52:02.195310378 +0000
@@ -107,6 +107,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-17 13:52:01.955309758 +0000
+++ b/localedata/locales/et_EE	2018-10-17 13:52:02.503311174 +0000
@@ -113,6 +113,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/fa_IR	2018-10-17 13:52:02.219310440 +0000
@@ -78,6 +78,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/ff_SN	2018-10-17 13:52:02.235310481 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/fi_FI	2018-10-17 13:52:02.595311412 +0000
@@ -177,6 +177,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/fr_FR	2018-10-17 13:52:02.287310616 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-17 13:52:01.963309778 +0000
+++ b/localedata/locales/ga_IE	2018-10-17 13:52:02.651311557 +0000
@@ -53,6 +53,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-17 13:52:01.963309778 +0000
+++ b/localedata/locales/gd_GB	2018-10-17 13:52:02.639311526 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/gu_IN	2018-10-17 13:52:02.551311298 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/gv_GB	2018-10-17 13:52:02.375310843 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/he_IL	2018-10-17 13:52:02.671311609 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/hi_IN	2018-10-17 13:52:02.383310865 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/hif_FJ	2018-10-17 13:52:02.395310895 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/hr_HR	2018-10-17 13:52:02.615311464 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/ht_HT	2018-10-17 13:52:02.543311277 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/hu_HU	2018-10-17 13:52:02.279310595 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/hy_AM	2018-10-17 13:52:02.515311205 +0000
@@ -75,6 +75,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/id_ID	2018-10-17 13:52:02.283310605 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/is_IS	2018-10-17 13:52:02.359310802 +0000
@@ -149,6 +149,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-17 13:52:01.987309840 +0000
+++ b/localedata/locales/it_IT	2018-10-17 13:52:02.519311216 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/ja_JP	2018-10-17 13:52:02.503311174 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/kab_DZ	2018-10-17 13:52:02.663311589 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/kk_KZ	2018-10-17 13:52:02.611311453 +0000
@@ -99,6 +99,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/km_KH	2018-10-17 13:52:02.351310781 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/kn_IN	2018-10-17 13:52:02.507311185 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ko_KR	2018-10-17 13:52:02.339310751 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ks_IN	2018-10-17 13:52:02.275310585 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/kw_GB	2018-10-17 13:52:02.535311257 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ky_KG b/localedata/locales/ky_KG
--- a/localedata/locales/ky_KG	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ky_KG	2018-10-17 13:52:02.171310317 +0000
@@ -82,6 +82,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/lb_LU	2018-10-17 13:52:02.615311464 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/lg_UG	2018-10-17 13:52:02.199310389 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/lij_IT	2018-10-17 13:52:02.527311236 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ln_CD	2018-10-17 13:52:02.211310419 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/lo_LA	2018-10-17 13:52:02.291310627 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/lt_LT	2018-10-17 13:52:02.355310792 +0000
@@ -163,6 +163,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/lv_LV	2018-10-17 13:52:02.539311268 +0000
@@ -125,6 +125,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/mg_MG	2018-10-17 13:52:02.255310533 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/mhr_RU	2018-10-17 13:52:02.611311453 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/mk_MK	2018-10-17 13:52:02.351310781 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/ml_IN	2018-10-17 13:52:02.363310812 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/ms_MY	2018-10-17 13:52:02.391310885 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/mt_MT	2018-10-17 13:52:02.635311515 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-17 13:52:02.295310636 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/nb_NO	2018-10-17 13:52:02.523311227 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/ne_NP	2018-10-17 13:52:02.587311391 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/nhn_MX	2018-10-17 13:52:02.511311195 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/niu_NU	2018-10-17 13:52:02.547311288 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/niu_NZ	2018-10-17 13:52:02.595311412 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/nl_NL	2018-10-17 13:52:02.355310792 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/nr_ZA	2018-10-17 13:52:02.659311578 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/oc_FR	2018-10-17 13:52:02.563311329 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/om_KE	2018-10-17 13:52:02.663311589 +0000
@@ -156,6 +156,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/or_IN	2018-10-17 13:52:02.671311609 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/os_RU	2018-10-17 13:52:02.655311567 +0000
@@ -71,6 +71,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/pa_IN	2018-10-17 13:52:02.387310874 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/pa_PK	2018-10-17 13:52:02.191310367 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/pl_PL	2018-10-17 13:52:02.267310564 +0000
@@ -130,6 +130,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/pt_PT	2018-10-17 13:52:02.651311557 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/quz_PE	2018-10-17 13:52:02.239310492 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/ro_RO	2018-10-17 13:52:02.399310906 +0000
@@ -142,6 +142,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/ru_RU	2018-10-17 13:52:02.295310636 +0000
@@ -69,6 +69,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/rw_RW	2018-10-17 13:52:02.559311319 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sa_IN	2018-10-17 13:52:02.535311257 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sd_IN	2018-10-17 13:52:02.515311205 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-17 13:52:02.563311329 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/se_NO	2018-10-17 13:52:02.387310874 +0000
@@ -221,6 +221,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sgs_LT	2018-10-17 13:52:02.555311309 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/shn_MM	2018-10-17 13:52:02.271310574 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/si_LK	2018-10-17 13:52:02.559311319 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sk_SK	2018-10-17 13:52:02.187310357 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/sl_SI	2018-10-17 13:52:02.251310523 +0000
@@ -2120,6 +2120,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/sm_WS	2018-10-17 13:52:02.263310554 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/so_SO	2018-10-17 13:52:02.183310347 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/sq_AL	2018-10-17 13:52:02.543311277 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/ss_ZA	2018-10-17 13:52:02.591311402 +0000
@@ -66,6 +66,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/st_ZA	2018-10-17 13:52:02.647311547 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/sv_SE	2018-10-17 13:52:02.383310865 +0000
@@ -173,6 +173,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/sw_KE	2018-10-17 13:52:02.343310760 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/ta_IN	2018-10-17 13:52:02.339310751 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/te_IN	2018-10-17 13:52:02.395310895 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/th_TH	2018-10-17 13:52:02.647311547 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/ti_ET	2018-10-17 13:52:02.371310834 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/tn_ZA	2018-10-17 13:52:02.623311484 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/to_TO	2018-10-17 13:52:02.567311340 +0000
@@ -36,6 +36,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/tpi_PG	2018-10-17 13:52:02.223310450 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-17 13:52:02.039309975 +0000
+++ b/localedata/locales/tr_TR	2018-10-17 13:52:02.415310947 +0000
@@ -2538,6 +2538,7 @@
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-17 13:52:02.687311650 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 (Russian) using openly available
+% transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions are welcome for the rest of the Cyrillic script in
+% Unicode: https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bug's doclet.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U + COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-17 13:52:02.039309975 +0000
+++ b/localedata/locales/ts_ZA	2018-10-17 13:52:02.555311309 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-17 13:52:02.039309975 +0000
+++ b/localedata/locales/unm_US	2018-10-17 13:52:02.531311247 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/ur_IN	2018-10-17 13:52:02.507311185 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/ur_PK	2018-10-17 13:52:02.275310585 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/ve_ZA	2018-10-17 13:52:02.599311423 +0000
@@ -65,6 +65,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/vi_VN	2018-10-17 13:52:02.571311351 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/wa_BE	2018-10-17 13:52:02.595311412 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/wo_SN	2018-10-17 13:52:02.631311506 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/xh_ZA	2018-10-17 13:52:02.603311433 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/yi_US	2018-10-17 13:52:02.267310564 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/yuw_PG	2018-10-17 13:52:02.259310543 +0000
@@ -40,6 +40,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/zh_CN	2018-10-17 13:52:02.607311443 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/zu_ZA	2018-10-17 13:52:02.627311495 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 

* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-10-13 16:58       ` Egor Kobylkin
  2018-10-15 11:04         ` Marko Myllynen
@ 2018-10-23 23:08         ` Rafal Luzynski
  1 sibling, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-23 23:08 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales
  Cc: mfabian, Marko Myllynen, Dmitry V. Levin, Volodymyr Lisivka,
	Max Kutny, danilo

Hi Egor,

Thank you for your updates and again I'm sorry for my delayed response.
A general remark about this: if you are in a hurry and you need the
corrected transliteration rules for yourself or for your users then
you don't have to wait for the patch to be reviewed and accepted here.
You can make your own locale and use it, you don't need to rebuild glibc,
you don't even need root privileges to do it.  The locale data subsystem
is designed to allow users to create and use their own locales.

I have seen and tested locally your newer patch [1] but I will reply
in this thread because I think it is easier to reply in context.

I would like to summarize the differences between v5 [2] and v6 to make
sure that I noticed them all and that you have not introduced any changes
inadvertently.  (Yes, that means I have skipped another patch which you
sent between those two.)

* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* You consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".

Again I must say that I had a lot of technical difficulty applying the
patch and I had to rework it manually because it is not applicable as
it is now.  Below I explain how to make a technically correct patch:

13.10.2018 18:58 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> 
> Hi Rafal,
> 
> Thanks for the thorough checking, it really helps.
> 
> On 13.10.2018 02:59, Rafal Luzynski wrote:
> > Technical issue:  Please either attach your patch to the email 
> > message or paste it inline, not both.  The patch as it is now is not 
> > applicable. I had to edit it manually to apply.
> >> diff -uNr a/localedata/locales/C b/localedata/locales/C --- 
> >> a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000 +++ 
> >> b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
> > 
> > There is no such file.  Where have you got the source code from?
> > Are you sure this is glibc? :-)
> 
> I was running my patch process against the Ubuntu 18.04 version of
> localedata/locales. Now I have checked out the GitHub glibc source v2.28
> and done the same. [...]

Remarks:

* Please use the repository at https://sourceware.org/git/?p=glibc.git
  rather than a copy at GitHub.
* Please use the master branch rather than 2.28.
* Commit your work locally.
* Use "git format-patch" (e.g., "git format-patch HEAD^..HEAD") to generate
  the patch, then you can email it to this list.
* You can email it inline or, if your email client breaks the lines and
  inserts other unnecessary characters, send it as an attachment.
* Use "git pull --rebase" to keep your work up to date.
* Read the Contribution Checklist [3] for more details.

> 
> >> [...] From this patch I have excluded locales that already mention 
> >> cyrillic or have a transliteration table for it: az_AZ 
> >> iso14651_t1_common ky_KG mn_MN sr_RS tg_TJ tk_TM tt_RU uk_UA uz_UZ 
> >> uz_UZ@cyrillic
> > 
> > I confirm that these locales are excluded and there are no other 
> > missing locales.
> 
> Because of the surprisingly different list of locales between Ubuntu and
> glibc there is now a different list of excluded ones as well.
> 
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
> uk_UA
> 
> az_AZ, ky_KG are now included

As far as I can see, there are no other differences between those two
patches.

> because they don't have cyrillic translit
> in glibc. iso14651_t1_common is still implicitly excluded, because it
> doesn't have 'translit_end' string.
> 
> Somehow az_AZ and tr_TR from glibc fail to transliterate Cyrillic even
> after the patch applied (az_AZ is explicitly including tr_TR). I do not
> see a reason, maybe you could check?

I noticed that az_AZ does not build at all, localedef program reports
a "circular dependency" (if I recall correctly).  I think that since az_AZ
contains “copy "tr_TR"” and tr_TR already contains (in your patch)
“include "translit_cyrillic";""” you should just remove
“include "translit_cyrillic";""” from az_AZ which effectively means that
there are no changes in az_AZ.  Optionally, you can add a comment to az_AZ
to explain why it does not contain “include "translit_cyrillic";""” and to
make sure that if anyone removes “copy "tr_TR"” ever in the future, the
“include "translit_cyrillic";""” will be added at the same time.  I have
verified that removing that line makes the locale data build without an
error but I have not yet verified that they work as expected.

> > Regarding the tests, I think there is no complete transliteration 
> > test suite at the moment.  Probably the only test is 
> > localedata/bug-iconv-trans.c. You can also see the collation tests 
> > placed in the same directory, they use those multiple *.UTF-8.in 
> > files.
> > 
> > You can skip the tests for now.
> 
> In the copy of localedata/bug-iconv-trans.c lines 10-11 we could just
> change the list of the symbols we are now transliterating
> 
>   const char str[] = "ÄäÖöÜüß";
>   const char expected[] = "AEaeOEoeUEuess";
> 
> like this
> 
>   const char str[] =
> "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩъЫьЭЮЯабвгдежзийклмнопрстуу́фхцчшщЪыЬэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ
> ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
>   const char expected[] =
> "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh
> shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`
> T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`
> Y`y`'";
> 
> First I thought they could just be added but not all locales
> transliterate Umlauts so just extending the current test won't do as it
> will fail for those locales.

I noticed that you pasted a patch in a Bugzilla comment. [4] If I understand
correctly you suggest reworking the existing test case to test Cyrillic
transliteration instead of German.  Please don't do it: the existing test
cases may be extended but must not be removed.  I think we should rework
this test case to handle multiple locales and multiple transliteration
pairs; optionally we can add a new case instead.  Currently I lean toward
reworking the existing test case.
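
One possible shape for such a reworked test is a table of (locale, input,
expected) rows.  The sketch below is in Python rather than the C of
localedata/bug-iconv-trans.c, and it shells out to the iconv command line
used in this thread instead of calling iconv(3).  The German pair is the one
quoted above; the locale names, the Cyrillic row and the helper names
iconv_translit and run_cases are assumptions made for illustration only.

```python
import os
import subprocess


def iconv_translit(locale, text):
    """Run `iconv -f UTF-8 -t ASCII//TRANSLIT` under the given locale."""
    env = dict(os.environ, LC_ALL=locale)
    proc = subprocess.run(
        ["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return proc.stdout.decode("ascii")


# (locale, input, expected).  The locale names are assumptions; the
# Cyrillic row can only pass once translit_cyrillic is installed.
CASES = [
    ("de_DE.UTF-8", "ÄäÖöÜüß", "AEaeOEoeUEuess"),
    ("ru_RU.UTF-8", "\u0448", "sh"),  # CYRILLIC SMALL LETTER SHA
]


def run_cases(cases, translit=iconv_translit):
    """Return (locale, input, expected, actual) for every failing row."""
    failures = []
    for locale, text, expected in cases:
        actual = translit(locale, text)
        if actual != expected:
            failures.append((locale, text, expected, actual))
    return failures
```

Calling run_cases(CASES) requires the listed locales to be installed.
Passing another callable as translit lets the table logic be exercised
without iconv.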

> >> [...] diff -uNr a/localedata/locales/am_ET 
> >> b/localedata/locales/am_ET --- a/localedata/locales/am_ET 
> >> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET 
> >> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A> 
> >> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C> 
> >> <U0060><U0031><U0030><U0030><U0030><U0030> +include 
> >> "translit_cyrillic";"" translit_end % END LC_CTYPE
> > 
> > Shouldn't “include "translit_cyrillic";""” be placed before the 
> > custom rules, together with other includes?  The same in more files, 
> > I will not mention them all.
> 
> If I recall correctly it is because of the
> "translit_end
> END LC_CTYPE"
> part at the end of the translit_cyrillic. This way it works for any
> locale, regardless whether it has translit itself or not. And being at
> the end it does not supersede any previous transliteration that may be
> there for a reason.
> 
> As with some other comments, I am not super familiar with the formats of
> glibc files. So if you have a definitive suggestion - pls. formulate it
> as an imperative, not a question.

I feel like a newcomer here so it was meant to be a question to other
more experienced maintainers but probably it's time to change this attitude.
So, also taking into account what Marko wrote, [5] please put the include
directive after all other include directives, or after the "translit_start"
directive if there are no other includes, rather than putting it just before
"translit_end", even if putting it at the end works sometimes or even
always.  In the same way you put #include's near the top of the file when
writing a C program, even if sometimes you may put them anywhere and it will
work.  If you use a script to insert your include directives then please
rework it; if you insert them manually then just move them manually.
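
The placement rule above can be sketched as a small Python function.  This
is not the script that generated the patch; the function name
add_cyrillic_include and the simplified line matching are assumptions made
for illustration.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(lines):
    """Return a copy of `lines` with INCLUDE_LINE placed after the last
    include inside the translit section, or right after translit_start
    when the section has no includes.  Files without a translit section,
    or that already include translit_cyrillic, are returned unchanged."""
    start = None
    last_include = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped == "translit_start":
            start = i
        elif stripped == "translit_end":
            break
        elif start is not None and stripped.startswith("include"):
            if "translit_cyrillic" in stripped:
                return list(lines)  # already present, nothing to do
            last_include = i
    if start is None:
        return list(lines)  # no translit section at all
    anchor = last_include if last_include is not None else start
    return list(lines[: anchor + 1]) + [INCLUDE_LINE] + list(lines[anchor + 1 :])


# Shaped like the ts_ZA hunk in the patch.
sample = [
    "translit_start",
    'include  "translit_combining";""',
    "translit_end",
    "END LC_CTYPE",
]
print("\n".join(add_cyrillic_include(sample)))
```

Running the function a second time on its own output leaves the lines
unchanged, so a script built this way can be re-run safely.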

> >> [...] +translit_start + +% CYRILLIC CAPITAL LETTER IO +<U0401> 
> >> <U00CB>;"<U0059><U004F>"
> > 
> > This says that for ASCII (GOST 7.79 System B) you would like to 
> > transliterate "Ё" as "YO" but the table in Wikipedia says "Yo".  I 
> > understand that one or another may be correct depending on the 
> > context but we should be consistent and also better let's stick with 
> > the standard.
> 
> The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
> example for "Сх" and "Ш" that would both transliterate to Sh:
> With SH:"Схема"->"Shema" but "Шема"->"SHema"
> With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
> This is important e.g. for renaming files, grouping as in using uniq etc.

I understand this idea.  Is this part of any existing standard?  I can't
see it regulated by GOST 7.79.

I'd rather not include transliteration rules which seem reasonable to
us (the developers) but are not known to, and therefore not accepted by,
the outside world.
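
The collision argument quoted above can be checked with a toy Python model.
Only the letters needed for the two example words are mapped; the table
names ALL_CAPS and TITLE_CASE and the helper names are assumptions made for
illustration, and this is not the full translit_cyrillic table.

```python
# Only the letters needed for the two example words are mapped.
BASE = {
    "\u0421": "S",  # CYRILLIC CAPITAL LETTER ES
    "\u0445": "h",  # CYRILLIC SMALL LETTER HA
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043c": "m",  # CYRILLIC SMALL LETTER EM
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
}
ALL_CAPS = {**BASE, "\u0428": "SH"}    # capital SHA as in the patch
TITLE_CASE = {**BASE, "\u0428": "Sh"}  # capital SHA in title case

# The two words from the quoted example.
WORDS = ["\u0421\u0445\u0435\u043c\u0430", "\u0428\u0435\u043c\u0430"]


def transliterate(text, table):
    """Replace each character by its table entry; keep unmapped ones."""
    return "".join(table.get(ch, ch) for ch in text)


def has_collision(words, table):
    """True if two different inputs produce the same output."""
    return len({transliterate(w, table) for w in words}) < len(words)


for name, table in (("ALL_CAPS", ALL_CAPS), ("TITLE_CASE", TITLE_CASE)):
    print(name, [transliterate(w, table) for w in WORDS],
          "collision" if has_collision(WORDS, table) else "distinct")
```

With ALL_CAPS the two words come out as "Shema" and "SHema"; with
TITLE_CASE both come out as "Shema".  Whether avoiding that collision
justifies departing from a published standard is the open question here.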

> 
> > 
> >> +% CYRILLIC CAPITAL LETTER DJE +<U0402> <U0110>;"<U0044><U004A>"
> > 
> > This says "DJ" but System B does not mention it.  Where does it come 
> > from? Also, I think it should be "Dj" rather than "DJ".
> I took the first two letters from its name.

As I said previously, I would like to add more Cyrillic letters even if
they are not regulated by any standard.  But let's separate them and make
it clear that these rules are based on GOST 7.79 and those are our own
invention (or come from other standard etc.)  I think that all these
rules may even be in the same file but in different parts of it.

> >> [...] +% CYRILLIC UNDEFINED +<U0423><U0301> 
> >> <U00DA>;"<U0055><U0060>"
> > 
> > 1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
> > 2. OK, the System A table mentions this letter but System B does not.
> > Somehow we should handle it.  I think that "U`" is the best we can do
> > for now. 3. It must be tested whether this actually works.
> 1. Let's do it just before you are ready to commit the patch, because it
> breaks formulas in my worksheet and I will have to do it manually?
> 3. I have tested and it doesn't work/gets ignored. But if you were to
> handle COMBINING it would work, wouldn't it?

My guess is that since translit_combining just removes all those combining
diacritic characters and translit_combining is usually included before
translit_cyrillic then <U0301> is removed even before <U0423> is taken
into account.  My other guess is that it might work well if you
just removed this rule: <U0423> would be transliterated to "U" and <U0301>
would remain unchanged and eventually those two characters would produce
"Ú".  But, again, that's just a guess, I have not tested it.

> >> [...] +% CYRILLIC CAPITAL LETTER HA +<U0425> <U0048>;<U0058>
> > 
> > I don't think that "H" is unavailable in any encoding therefore it 
> > will always be transliterated as "H" and never as "X".  We can't
> > help it and I don't think it is bad.
> > 
> But we can keep this for when/if there is a way to explicitly request
> transcription instead of transliteration.

Note that either it will make the test cases fail or we will have to
prepare the test cases to deliberately skip the transliteration of <U0425>
into "X" because "H" will always work.  We can't force iconv
to choose the second transliteration rule if the first one works.

That means we will have a problem constructing the test cases.
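
The effect can be illustrated with a simplified Python model of the
behaviour described above: the first candidate that the target charset can
represent wins.  This is not glibc's implementation.  The candidate lists
are copied from the translit_cyrillic hunk in this thread, Python codec
names stand in for iconv target charsets, and the helper names are
assumptions made for illustration.

```python
# Candidate lists copied from the translit_cyrillic rules in the patch.
CANDIDATES = {
    "\u0445": ["h", "x"],        # HA:   <U0068>;<U0078>
    "\u0447": ["\u010d", "ch"],  # CHE:  <U010D>;"<U0063><U0068>"
    "\u0448": ["\u0161", "sh"],  # SHA:  <U0161>;"<U0073><U0068>"
    "\u044b": ["y", "y`"],       # YERU: <U0079>;"<U0079><U0060>"
}


def encodable(text, charset):
    """True if every character of `text` exists in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def pick(ch, charset):
    """Return `ch` itself if representable, else the first candidate
    that is, else '?' (as in the failing output shown in this thread)."""
    if encodable(ch, charset):
        return ch
    for candidate in CANDIDATES.get(ch, []):
        if encodable(candidate, charset):
            return candidate
    return "?"


for charset in ("ascii", "iso8859_15"):
    print(charset, {ch: pick(ch, charset) for ch in CANDIDATES})
```

In this model the second candidates "x" and "y`" are never selected for any
charset that contains ASCII, while "ch" and "sh" are selected only when the
accented first candidate is missing from the target charset.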

> >> +% CYRILLIC CAPITAL LETTER TSE +<U0426> <U0043>;"<U0043><U005A>"
> > 
> > 1. "CZ" - maybe should be "Cz"?> 2. Are we able to implement the
> > rule: "c before i, e, y, j"?
> > 
> 1. see for CYRILLIC CAPITAL LETTER IO
> 2. not sure what you are talking about in 2. but I believe it's not
> possible as per Marko's email.

Hm... I can't find a good example now.  Maybe I was misled by the rules
of Cyrillic transliteration which I learned at school and which are not
necessarily universal and not necessarily useful for English readers.

> >> +% CYRILLIC CAPITAL LETTER HARD SIGN +<U042A> 
> >> <U02BA>;"<U0041><U0060>"
> > 
> > "A`" is only for Bulgarian and should go to bg_BG.  How should we 
> > transliterate an upper case hard sign to plain ASCII?  I think that 
> > just "``", same as lower case.
> This is to avoid collision.

What collision?

> Besides AFAIK e.g. in Russian there is no
> capital hard sign because there are no words starting with it.

True but it can be used in ALL UPPERCASE text.  Therefore we need a clear
and correct transliteration rule for it.

> 
> > 
> >> +% CYRILLIC CAPITAL LETTER YERU +<U042B> <U0059>;"<U0059><U0060>"
> > 
> > Again, as "Y" is always available it will never be transliterated as 
> > "Y`".
> > 
> But we can keep this for when/if there is a way to explicitly request
> transcription instead of transliteration.

Again, it will be difficult or impossible to construct a correct test case
and we must be aware of this.

Regards,

Rafal


[1] https://sourceware.org/ml/libc-alpha/2018-10/msg00300.html
[2] https://sourceware.org/ml/libc-alpha/2018-10/msg00213.html
[3] https://sourceware.org/glibc/wiki/Contribution%20checklist
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=2872#c47
[5] https://sourceware.org/ml/libc-alpha/2018-10/msg00232.html

* [PATCH v7] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (6 preceding siblings ...)
  2018-10-17 14:16   ` [PATCH v6] " Egor Kobylkin
@ 2018-11-01 22:51   ` Egor Kobylkin
  2018-11-02  0:00   ` [PATCH v8] " Egor Kobylkin
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-01 22:51 UTC (permalink / raw)
  To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen,
	Dmitry V. Levin
  Cc: Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 10881 bytes --]

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' line now immediately follows the last
'include "translit_XXX";""' line (previously it was inserted just before
translit_end).
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.


Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]

to localedata/locales/ and include it in all your locales going forward.

The patch is included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt

currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be included in the active locale
at compilation time to be used by iconv.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from
UTF-8 encoded text based on the Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription has only Latin/ASCII codes but can still
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice I have searched for all locales that already have an
'include .*translit.*;""' line and generated a patch for them.

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org>  (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-17  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
	System A transliteration and System B transcription table from
	Cyrillic to Latin/ASCII.
	* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
	LC_CTYPE translit section.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/ky_KG: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.
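
The entries above all make the same one-line change, so it can be
spot-checked mechanically. What follows is only a rough sketch and not
part of the patch: the helper name has_cyrillic_include is hypothetical,
and the example paths assume the command is run from the top of a glibc
source tree. It reports whether a locale source file carries the new
include line inside its translit_start ... translit_end block.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): exit 0 if the locale
# source file "$1" contains   include "translit_cyrillic";""
# between translit_start and translit_end, otherwise exit 1.
has_cyrillic_include() {
    awk '
        # Track whether we are inside the translit section.
        $1 == "translit_start" { in_block = 1; next }
        $1 == "translit_end"   { in_block = 0 }
        # Field splitting makes the check independent of the number
        # of spaces after the "include" keyword.
        in_block && $1 == "include" && $2 == "\"translit_cyrillic\";\"\"" {
            found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example use (paths are assumptions about the checkout layout):
# for f in localedata/locales/ru_RU localedata/locales/de_DE; do
#     has_cyrillic_include "$f" || echo "missing include: $f"
# done
```

The check only looks at the locale source text; it does not run
localedef or iconv, so it says nothing about whether the resulting
transliteration output is correct.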



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-v7-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch --]
[-- Type: text/x-patch; name="0001-v7-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch", Size: 45993 bytes --]

From 733055a6da290f32f508216519de715aa8b5b566 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Thu, 1 Nov 2018 23:46:03 +0100
Subject: [PATCH] v7 Locales: Cyrillic -> ASCII transliteration table [BZ
 #2872]

---
 localedata/locales/aa_DJ            | 1 +
 localedata/locales/af_ZA            | 1 +
 localedata/locales/ak_GH            | 1 +
 localedata/locales/am_ET            | 1 +
 localedata/locales/ar_EG            | 1 +
 localedata/locales/be_BY            | 1 +
 localedata/locales/bem_ZM           | 1 +
 localedata/locales/ber_DZ           | 1 +
 localedata/locales/ber_MA           | 1 +
 localedata/locales/bg_BG            | 1 +
 localedata/locales/bi_VU            | 1 +
 localedata/locales/bn_BD            | 1 +
 localedata/locales/bo_CN            | 1 +
 localedata/locales/ca_ES            | 1 +
 localedata/locales/ce_RU            | 1 +
 localedata/locales/cmn_TW           | 1 +
 localedata/locales/cs_CZ            | 1 +
 localedata/locales/cv_RU            | 1 +
 localedata/locales/cy_GB            | 1 +
 localedata/locales/da_DK            | 1 +
 localedata/locales/de_DE            | 1 +
 localedata/locales/dv_MV            | 1 +
 localedata/locales/dz_BT            | 1 +
 localedata/locales/el_GR            | 1 +
 localedata/locales/en_GB            | 1 +
 localedata/locales/en_NG            | 1 +
 localedata/locales/en_ZM            | 1 +
 localedata/locales/es_CU            | 1 +
 localedata/locales/es_ES            | 1 +
 localedata/locales/et_EE            | 1 +
 localedata/locales/fa_IR            | 1 +
 localedata/locales/ff_SN            | 1 +
 localedata/locales/fi_FI            | 1 +
 localedata/locales/fr_FR            | 1 +
 localedata/locales/ga_IE            | 1 +
 localedata/locales/gd_GB            | 1 +
 localedata/locales/gu_IN            | 1 +
 localedata/locales/gv_GB            | 1 +
 localedata/locales/he_IL            | 1 +
 localedata/locales/hi_IN            | 1 +
 localedata/locales/hif_FJ           | 1 +
 localedata/locales/hr_HR            | 1 +
 localedata/locales/ht_HT            | 1 +
 localedata/locales/hu_HU            | 1 +
 localedata/locales/hy_AM            | 1 +
 localedata/locales/id_ID            | 1 +
 localedata/locales/is_IS            | 1 +
 localedata/locales/it_IT            | 1 +
 localedata/locales/ja_JP            | 1 +
 localedata/locales/kab_DZ           | 1 +
 localedata/locales/kk_KZ            | 1 +
 localedata/locales/km_KH            | 1 +
 localedata/locales/kn_IN            | 1 +
 localedata/locales/ko_KR            | 1 +
 localedata/locales/ks_IN            | 1 +
 localedata/locales/kw_GB            | 1 +
 localedata/locales/ky_KG            | 1 +
 localedata/locales/lb_LU            | 1 +
 localedata/locales/lg_UG            | 1 +
 localedata/locales/lij_IT           | 1 +
 localedata/locales/ln_CD            | 1 +
 localedata/locales/lo_LA            | 1 +
 localedata/locales/lt_LT            | 1 +
 localedata/locales/lv_LV            | 1 +
 localedata/locales/mg_MG            | 1 +
 localedata/locales/mhr_RU           | 1 +
 localedata/locales/mk_MK            | 1 +
 localedata/locales/ml_IN            | 1 +
 localedata/locales/ms_MY            | 1 +
 localedata/locales/mt_MT            | 1 +
 localedata/locales/nan_TW@latin     | 1 +
 localedata/locales/nb_NO            | 1 +
 localedata/locales/ne_NP            | 1 +
 localedata/locales/nhn_MX           | 1 +
 localedata/locales/niu_NU           | 1 +
 localedata/locales/niu_NZ           | 1 +
 localedata/locales/nl_NL            | 1 +
 localedata/locales/nr_ZA            | 1 +
 localedata/locales/oc_FR            | 1 +
 localedata/locales/om_KE            | 1 +
 localedata/locales/or_IN            | 1 +
 localedata/locales/os_RU            | 1 +
 localedata/locales/pa_IN            | 1 +
 localedata/locales/pa_PK            | 1 +
 localedata/locales/pl_PL            | 1 +
 localedata/locales/pt_PT            | 1 +
 localedata/locales/quz_PE           | 1 +
 localedata/locales/ro_RO            | 1 +
 localedata/locales/ru_RU            | 1 +
 localedata/locales/rw_RW            | 1 +
 localedata/locales/sa_IN            | 1 +
 localedata/locales/sd_IN            | 1 +
 localedata/locales/sd_IN@devanagari | 1 +
 localedata/locales/se_NO            | 1 +
 localedata/locales/sgs_LT           | 1 +
 localedata/locales/shn_MM           | 1 +
 localedata/locales/si_LK            | 1 +
 localedata/locales/sk_SK            | 1 +
 localedata/locales/sl_SI            | 1 +
 localedata/locales/sm_WS            | 1 +
 localedata/locales/so_SO            | 1 +
 localedata/locales/sq_AL            | 1 +
 localedata/locales/ss_ZA            | 1 +
 localedata/locales/st_ZA            | 1 +
 localedata/locales/sv_SE            | 1 +
 localedata/locales/sw_KE            | 1 +
 localedata/locales/ta_IN            | 1 +
 localedata/locales/te_IN            | 1 +
 localedata/locales/th_TH            | 1 +
 localedata/locales/ti_ET            | 1 +
 localedata/locales/tn_ZA            | 1 +
 localedata/locales/to_TO            | 1 +
 localedata/locales/tpi_PG           | 1 +
 localedata/locales/tr_TR            | 1 +
 localedata/locales/ts_ZA            | 1 +
 localedata/locales/unm_US           | 1 +
 localedata/locales/ur_IN            | 1 +
 localedata/locales/ur_PK            | 1 +
 localedata/locales/ve_ZA            | 1 +
 localedata/locales/vi_VN            | 1 +
 localedata/locales/wa_BE            | 1 +
 localedata/locales/wo_SN            | 1 +
 localedata/locales/xh_ZA            | 1 +
 localedata/locales/yi_US            | 1 +
 localedata/locales/yuw_PG           | 1 +
 localedata/locales/zh_CN            | 1 +
 localedata/locales/zu_ZA            | 1 +
 127 files changed, 127 insertions(+)

diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
 space <U1361>
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % hoy-sadis followed by a vowel
 <U1205><U12A0>    <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
index cca7cc19af..3866f06004 100644
--- a/localedata/locales/cmn_TW
+++ b/localedata/locales/cmn_TW
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts.
 % LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 
 translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
 
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % Historicaly we used ISO-8869-2 and wrote digraphs
 % <U01C6> {dž}, <U01C9> {lj} and <U01CC> {nj}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 <U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
 <U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_cjk_variants";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_hangul";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts
 % LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
 
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
 %
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if t/scomma is not available, try first t/scedilla
 <U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
 translit_start
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % A-bole -> A-circonflecse -> AU
 <U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if digraphs are not available (this is the case with iso-8859-8)
 % then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v8] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (7 preceding siblings ...)
  2018-11-01 22:51   ` [PATCH v7] " Egor Kobylkin
@ 2018-11-02  0:00   ` Egor Kobylkin
  2018-11-02 22:22     ` Rafal Luzynski
  2018-11-14 21:25   ` [PATCH v9] " Egor Kobylkin
                     ` (4 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-02  0:00 UTC (permalink / raw)
  To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen,
	Dmitry V. Levin
  Cc: Volodymyr Lisivka, Max Kutny, danilo

[-- Attachment #1: Type: text/plain, Size: 10985 bytes --]

Changelog v8:
* Re-added the translit_cyrillic file that was missing from patch v7
(the script lacked a "git add").

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' line now immediately follows the
last 'include "translit_XXX";""' line (previously it was inserted just
before translit_end).
* Only the locales that already have an 'include .*translit.*;""' line
are patched (see the list of manual exclusions below; the full list of
included locales is in the commit section at the end of the email).
* Excluded az_AZ completely to avoid a circular reference from tr_TR via
“copy "tr_TR"”.


Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".
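
To illustrate the uppercase rule above, here is a minimal Python sketch
that formats one table entry in the <Uxxxx> notation used by locale
source files. The helper names are hypothetical and not part of glibc,
and the real translit_cyrillic file may list further fallback
alternatives per letter; only the "Ї" -> "YI" mapping is taken from this
changelog.

```python
def ucs_symbol(ch):
    """Return the <Uxxxx> symbolic name used in glibc locale sources."""
    return "<U%04X>" % ord(ch)


def format_translit_entry(source, replacement):
    """Format one transliteration line: source symbol, then quoted target.

    Hypothetical helper for illustration only; not part of glibc.
    """
    target = "".join(ucs_symbol(c) for c in replacement)
    return '%s "%s"' % (ucs_symbol(source), target)


# v6 behaviour: one uppercase Cyrillic letter maps to all-uppercase Latin.
print(format_translit_entry("\u0407", "YI"))
```

With the earlier "Yi" mapping the second symbol would have been <U0069>
(lowercase i) instead of <U0049>.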

Dear locale maintainers,

To fix glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

please add the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]

to localedata/locales/ and include it in all your locales going forward.

The patch is included inline below.

From this patch I have excluded the locales that already mention
Cyrillic or have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic texts in most locales, including ru_RU
[1] [8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
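
The before/after check above could be automated roughly as follows. This
is only a sketch: the function names are hypothetical, it assumes a glibc
iconv binary on PATH and that the requested locale is installed, and it
relies on the behaviour shown above, where iconv emits "?" for characters
it cannot transliterate.

```python
import os
import subprocess


def translit_to_ascii(text, locale="ru_RU.UTF-8"):
    """Pipe text through `iconv -f UTF-8 -t ASCII//TRANSLIT` under a locale."""
    env = dict(os.environ, LC_ALL=locale)
    result = subprocess.run(
        ["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=env,
        check=False,  # iconv may exit non-zero if something cannot be converted
    )
    return result.stdout.decode("ascii", errors="replace")


def substituted_count(original, converted):
    """Count '?' placeholders that were not already present in the input."""
    return max(0, converted.count("?") - original.count("?"))
```

A caller would pass the same input text to both functions and treat a
non-zero substituted_count as a transliteration failure for that locale.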


The root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. Furthermore, the table has to be included in the active
locale at compilation time for iconv to use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from
UTF-8 encoded text in a Cyrillic alphabet to an ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9:1995.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters yet
can still be read by a native speaker. Among other things, this is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, that lack the
corresponding fonts, or that cannot handle UTF-8.

The transliteration table itself is attached as the file
translit_cyrillic [7]. Its content (the mapping) is based on the ISO
9:1995 standard [10] and its derivative GOST 7.79-2000, whose official
source is the Federal Agency on Technical Regulating and Metrology of the
Russian Federation [2]. In practice, an independent but mostly identical
source [3] was used, and the table was prepared in a spreadsheet [6].

The documentation [5]
(http://man7.org/linux/man-pages/man5/locale.5.html) suggests that a
transliteration table is included by adding the line
'include "translit_cyrillic";""' to the translit_start section of
LC_CTYPE. In practice, I searched for all locales that already have an
'include .*translit.*;""' line and generated a patch for them.
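
The patch generation described above, combined with the v7 placement
rule, can be sketched in Python as follows. This is not the script that
was actually used: the function name and the stricter regular expression
are assumptions for illustration, and lines are assumed to have their
trailing newlines stripped.

```python
import re

# Matches lines such as: include  "translit_combining";""
# (narrower than the 'include .*translit.*;""' search mentioned above)
TRANSLIT_INCLUDE = re.compile(r'^include\s+"translit_[^"]*";""\s*$')
NEW_INCLUDE = 'include "translit_cyrillic";""'


def add_cyrillic_include(lines):
    """Return a copy of lines with NEW_INCLUDE after the last translit include.

    Files with no translit include, or that already include
    translit_cyrillic, are returned unchanged.
    """
    lines = list(lines)
    if any(line.strip() == NEW_INCLUDE for line in lines):
        return lines
    matches = [i for i, line in enumerate(lines) if TRANSLIT_INCLUDE.match(line)]
    if not matches:
        return lines
    last = matches[-1]
    return lines[: last + 1] + [NEW_INCLUDE] + lines[last + 1:]
```

Placing the new line after the last existing include, rather than before
translit_end, keeps it ahead of any locale-specific rules that follow the
includes, as in the nb_NO and sv_SE hunks of the patch.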

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to the locale maintainers at the email addresses listed in the
files. Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny
<mkutny@gmail.com> (uk_UA), and Данило Шеган <danilo@gnome.org> (sr_RS)
have confirmed the exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-11-02  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9:1995 (GOST 7.79
	System A) transliteration and GOST 7.79 System B transcription
	table from Cyrillic to Latin/ASCII.
	* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/ky_KG: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-v8-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch --]
[-- Type: text/x-patch; name="0001-v8-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch", Size: 59775 bytes --]

From efdde90219d25ecbdc762f113d357cf7de08fc94 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Fri, 2 Nov 2018 00:56:35 +0100
Subject: [PATCH] v8 Locales: Cyrillic -> ASCII transliteration table [BZ
 #2872]

---
 localedata/locales/aa_DJ             |   1 +
 localedata/locales/af_ZA             |   1 +
 localedata/locales/ak_GH             |   1 +
 localedata/locales/am_ET             |   1 +
 localedata/locales/ar_EG             |   1 +
 localedata/locales/be_BY             |   1 +
 localedata/locales/bem_ZM            |   1 +
 localedata/locales/ber_DZ            |   1 +
 localedata/locales/ber_MA            |   1 +
 localedata/locales/bg_BG             |   1 +
 localedata/locales/bi_VU             |   1 +
 localedata/locales/bn_BD             |   1 +
 localedata/locales/bo_CN             |   1 +
 localedata/locales/ca_ES             |   1 +
 localedata/locales/ce_RU             |   1 +
 localedata/locales/cmn_TW            |   1 +
 localedata/locales/cs_CZ             |   1 +
 localedata/locales/cv_RU             |   1 +
 localedata/locales/cy_GB             |   1 +
 localedata/locales/da_DK             |   1 +
 localedata/locales/de_DE             |   1 +
 localedata/locales/dv_MV             |   1 +
 localedata/locales/dz_BT             |   1 +
 localedata/locales/el_GR             |   1 +
 localedata/locales/en_GB             |   1 +
 localedata/locales/en_NG             |   1 +
 localedata/locales/en_ZM             |   1 +
 localedata/locales/es_CU             |   1 +
 localedata/locales/es_ES             |   1 +
 localedata/locales/et_EE             |   1 +
 localedata/locales/fa_IR             |   1 +
 localedata/locales/ff_SN             |   1 +
 localedata/locales/fi_FI             |   1 +
 localedata/locales/fr_FR             |   1 +
 localedata/locales/ga_IE             |   1 +
 localedata/locales/gd_GB             |   1 +
 localedata/locales/gu_IN             |   1 +
 localedata/locales/gv_GB             |   1 +
 localedata/locales/he_IL             |   1 +
 localedata/locales/hi_IN             |   1 +
 localedata/locales/hif_FJ            |   1 +
 localedata/locales/hr_HR             |   1 +
 localedata/locales/ht_HT             |   1 +
 localedata/locales/hu_HU             |   1 +
 localedata/locales/hy_AM             |   1 +
 localedata/locales/id_ID             |   1 +
 localedata/locales/is_IS             |   1 +
 localedata/locales/it_IT             |   1 +
 localedata/locales/ja_JP             |   1 +
 localedata/locales/kab_DZ            |   1 +
 localedata/locales/kk_KZ             |   1 +
 localedata/locales/km_KH             |   1 +
 localedata/locales/kn_IN             |   1 +
 localedata/locales/ko_KR             |   1 +
 localedata/locales/ks_IN             |   1 +
 localedata/locales/kw_GB             |   1 +
 localedata/locales/ky_KG             |   1 +
 localedata/locales/lb_LU             |   1 +
 localedata/locales/lg_UG             |   1 +
 localedata/locales/lij_IT            |   1 +
 localedata/locales/ln_CD             |   1 +
 localedata/locales/lo_LA             |   1 +
 localedata/locales/lt_LT             |   1 +
 localedata/locales/lv_LV             |   1 +
 localedata/locales/mg_MG             |   1 +
 localedata/locales/mhr_RU            |   1 +
 localedata/locales/mk_MK             |   1 +
 localedata/locales/ml_IN             |   1 +
 localedata/locales/ms_MY             |   1 +
 localedata/locales/mt_MT             |   1 +
 localedata/locales/nan_TW@latin      |   1 +
 localedata/locales/nb_NO             |   1 +
 localedata/locales/ne_NP             |   1 +
 localedata/locales/nhn_MX            |   1 +
 localedata/locales/niu_NU            |   1 +
 localedata/locales/niu_NZ            |   1 +
 localedata/locales/nl_NL             |   1 +
 localedata/locales/nr_ZA             |   1 +
 localedata/locales/oc_FR             |   1 +
 localedata/locales/om_KE             |   1 +
 localedata/locales/or_IN             |   1 +
 localedata/locales/os_RU             |   1 +
 localedata/locales/pa_IN             |   1 +
 localedata/locales/pa_PK             |   1 +
 localedata/locales/pl_PL             |   1 +
 localedata/locales/pt_PT             |   1 +
 localedata/locales/quz_PE            |   1 +
 localedata/locales/ro_RO             |   1 +
 localedata/locales/ru_RU             |   1 +
 localedata/locales/rw_RW             |   1 +
 localedata/locales/sa_IN             |   1 +
 localedata/locales/sd_IN             |   1 +
 localedata/locales/sd_IN@devanagari  |   1 +
 localedata/locales/se_NO             |   1 +
 localedata/locales/sgs_LT            |   1 +
 localedata/locales/shn_MM            |   1 +
 localedata/locales/si_LK             |   1 +
 localedata/locales/sk_SK             |   1 +
 localedata/locales/sl_SI             |   1 +
 localedata/locales/sm_WS             |   1 +
 localedata/locales/so_SO             |   1 +
 localedata/locales/sq_AL             |   1 +
 localedata/locales/ss_ZA             |   1 +
 localedata/locales/st_ZA             |   1 +
 localedata/locales/sv_SE             |   1 +
 localedata/locales/sw_KE             |   1 +
 localedata/locales/ta_IN             |   1 +
 localedata/locales/te_IN             |   1 +
 localedata/locales/th_TH             |   1 +
 localedata/locales/ti_ET             |   1 +
 localedata/locales/tn_ZA             |   1 +
 localedata/locales/to_TO             |   1 +
 localedata/locales/tpi_PG            |   1 +
 localedata/locales/tr_TR             |   1 +
 localedata/locales/translit_cyrillic | 383 +++++++++++++++++++++++++++
 localedata/locales/ts_ZA             |   1 +
 localedata/locales/unm_US            |   1 +
 localedata/locales/ur_IN             |   1 +
 localedata/locales/ur_PK             |   1 +
 localedata/locales/ve_ZA             |   1 +
 localedata/locales/vi_VN             |   1 +
 localedata/locales/wa_BE             |   1 +
 localedata/locales/wo_SN             |   1 +
 localedata/locales/xh_ZA             |   1 +
 localedata/locales/yi_US             |   1 +
 localedata/locales/yuw_PG            |   1 +
 localedata/locales/zh_CN             |   1 +
 localedata/locales/zu_ZA             |   1 +
 128 files changed, 510 insertions(+)
 create mode 100644 localedata/locales/translit_cyrillic

diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
 space <U1361>
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % hoy-sadis followed by a vowel
 <U1205><U12A0>    <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
index cca7cc19af..3866f06004 100644
--- a/localedata/locales/cmn_TW
+++ b/localedata/locales/cmn_TW
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts.
 % LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 
 translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
 
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % Historicaly we used ISO-8869-2 and wrote digraphs
 % <U01C6> {dž}, <U01C9> {lj} and <U01CC> {nj}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 <U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
 <U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_cjk_variants";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_hangul";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts
 % LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
 
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
 %
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if t/scomma is not available, try first t/scedilla
 <U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
new file mode 100644
index 0000000000..073a138a6a
--- /dev/null
+++ b/localedata/locales/translit_cyrillic
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995 
+% It implements the GOST_7.79 System A (Latin Script) as a first 
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference. 
+% The System B is extended from GOST_7.79-Russian using open sources 
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced 
+% in that bug's doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
 translit_start
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % A-bole -> A-circonflecse -> AU
 <U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if digraphs are not available (this is the case with iso-8859-8)
 % then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
-- 
2.17.1
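
Each entry in translit_cyrillic above lists its replacement candidates separated by ";", with System A first and the ASCII-only System B second. The following is a minimal sketch of how such an ordered fallback can be modeled, assuming the first candidate that is representable in the target charset wins. The names CANDIDATES, encodable and transliterate are hypothetical and this is not glibc's implementation; only three table entries are copied for illustration.

```python
# Hypothetical model of ';'-separated fallback candidates; not glibc code.
CANDIDATES = {
    "\u0436": ["\u017e", "zh"],  # <U0436> <U017E>;"<U007A><U0068>"
    "\u0447": ["\u010d", "ch"],  # <U0447> <U010D>;"<U0063><U0068>"
    "\u0430": ["a"],             # <U0430> <U0061>
}


def encodable(text, charset):
    """Return True if text can be represented in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, charset):
    """Replace each unrepresentable character by its first usable candidate."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)
            continue
        for candidate in CANDIDATES.get(ch, []):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            out.append("?")  # no candidate fits: mimic iconv's '?' output
    return "".join(out)


print(transliterate("\u0436\u0430\u0447", "ascii"))
print(transliterate("\u0436\u0430\u0447", "iso-8859-15"))
```

With ASCII as the target only the System B candidates fit. With ISO-8859-15 the first letter keeps its System A form while the last one falls back to System B, because U+010D is not part of that charset.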


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* Re: [PATCH v8] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-02  0:00   ` [PATCH v8] " Egor Kobylkin
@ 2018-11-02 22:22     ` Rafal Luzynski
  2018-11-02 23:27       ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-11-02 22:22 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Marko Myllynen,
	Dmitry V. Levin
  Cc: Volodymyr Lisivka, Max Kutny, danilo

Hi Egor,

I have applied your patch locally and I am going to start reviewing it.
I can tell you already that it applies correctly but git reports these
warnings:

    Applying: v8 Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
    .git/rebase-apply/patch:1520: trailing whitespace.
    % i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995 
    .git/rebase-apply/patch:1521: trailing whitespace.
    % It implements the GOST_7.79 System A (Latin Script) as a first 
    .git/rebase-apply/patch:1523: trailing whitespace.
    % https://en.wikipedia.org/wiki/ISO_9 for reference. 
    .git/rebase-apply/patch:1524: trailing whitespace.
    % The System B is extended from GOST_7.79-Russian using open sources 
    .git/rebase-apply/patch:1535: trailing whitespace.
    % Generated from UnicodeData.txt with a spreadsheet referenced 
    warning: 5 lines add whitespace errors.
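
Warnings of this kind can be caught before sending. A minimal sketch, assuming the patch is available as a string; the function name trailing_whitespace_lines is hypothetical, and the line numbers it reports count lines of the given text, not git's internal numbering:

```python
import re


def trailing_whitespace_lines(patch_text):
    """Return (line_number, line) for added lines that end in blanks or tabs."""
    hits = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Only lines added by the patch matter; skip the '+++ b/file' header.
        if line.startswith("+") and not line.startswith("+++"):
            if re.search(r"[ \t]+$", line):
                hits.append((number, line))
    return hits


sample = "\n".join([
    "+++ b/localedata/locales/translit_cyrillic",
    "+% It implements the GOST_7.79 System A (Latin Script) as a first ",
    "+% option and System B Cyrillic (ASCII) as a second option. Check",
])
for number, _line in trailing_whitespace_lines(sample):
    print(f"{number}: trailing whitespace.")
```

Running "git diff --check" in the working tree reports the same class of problems without any extra script.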

Also, the commit message is missing from your patch, probably because it
is missing from your local repository.  Please re-add it and remember
that it must contain a summary like this:

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
	System A transliteration System B transcription table from Cyrillic
	to Latin/ASCII.
	* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
	LC_CTYPE translit section.
	* localedata/locales/af_ZA: Likewise.

Hm... as I look at this now I think it should rather be:

	[BZ #2872]
	* localedata/locales/translit_cyrillic: New file.
	* localedata/locales/aa_DJ (LC_CTYPE): Add
	'include "translit_cyrillic";""'.
	* localedata/locales/af_ZA (LC_CTYPE): Likewise.

... and so on.  Optionally you can use:

	* localedata/locales/translit_cyrillic: New file.  Supports
	ISO 9.1995, GOST 7.79 System A transliteration System B
	transcription table from Cyrillic to Latin/ASCII.

I would appreciate hints from more experienced maintainers about how to
write the ChangeLog entry correctly.
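
With this many locale files the entries are easier to generate than to type. A minimal sketch that follows the second variant proposed above; the function name changelog_entries and its parameters are hypothetical, and whether this layout is the one the maintainers finally want is an open question:

```python
def changelog_entries(new_file, locales, bug="2872"):
    """Build tab-indented ChangeLog lines for one new file and many locales."""
    lines = [
        f"\t[BZ #{bug}]",
        f"\t* localedata/locales/{new_file}: New file.",
    ]
    for index, name in enumerate(locales):
        if index == 0:
            # The first locale spells out the change in full.
            lines.append(f"\t* localedata/locales/{name} (LC_CTYPE): Add")
            lines.append(f"\t'include \"{new_file}\";\"\"'.")
        else:
            # Every further locale refers back to the first one.
            lines.append(f"\t* localedata/locales/{name} (LC_CTYPE): Likewise.")
    return "\n".join(lines)


print(changelog_entries("translit_cyrillic", ["aa_DJ", "af_ZA"]))
```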

2.11.2018 01:00 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)

I confirm that this is the only relevant difference between v6 and v8.
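
For reference, the placement rule can be sketched as follows. This is only an illustration of the rule as described, not the script actually used to produce the patch; the names INCLUDE_RE, NEW_LINE and add_cyrillic_include are hypothetical:

```python
import re

# Matches lines such as: include  "translit_combining";""
INCLUDE_RE = re.compile(r'^include\s+"translit_[^"]*";""\s*$')
NEW_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Insert NEW_LINE right after the last existing translit include."""
    lines = locale_text.splitlines()
    if NEW_LINE in lines:
        return locale_text  # already patched
    last = None
    for index, line in enumerate(lines):
        if INCLUDE_RE.match(line):
            last = index
    if last is None:
        return locale_text  # no translit include: leave the locale alone
    lines.insert(last + 1, NEW_LINE)
    tail = "\n" if locale_text.endswith("\n") else ""
    return "\n".join(lines) + tail


sample = 'translit_start\ninclude  "translit_combining";""\ntranslit_end\n'
print(add_cyrillic_include(sample), end="")
```

Locales without any translit include are returned unchanged, which matches the second point quoted below.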

> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)

Has this list changed, that is, has any locale been added or removed?

> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> “copy "tr_TR"”.

True, this is another difference, and I hope it is correct (I have not
tested it yet).

> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.

Correct.

> * Consistently transliterate single uppercase Cyrillic letters
>   to sequences of all uppercase Latin letters in all languages (whenever
>   a Cyrillic letter is transliterated to more than one Latin letter),
>   for example "Ї" is now transliterated as "YI" rather than "Yi".

I think you have not yet explained whether this is required by any existing
standard (please provide links) or whether it is your own idea, meant to
distinguish between cases like "Ш" transliterated to "Sh" and "Сх"
also transliterated to "Sh".

Again, I have not yet started reviewing and testing; this is just feedback
from applying the patch locally.

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v8] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-02 22:22     ` Rafal Luzynski
@ 2018-11-02 23:27       ` Egor Kobylkin
  0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-02 23:27 UTC (permalink / raw)
  To: libc-alpha, libc-locales

Moving everybody from To: and CC: to BCC. It seems that at this stage it is
just Rafal and me. This is still going to libc-alpha and libc-locales. If
you would like to be put back on CC, please let me know.

On 02.11.18 23:22, Rafal Luzynski wrote:
>> * Consistently transliterate single uppercase Cyrillic letters to 
>> sequences of all uppercase Latin letters in all languages
>> (whenever a Cyrillic letter is transliterated to more than one
>> Latin letter), for example "Ї" is now transliterated as "YI" rather
>> than "Yi".
> 
> I think you have not yet explained whether this is required by any
> existing standard (please provide links) or whether this is your
> genuine idea to distinguish between the cases like "Ш" transliterated
> to "Sh" and "Сх" also transliterated to "Sh".

I remember seeing this form of capitalization in actual transliterated
texts a long time ago, but I can't find a formal description as of now.
I just don't want to claim it as my original idea.

>> The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
>> example for "Сх" and "Ш" that would both transliterate to Sh:
>> With SH:"Схема"->"Shema" but "Шема"->"SHema"
>> With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
>> This is important e.g. for renaming files, grouping as in using uniq
>> etc.

As for the users - I am a user and I have demonstrated the use cases
where the collisions due to "one symbol capitalization" would cause
irreversible damage to data. For a library like glibc this seems like a
relevant issue to consider.
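
A minimal sketch of the collision described above, using a toy sed-based
table instead of iconv. The translit_toy and distinct_names helpers are
hypothetical; their few rules follow the "Схема"/"Шема" example quoted above
(including х -> h) and are not taken from translit_cyrillic.

```shell
#!/bin/sh
# Toy transliterator (hypothetical, for illustration only).
# $1 is the Latin replacement for capital "Ш": "Sh" or "SH".
translit_toy() {
    sed -e "s/Ш/$1/g" \
        -e 's/С/S/g' -e 's/х/h/g' \
        -e 's/е/e/g' -e 's/м/m/g' -e 's/а/a/g'
}

# Count how many distinct names survive transliteration.
distinct_names() {
    printf '%s\n' 'Схема' 'Шема' | translit_toy "$1" |
        LC_ALL=C sort -u | wc -l | tr -d ' '
}

distinct_names Sh   # both words become "Shema", so one name is left
distinct_names SH   # "Shema" and "SHema" stay apart, so two names are left
```

With "Sh" the two inputs collapse into one name; with "SH" both survive.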

The "two symbol capitalization" on the other hand would prevent
collision and can be easily corrected in the userspace if needed
with something like

foo="SHema"
# Keep the first character and lowercase the rest (bash syntax).
foo="${foo:0:1}$(tr '[:upper:]' '[:lower:]' <<<"${foo:1}")"
echo "$foo"
Shema

It looks like everyone who really uses transliteration for something
sensitive has already been doing it in userspace since at least 2006, when
this bug was first logged. So we won't break the official use cases
where the capitalization should be done in a certain way. But we will
prevent new bugs due to collisions if we use "two symbol capitalization".

Happy to hear arguments to the contrary.

Bests,
Egor Kobylkin



* [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (8 preceding siblings ...)
  2018-11-02  0:00   ` [PATCH v8] " Egor Kobylkin
@ 2018-11-14 21:25   ` Egor Kobylkin
  2018-11-16 22:17     ` Rafal Luzynski
  2018-11-19 11:10   ` [PATCH v10] " Egor Kobylkin
                     ` (3 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-14 21:25 UTC (permalink / raw)
  To: libc-alpha, libc-locales

[-- Attachment #1: Type: text/plain, Size: 5871 bytes --]

Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch

Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.

Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]

to localedata/locales/ and include it in all your locales going forward.

The patch is included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt

currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
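
The before/after outputs above suggest a simple check: a failed
transliteration shows up as extra question marks in the output. A minimal
sketch, assuming a POSIX shell; count_qmarks and translit_ok are
hypothetical helper names, and ru_RU.UTF-8 is only a default. It does not
cover the case where iconv aborts on the first unconvertible character.

```shell
#!/bin/sh
# count_qmarks: print the number of '?' characters read from stdin.
count_qmarks() {
    tr -cd '?' | wc -c | tr -d ' '
}

# translit_ok FILE [LOCALE]: succeed if transliterating FILE to ASCII
# adds no question marks beyond those already present in FILE.
translit_ok() {
    before=$(count_qmarks < "$1")
    after=$(LC_ALL="${2:-ru_RU.UTF-8}" \
            iconv -f UTF-8 -t ASCII//TRANSLIT < "$1" | count_qmarks)
    [ "$before" -eq "$after" ]
}
```

Calling translit_ok translit-test-input.txt would then be expected to fail
before the patch and succeed afterwards, provided the locale is installed.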


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from
UTF-8 encoded text in a Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an ASCII
compatible transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce a Latin transliteration as per
ISO 9:1995.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters but can
still be read by a native speaker. Among other things, it is useful for
processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
the corresponding fonts installed, or can't handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its content (mapping) is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000, whose official source is the Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2]. Technically,
an independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].

The documentation [5] suggests that a transliteration table is included
by adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html
In practice, I searched for all locales that already contain an
'include .*translit.*;""' line and generated a patch for them.
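
The search-and-insert step could be sketched as follows. This is not the
script used to generate the patch; add_cyrillic_include is a hypothetical
helper, and its pattern is an assumption that is somewhat stricter than the
regular expression quoted above. It places the new line right after the last
existing translit include, as described in the v7 changelog.

```shell
#!/bin/sh
# add_cyrillic_include FILE: insert the translit_cyrillic include after
# the last existing 'include "translit_...";""' line of a locale source.
add_cyrillic_include() {
    file=$1
    # Leave files alone that already mention the table.
    grep -q 'translit_cyrillic' "$file" && return 0
    # Line number of the last existing translit include, if any.
    last=$(grep -n '^include  *"translit_[^"]*";""' "$file" |
           tail -n 1 | cut -d: -f1)
    [ -n "$last" ] || return 0
    # Copy every line and add the new include right after line $last.
    awk -v n="$last" '
        { print }
        NR == n { print "include \"translit_cyrillic\";\"\"" }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Applied in a loop over localedata/locales/*, with the manually excluded
locales skipped beforehand, this would yield one added line per file.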

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to the locale maintainers at the email addresses listed in the
files. Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com>
(uk_UA), and Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch; name="0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch", Size: 64625 bytes --]

From a8ae30e0bf7484f4c0f034480110c81dd059b69e Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 14 Nov 2018 22:10:37 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* localedata/locales/translit_cyrillic: New file. Supports
	ISO 9.1995, GOST 7.79 System A transliteration System B
	transcription table from Cyrillic to Latin/ASCII.
	* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""'
	to LC_CTYPE translit section.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/ky_KG: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.
---
 localedata/locales/aa_DJ             |   1 +
 localedata/locales/af_ZA             |   1 +
 localedata/locales/ak_GH             |   1 +
 localedata/locales/am_ET             |   1 +
 localedata/locales/ar_EG             |   1 +
 localedata/locales/be_BY             |   1 +
 localedata/locales/bem_ZM            |   1 +
 localedata/locales/ber_DZ            |   1 +
 localedata/locales/ber_MA            |   1 +
 localedata/locales/bg_BG             |   1 +
 localedata/locales/bi_VU             |   1 +
 localedata/locales/bn_BD             |   1 +
 localedata/locales/bo_CN             |   1 +
 localedata/locales/ca_ES             |   1 +
 localedata/locales/ce_RU             |   1 +
 localedata/locales/cs_CZ             |   1 +
 localedata/locales/cv_RU             |   1 +
 localedata/locales/cy_GB             |   1 +
 localedata/locales/da_DK             |   1 +
 localedata/locales/de_DE             |   1 +
 localedata/locales/dv_MV             |   1 +
 localedata/locales/dz_BT             |   1 +
 localedata/locales/el_GR             |   1 +
 localedata/locales/en_GB             |   1 +
 localedata/locales/en_NG             |   1 +
 localedata/locales/en_ZM             |   1 +
 localedata/locales/es_CU             |   1 +
 localedata/locales/es_ES             |   1 +
 localedata/locales/et_EE             |   1 +
 localedata/locales/fa_IR             |   1 +
 localedata/locales/ff_SN             |   1 +
 localedata/locales/fi_FI             |   1 +
 localedata/locales/fr_FR             |   1 +
 localedata/locales/ga_IE             |   1 +
 localedata/locales/gd_GB             |   1 +
 localedata/locales/gu_IN             |   1 +
 localedata/locales/gv_GB             |   1 +
 localedata/locales/he_IL             |   1 +
 localedata/locales/hi_IN             |   1 +
 localedata/locales/hif_FJ            |   1 +
 localedata/locales/hr_HR             |   1 +
 localedata/locales/ht_HT             |   1 +
 localedata/locales/hu_HU             |   1 +
 localedata/locales/hy_AM             |   1 +
 localedata/locales/id_ID             |   1 +
 localedata/locales/is_IS             |   1 +
 localedata/locales/it_IT             |   1 +
 localedata/locales/ja_JP             |   1 +
 localedata/locales/kab_DZ            |   1 +
 localedata/locales/kk_KZ             |   1 +
 localedata/locales/km_KH             |   1 +
 localedata/locales/kn_IN             |   1 +
 localedata/locales/ko_KR             |   1 +
 localedata/locales/ks_IN             |   1 +
 localedata/locales/kw_GB             |   1 +
 localedata/locales/ky_KG             |   1 +
 localedata/locales/lb_LU             |   1 +
 localedata/locales/lg_UG             |   1 +
 localedata/locales/lij_IT            |   1 +
 localedata/locales/ln_CD             |   1 +
 localedata/locales/lo_LA             |   1 +
 localedata/locales/lt_LT             |   1 +
 localedata/locales/lv_LV             |   1 +
 localedata/locales/mg_MG             |   1 +
 localedata/locales/mhr_RU            |   1 +
 localedata/locales/mk_MK             |   1 +
 localedata/locales/ml_IN             |   1 +
 localedata/locales/ms_MY             |   1 +
 localedata/locales/mt_MT             |   1 +
 localedata/locales/nan_TW@latin      |   1 +
 localedata/locales/nb_NO             |   1 +
 localedata/locales/ne_NP             |   1 +
 localedata/locales/nhn_MX            |   1 +
 localedata/locales/niu_NU            |   1 +
 localedata/locales/niu_NZ            |   1 +
 localedata/locales/nl_NL             |   1 +
 localedata/locales/nr_ZA             |   1 +
 localedata/locales/oc_FR             |   1 +
 localedata/locales/om_KE             |   1 +
 localedata/locales/or_IN             |   1 +
 localedata/locales/os_RU             |   1 +
 localedata/locales/pa_IN             |   1 +
 localedata/locales/pa_PK             |   1 +
 localedata/locales/pl_PL             |   1 +
 localedata/locales/pt_PT             |   1 +
 localedata/locales/quz_PE            |   1 +
 localedata/locales/ro_RO             |   1 +
 localedata/locales/ru_RU             |   1 +
 localedata/locales/rw_RW             |   1 +
 localedata/locales/sa_IN             |   1 +
 localedata/locales/sd_IN             |   1 +
 localedata/locales/sd_IN@devanagari  |   1 +
 localedata/locales/se_NO             |   1 +
 localedata/locales/sgs_LT            |   1 +
 localedata/locales/shn_MM            |   1 +
 localedata/locales/si_LK             |   1 +
 localedata/locales/sk_SK             |   1 +
 localedata/locales/sl_SI             |   1 +
 localedata/locales/sm_WS             |   1 +
 localedata/locales/so_SO             |   1 +
 localedata/locales/sq_AL             |   1 +
 localedata/locales/ss_ZA             |   1 +
 localedata/locales/st_ZA             |   1 +
 localedata/locales/sv_SE             |   1 +
 localedata/locales/sw_KE             |   1 +
 localedata/locales/ta_IN             |   1 +
 localedata/locales/te_IN             |   1 +
 localedata/locales/th_TH             |   1 +
 localedata/locales/ti_ET             |   1 +
 localedata/locales/tn_ZA             |   1 +
 localedata/locales/to_TO             |   1 +
 localedata/locales/tpi_PG            |   1 +
 localedata/locales/tr_TR             |   1 +
 localedata/locales/translit_cyrillic | 383 +++++++++++++++++++++++++++
 localedata/locales/ts_ZA             |   1 +
 localedata/locales/unm_US            |   1 +
 localedata/locales/ur_IN             |   1 +
 localedata/locales/ur_PK             |   1 +
 localedata/locales/ve_ZA             |   1 +
 localedata/locales/vi_VN             |   1 +
 localedata/locales/wa_BE             |   1 +
 localedata/locales/wo_SN             |   1 +
 localedata/locales/xh_ZA             |   1 +
 localedata/locales/yi_US             |   1 +
 localedata/locales/yuw_PG            |   1 +
 localedata/locales/zh_CN             |   1 +
 localedata/locales/zu_ZA             |   1 +
 127 files changed, 509 insertions(+)
 create mode 100644 localedata/locales/translit_cyrillic

diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
 space <U1361>
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % hoy-sadis followed by a vowel
 <U1205><U12A0>    <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts.
 % LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 
 translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
 
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % Historicaly we used ISO-8869-2 and wrote digraphs
 % <U01C6> {dž}, <U01C9> {lj} and <U01CC> {nj}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 <U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
 <U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_cjk_variants";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_hangul";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts
 % LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
 
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
 %
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if t/scomma is not available, try first t/scedilla
 <U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
new file mode 100644
index 0000000000..82d9749e08
--- /dev/null
+++ b/localedata/locales/translit_cyrillic
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9:1995.
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bug's doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U with COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U with COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
 translit_start
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % A-bole -> A-circonflecse -> AU
 <U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if digraphs are not available (this is the case with iso-8859-8)
 % then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
-- 
2.17.1



* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-14 21:25   ` [PATCH v9] " Egor Kobylkin
@ 2018-11-16 22:17     ` Rafal Luzynski
  2018-11-17 18:34       ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-11-16 22:17 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales

Thank you for working on this, Egor.

Before I start reviewing I would like to summarize the things which
I think are blocking for this patch.

1. I think we need tests for transliteration.  Currently there is only
   one test program which is similar to what we need,
   localedata/bug-iconv-trans.c.  It is old and it is not quite clear
   what bug it is trying to test.  Therefore I think we need a new
   framework to test transliteration.  Is it a good idea to base the
   test on the iconv(1) command line utility which is part of glibc?

2. I ran a few tests on the command line, and it seems that the
   transliteration of "З" to "Z" (and of the lowercase letter) in uk_UA
   does not work.  It has not worked for some time: I checked some older
   systems as well and the result is always the same.  I think the reason
   is that uk_UA defines multiple transliteration rules for "З" depending
   on which letter follows it.  That does not seem to work.  AFAIK the
   syntax of transliteration rules says that a single non-Latin character
   may map to one or more Latin strings, each consisting of one or more
   characters.  There cannot be a rule transliterating multiple source
   characters into one or multiple destination characters.  Is it a bug
   in the transliteration implementation?  Or maybe in the specification,
   including the POSIX standard?  The definition of transliteration says
   that it is a one-to-one mapping of graphemes, while a grapheme may be
   one or multiple characters.  It does not always have to be a
   one-to-one mapping of characters.  Should we fix this bug first, make
   uk_UA transliteration work, and only then add a generic Cyrillic
   transliteration?  Egor's patch already contains a transliteration of
   "У" + combining acute accent to "Ú" which most probably will not work.
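
The rule syntax described in point 2 can be illustrated with a small
Python sketch.  It is not glibc code: parse_rule and translit are
hypothetical helpers, and the per-character lookup only models the
behaviour as described above (one source character keyed to a list of
alternatives).  The two-character key added at the end is likewise an
assumption, made up to show why such a rule would never be reached.

```python
import re

# Matches one <UXXXX> code point reference as used in glibc locale sources.
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def _decode(token):
    """Turn '<U0064><U0068>' (optionally double-quoted) into 'dh'."""
    return _UCS.sub(lambda m: chr(int(m.group(1), 16)), token.strip().strip('"'))


def parse_rule(line):
    """Split one rule line into (source_text, [alternative, ...]).

    Simplification: alternatives are split on ';', so a literal ';'
    inside a quoted string is not supported by this sketch.
    """
    source, alternatives = line.split(None, 1)
    return _decode(source), [_decode(a) for a in alternatives.split(";")]


def translit(text, table):
    """Look up each character on its own and take the last (ASCII)
    alternative; characters without a rule pass through unchanged."""
    return "".join(table[ch][-1] if ch in table else ch for ch in text)


# A rule taken from translit_cyrillic (CYRILLIC SMALL LETTER DZHE).
src, alts = parse_rule('<U045F> "<U0064><U0302>";"<U0064><U0068>"')
table = {src: alts}

# Hypothetical two-character source: U+0423 followed by combining acute.
table["\u0423\u0301"] = ["U'"]

print(translit("\u045f", table))                          # dh
print(translit("\u0423\u0301", table) == "\u0423\u0301")  # True: key never consulted
```

Because the lookup walks the input one character at a time, the
two-character key is never tried, which is the limitation point 2 asks
about.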

I still think that in the longer term all existing custom transliterations
of Cyrillic alphabets should be ported to a modification of your patch.

Egor, while at it, I was thinking about your idea to transliterate letters
like "Ш" (uppercase) to "SH" (always uppercase) in order to distinguish
between "Шема" (-> "SHema") and "Схема" (-> "Shema" or "Sxema").  You
also include a rule to transliterate "Х" to "H" or "X" depending on which
destination characters are available.  As I told you already, that will
not work: both "H" and "X" are always available, so only the first rule
will ever be used.  I still don't like the idea of putting two uppercase
letters at the beginning of a titlecase word only to indicate that there
was originally a single letter.  What if we:

* drop the rule transliterating "Х" to "H" and always transliterate to "X",
* transliterate uppercase "Ш" to "Sh" (so it will work fine for titlecase
  words)?

As a result the Latin letter "h" will only appear as part of a digraph and
never as a transliteration of "Х" and therefore will never cause a conflict.
Examples:

* "Шема" -> "Shema",
* "Схема" -> "Sxema".

Will this solve the problem?
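
To make the comparison concrete, here is a small Python sketch of the
three rule sets under discussion.  The tables are illustrative
assumptions containing only the letters needed for the two example
words; they are not the contents of translit_cyrillic, and the scheme
names are made up for this example.

```python
# Letters shared by all schemes (Cyrillic key -> ASCII value).
COMMON = {
    "\u0421": "S",  # CYRILLIC CAPITAL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043c": "m",  # CYRILLIC SMALL LETTER EM
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
}

SHA = "\u0428"  # CYRILLIC CAPITAL LETTER SHA
HA = "\u0445"   # CYRILLIC SMALL LETTER HA

SCHEMES = {
    # Capital SHA -> "SH", small HA -> "h" (all-caps digraphs)
    "caps_h": {**COMMON, SHA: "SH", HA: "h"},
    # Capital SHA -> "Sh", small HA -> "h" (titlecase digraph, HA kept as h)
    "title_h": {**COMMON, SHA: "Sh", HA: "h"},
    # Capital SHA -> "Sh", small HA -> "x" (the proposal above)
    "title_x": {**COMMON, SHA: "Sh", HA: "x"},
}

SHEMA = "\u0428\u0435\u043c\u0430"        # first example word (starts with SHA)
SKHEMA = "\u0421\u0445\u0435\u043c\u0430"  # second example word (ES + HA)


def transcribe(word, rules):
    """Replace each character by its rule; unknown characters pass through."""
    return "".join(rules.get(ch, ch) for ch in word)


for name, rules in SCHEMES.items():
    a, b = transcribe(SHEMA, rules), transcribe(SKHEMA, rules)
    print(f"{name}: {a} / {b} / collision={a == b}")
```

Under these assumed tables only the titlecase digraph combined with
"h" for HA makes the two words collide; the all-caps digraph and the
"Sh" plus "x" variant both keep them apart.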

Regards,

Rafal


* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-16 22:17     ` Rafal Luzynski
@ 2018-11-17 18:34       ` Egor Kobylkin
  2018-11-19  7:13         ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-17 18:34 UTC (permalink / raw)
  To: Rafal Luzynski, libc-alpha, libc-locales, Marko Myllynen

Hi Rafal,
thanks for stating the SH/Sh problem so clearly. I'm totally with you on
this being a good thing to discuss. It is orthogonal to the tests, so let
me focus on the SH/Sh and System A/B questions here.

Looks like we have three issues:
1. lack of explicit control over which transformation to use (System A or
System B) via //TRANSLIT
2. possibility of collisions in System B if CAP/low transcription is used
for capital letters
3. Cyrillic 'Х'/'х' (ha) never transcribes to 'H'/'h' as it should per
System B because its equivalent 'X'/'x' from System A is always present
and takes precedence.

As a solution shouldn't we only keep System B in a new file
transcribe_cyrillic and put it in place as the explicit ASCII
transcription for targeted locales (as opposed to transliteration)?

We would keep System A as translit_cyrillic but would not include it in
this patch. Once the issue of having two conflicting rule sets but only
one key, //TRANSLIT, has been resolved, System A could be added back.

The SH/Sh question can be decided either way; it seems like an easy change.

Please see more discussion on your excellent points below:

On 16.11.18 23:17, Rafal Luzynski wrote:

> Egor, while at this I was thinking about your idea to transliterate
> letters like "Ш" (uppercase) to "SH" (always uppercase) in order to
> distinguish between "Шема" (-> "SHema") and "Схема" (-> "Shema" or
> "Sxema").

To clarify, this SH/Sh collision issue relates only to iconv -f UTF-8 -t
ASCII//TRANSLIT (i.e. System B transcription).
But it's not only SH/Sh; the following combinations are used to
transcribe capital letters:

YO, DJ, YE, TSH, DH, ZH, CZ, CH, SH, SHH, YU, YA, FH, YH, GH, NG, TCZ

Arguably any of them (if not in that CAP/CAP form) could collide with
its CAP/low equivalent from a different word. (There may be language
grammar rules that in fact prevent some collisions, but we don't know
for sure.)

With transcription we are basically stripping information from the data,
mapping it into a smaller character set. The idea of keeping them in
CAP/CAP form is to preserve as much information as possible.


> Also you include a rule to transliterate "Х" to "H" or "X" depending
> on which destination characters are available, which I told you
> already that will not work because both "H" and "X" are always
> available and therefore only the first rule will always be used.

Just to have this here for reference, the idea was to have both rules in
one file so

iconv -f UTF-8 -t ASCII//TRANSLIT
will produce ASCII compatible _transcription_ (System B)

iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8
will produce Latin _transliteration_ as per ISO 9.1995. (System A)

So in fact we have two rules for each letter in the same file (System A
and System B), where System A takes precedence.
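
A rough Python model of this selection, assuming (as described in this
thread) that the first alternative representable in the destination
character set wins.  The two rules and the helper names are
illustrative assumptions, not entries copied from translit_cyrillic,
and Python's codecs stand in for the iconv target charsets.

```python
def first_encodable(alternatives, charset):
    """Return the first alternative that can be encoded in charset."""
    for alt in alternatives:
        try:
            alt.encode(charset)
        except UnicodeEncodeError:
            continue
        return alt
    return "?"  # nothing fits: mimic iconv's question-mark fallback


# Illustrative rules: a System A style candidate first, then a
# System B style ASCII candidate.
RULES = {
    "\u0428": ["\u0160", "SH"],  # capital SHA: S with caron, then "SH"
    "\u0425": ["H", "X"],        # capital HA: both candidates are ASCII
}


def convert(text, charset):
    """Apply RULES per character; characters without a rule pass through."""
    return "".join(
        first_encodable(RULES[ch], charset) if ch in RULES else ch
        for ch in text
    )


sample = "\u0428\u0425"  # capital SHA followed by capital HA
print(convert(sample, "ascii"))        # SHH
print(convert(sample, "iso8859-15"))   # S-with-caron followed by H
```

In this model "X" is never produced for either target, because "H" is
encodable everywhere; that is the precedence problem with keeping both
systems in one rule.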

I have a question then: isn't this more like a hack than a right thing
to do?

Shouldn't we have two explicit rules for transcription and
transliteration not dependent on a destination character set?


> I still don't like the idea to
> put two uppercase letters in a beginning of a word in titlecase only
> to indicate that there was originally a single letter.  What if we:
> 
> * drop the rule of transliterating "Х" to "H" and transliterate
> always to "X",
This would contradict ISO 9.1995. (System A).
System A was added on Marko's request (so setting him on TO:) I am
neutral on keeping it or dropping it, just to be clear.

> * transliterate uppercase "Ш" to "Sh" (so it will work fine for
> titlecase words)?
> 
> As a result the Latin letter "h" will only appear as part of a
> digraph and never as a transliteration of "Х" and therefore will
> never cause a conflict. Examples:
> 
> * "Шема" -> "Shema", * "Схема" -> "Sxema".
> 
> Will this solve the problem?
This particular rule with h/x would make sense on its own.
But again, it would contradict the standards.
On the other hand, for my personal needs I care less about standards than
about current functionality and the data loss caused by transcription
being missing altogether due to BZ #2872.

Bests,
Egor



* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-17 18:34       ` Egor Kobylkin
@ 2018-11-19  7:13         ` Marko Myllynen
  2018-11-19  9:21           ` Egor Kobylkin
  2018-12-01 22:07           ` Rafal Luzynski
  0 siblings, 2 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-11-19  7:13 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales

Hi,

On 17/11/2018 20.34, Egor Kobylkin wrote:
> 
> Looks like we have three issues:
> 1. lack of explicit control which transformation to use (System A or
> System B) via //TRANSLIT
> 2. possibility of collision for System B if used CAP/low transcription
> for capital letters
> 3. Cyrillic 'Х'/'х' (ha) never transcribes to 'H'/'h' as it should per
> System B because its equivalent 'X'/'x' from System A is always present
> and takes precedence.
> 
> As a solution shouldn't we only keep System B in a new file
> transcribe_cyrillic and put it in place as the explicit ASCII
> transcription for targeted locales (as opposed to transliteration)?
> 
> We would keep System A as translit_cyrillic but won't include it into
> this patch. Once you have resolved an issue of having two conflicting
> rule-sets but only one key //TRANSLIT you could add the System A back.
> 
> The SH/Sh can be decided on either way - seems like an easy change any way.
> 
> I have a question then: isn't this more like a hack than a right thing
> to do?
> 
> Shouldn't we have two explicit rules for transcription and
> transliteration not dependent on a destination character set?
> 
> This would contradict ISO 9.1995. (System A).
> System A was added on Marko's request (so setting him on TO:) I am
> neutral on keeping it or dropping it, just to be clear.
> 
> This particular rule with h/x would make sense on its own.
> But again - it would contradict the standards.
> On the other hand, for my personal needs I care less about standards but
> about current functionality and data loss because of missing
> transcription altogether due to the BZ #2872.

Given the number of questions above, I think the way forward is to try to
follow the relevant standards as closely as possible and also to check what
other implementations (e.g., uconv(1)) do. For example, checking the
earlier mentioned case may or may not give some hints:

$ echo Шема  | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
Šema
$ echo Схема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
Shema
$ uconv -V
uconv v2.1  ICU 50.1.2

Thanks,

-- 
Marko Myllynen


* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-19  7:13         ` Marko Myllynen
@ 2018-11-19  9:21           ` Egor Kobylkin
  2018-11-19 19:35             ` Marko Myllynen
  2018-12-01 22:07           ` Rafal Luzynski
  1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-19  9:21 UTC (permalink / raw)
  To: Marko Myllynen, libc-alpha, libc-locales

On 19.11.18 08:13, Marko Myllynen wrote:
> Hi,
> 
> On 17/11/2018 20.34, Egor Kobylkin wrote:

>>
>> Shouldn't we have two explicit rules for transcription and
>> transliteration not dependent on a destination character set?
>>
>> This would contradict ISO 9.1995. (System A).
>> System A was added on Marko's request (so setting him on TO:) I am
>> neutral on keeping it or dropping it, just to be clear.
>>
>> This particular rule with h/x would make sense on its own.
>> But again - it would contradict the standards.
>> On the other hand, for my personal needs I care less about standards but
>> about current functionality and data loss because of missing
>> transcription altogether due to the BZ #2872.
> 
> Given the number of questions above, I think the way forward is to try to
> follow the relevant standards as closely as possible and also to check what
> other implementations (e.g., uconv(1)) do. For example, checking the
> earlier mentioned case may or may not give some hints:
> 
> $ echo Шема  | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Šema
> $ echo Схема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Shema
> $ uconv -V
> uconv v2.1  ICU 50.1.2

Marko,

Your example only covers _transliteration_ to Latin with diacritics
iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
| iconv -f ISO-8859-15 -t UTF-8

while BZ #2872 is about _transcription_ to ASCII
iconv -f UTF-8 -t ASCII//TRANSLIT

The glibc wiki explicitly lists this use case (ASCII) as the test
example https://sourceware.org/glibc/wiki/Locales#Testing_Locales

So again, you are asking to have ISO 9.1995. System A but the bug is
about ISO 9.1995. System B (GOST 7.79-2000)


Bests,
Egor


* [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (9 preceding siblings ...)
  2018-11-14 21:25   ` [PATCH v9] " Egor Kobylkin
@ 2018-11-19 11:10   ` Egor Kobylkin
  2018-12-07 23:35     ` Rafal Luzynski
  2018-12-08 22:28   ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
                     ` (2 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-19 11:10 UTC (permalink / raw)
  To: libc-alpha, libc-locales

[-- Attachment #1: Type: text/plain, Size: 6581 bytes --]

Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics

Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch

Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.

Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration table translit_cyrillic file [7]
to localedata/locales/ and include it in all your locales going forward.

The patch is attached.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt

currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

 - It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. Furthermore, the table has to be included in the active
locale when the locale is compiled in order to be used by iconv.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from
UTF-8 encoded text in a Cyrillic alphabet to an ASCII//TRANSLIT text.

Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription has only Latin/ASCII codes but can still
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.

The transliteration table itself is in the file translit_cyrillic [7].
Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].

The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes converting Cyrillic to
ASCII and thus the System B is chosen. [11]

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
Practically all locales that already have an 'include .*translit.*;""'
line were identified and included in this patch.
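
As a rough illustration of that procedure, the following Python sketch
inserts the new include line directly after the last existing translit
include, as described in the v7 changelog.  It is not the script that
generated this patch: the function name and the regular expression are
assumptions, and the pattern is narrower than 'include .*translit.*;""'.

```python
import re

# An existing include of some translit_* table, e.g.
#   include  "translit_combining";""
INCLUDE_RE = re.compile(r'^include\s+"translit_[^"]*";""\s*$')
NEW_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Return locale_text with NEW_LINE placed after the last translit include.

    The text is returned unchanged if it has no translit include at all
    or if it already includes translit_cyrillic.
    """
    lines = locale_text.split("\n")
    if any('"translit_cyrillic"' in line for line in lines):
        return locale_text
    hits = [i for i, line in enumerate(lines) if INCLUDE_RE.match(line)]
    if not hits:
        return locale_text
    lines.insert(hits[-1] + 1, NEW_LINE)
    return "\n".join(lines)


sample = (
    'copy "i18n"\n'
    "\n"
    "translit_start\n"
    'include  "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(sample))
```

Locales without any translit include are left alone by this sketch,
which mirrors the rule that only locales already having such a line
are patched.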

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org>  (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3

Best regards,
Egor Kobylkin


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch; name="0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch", Size: 63434 bytes --]

From ce25f26f21918147f6444dac0fa03096368e6494 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Mon, 19 Nov 2018 12:03:14 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* localedata/locales/translit_cyrillic: New file. Supports
	ISO 9.1995, GOST 7.79 System B transcription table from Cyrillic
	to ASCII.
	* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""'
	to LC_CTYPE translit section.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/ky_KG: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.
---
 localedata/locales/aa_DJ             |   1 +
 localedata/locales/af_ZA             |   1 +
 localedata/locales/ak_GH             |   1 +
 localedata/locales/am_ET             |   1 +
 localedata/locales/ar_EG             |   1 +
 localedata/locales/be_BY             |   1 +
 localedata/locales/bem_ZM            |   1 +
 localedata/locales/ber_DZ            |   1 +
 localedata/locales/ber_MA            |   1 +
 localedata/locales/bg_BG             |   1 +
 localedata/locales/bi_VU             |   1 +
 localedata/locales/bn_BD             |   1 +
 localedata/locales/bo_CN             |   1 +
 localedata/locales/ca_ES             |   1 +
 localedata/locales/ce_RU             |   1 +
 localedata/locales/cmn_TW            |   1 +
 localedata/locales/cs_CZ             |   1 +
 localedata/locales/cv_RU             |   1 +
 localedata/locales/cy_GB             |   1 +
 localedata/locales/da_DK             |   1 +
 localedata/locales/de_DE             |   1 +
 localedata/locales/dv_MV             |   1 +
 localedata/locales/dz_BT             |   1 +
 localedata/locales/el_GR             |   1 +
 localedata/locales/en_GB             |   1 +
 localedata/locales/en_NG             |   1 +
 localedata/locales/en_ZM             |   1 +
 localedata/locales/es_CU             |   1 +
 localedata/locales/es_ES             |   1 +
 localedata/locales/et_EE             |   1 +
 localedata/locales/fa_IR             |   1 +
 localedata/locales/ff_SN             |   1 +
 localedata/locales/fi_FI             |   1 +
 localedata/locales/fr_FR             |   1 +
 localedata/locales/ga_IE             |   1 +
 localedata/locales/gd_GB             |   1 +
 localedata/locales/gu_IN             |   1 +
 localedata/locales/gv_GB             |   1 +
 localedata/locales/he_IL             |   1 +
 localedata/locales/hi_IN             |   1 +
 localedata/locales/hif_FJ            |   1 +
 localedata/locales/hr_HR             |   1 +
 localedata/locales/ht_HT             |   1 +
 localedata/locales/hu_HU             |   1 +
 localedata/locales/hy_AM             |   1 +
 localedata/locales/id_ID             |   1 +
 localedata/locales/is_IS             |   1 +
 localedata/locales/it_IT             |   1 +
 localedata/locales/ja_JP             |   1 +
 localedata/locales/kab_DZ            |   1 +
 localedata/locales/kk_KZ             |   1 +
 localedata/locales/km_KH             |   1 +
 localedata/locales/kn_IN             |   1 +
 localedata/locales/ko_KR             |   1 +
 localedata/locales/ks_IN             |   1 +
 localedata/locales/kw_GB             |   1 +
 localedata/locales/ky_KG             |   1 +
 localedata/locales/lb_LU             |   1 +
 localedata/locales/lg_UG             |   1 +
 localedata/locales/lij_IT            |   1 +
 localedata/locales/ln_CD             |   1 +
 localedata/locales/lo_LA             |   1 +
 localedata/locales/lt_LT             |   1 +
 localedata/locales/lv_LV             |   1 +
 localedata/locales/mg_MG             |   1 +
 localedata/locales/mhr_RU            |   1 +
 localedata/locales/mk_MK             |   1 +
 localedata/locales/ml_IN             |   1 +
 localedata/locales/ms_MY             |   1 +
 localedata/locales/mt_MT             |   1 +
 localedata/locales/nan_TW@latin      |   1 +
 localedata/locales/nb_NO             |   1 +
 localedata/locales/ne_NP             |   1 +
 localedata/locales/nhn_MX            |   1 +
 localedata/locales/niu_NU            |   1 +
 localedata/locales/niu_NZ            |   1 +
 localedata/locales/nl_NL             |   1 +
 localedata/locales/nr_ZA             |   1 +
 localedata/locales/oc_FR             |   1 +
 localedata/locales/om_KE             |   1 +
 localedata/locales/or_IN             |   1 +
 localedata/locales/os_RU             |   1 +
 localedata/locales/pa_IN             |   1 +
 localedata/locales/pa_PK             |   1 +
 localedata/locales/pl_PL             |   1 +
 localedata/locales/pt_PT             |   1 +
 localedata/locales/quz_PE            |   1 +
 localedata/locales/ro_RO             |   1 +
 localedata/locales/ru_RU             |   1 +
 localedata/locales/rw_RW             |   1 +
 localedata/locales/sa_IN             |   1 +
 localedata/locales/sd_IN             |   1 +
 localedata/locales/sd_IN@devanagari  |   1 +
 localedata/locales/se_NO             |   1 +
 localedata/locales/sgs_LT            |   1 +
 localedata/locales/shn_MM            |   1 +
 localedata/locales/si_LK             |   1 +
 localedata/locales/sk_SK             |   1 +
 localedata/locales/sl_SI             |   1 +
 localedata/locales/sm_WS             |   1 +
 localedata/locales/so_SO             |   1 +
 localedata/locales/sq_AL             |   1 +
 localedata/locales/ss_ZA             |   1 +
 localedata/locales/st_ZA             |   1 +
 localedata/locales/sv_SE             |   1 +
 localedata/locales/sw_KE             |   1 +
 localedata/locales/ta_IN             |   1 +
 localedata/locales/te_IN             |   1 +
 localedata/locales/th_TH             |   1 +
 localedata/locales/ti_ET             |   1 +
 localedata/locales/tn_ZA             |   1 +
 localedata/locales/to_TO             |   1 +
 localedata/locales/tpi_PG            |   1 +
 localedata/locales/tr_TR             |   1 +
 localedata/locales/translit_cyrillic | 378 +++++++++++++++++++++++++++
 localedata/locales/ts_ZA             |   1 +
 localedata/locales/unm_US            |   1 +
 localedata/locales/ur_IN             |   1 +
 localedata/locales/ur_PK             |   1 +
 localedata/locales/ve_ZA             |   1 +
 localedata/locales/vi_VN             |   1 +
 localedata/locales/wa_BE             |   1 +
 localedata/locales/wo_SN             |   1 +
 localedata/locales/xh_ZA             |   1 +
 localedata/locales/yi_US             |   1 +
 localedata/locales/yuw_PG            |   1 +
 localedata/locales/zh_CN             |   1 +
 localedata/locales/zu_ZA             |   1 +
 128 files changed, 505 insertions(+)
 create mode 100644 localedata/locales/translit_cyrillic
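
Reviewer note (commentary only, not part of the patch): each locale hunk
below makes the same one-line change, adding
include "translit_cyrillic";"" inside the translit_start / translit_end
block, after the includes that are already there. A minimal sketch of a
checker for that invariant follows. The function names and the regular
expression are hypothetical helpers written for illustration; they are not
part of glibc, and the sample text is a made-up excerpt in the shape of the
hunks in this patch.

```python
import re

# Matches an include line such as:  include "translit_cyrillic";""
# The whitespace after "include" varies between locale files.
INCLUDE_RE = re.compile(r'^include\s+"([^"]+)";""\s*$')


def translit_includes(locale_text):
    """Return the names included between translit_start and translit_end."""
    names = []
    inside = False
    for line in locale_text.splitlines():
        stripped = line.strip()
        if stripped == "translit_start":
            inside = True
        elif stripped == "translit_end":
            inside = False
        elif inside:
            match = INCLUDE_RE.match(stripped)
            if match:
                names.append(match.group(1))
    return names


def has_cyrillic_include(locale_text):
    """True if the translit block pulls in translit_cyrillic."""
    return "translit_cyrillic" in translit_includes(locale_text)


# Made-up excerpt shaped like a patched locale file.
sample = '''LC_CTYPE
copy "i18n"

translit_start
include  "translit_combining";""
include "translit_cyrillic";""
translit_end
END LC_CTYPE
'''
print(translit_includes(sample))
```

To use it on a real tree, one would read each file under
localedata/locales/ and call has_cyrillic_include() on its contents; the
encoding and path handling are left out of this sketch.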

diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
 space <U1361>
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % hoy-sadis followed by a vowel
 <U1205><U12A0>    <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
index cca7cc19af..3866f06004 100644
--- a/localedata/locales/cmn_TW
+++ b/localedata/locales/cmn_TW
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts.
 % LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 
 translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
 
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % Historicaly we used ISO-8869-2 and wrote digraphs
 % <U01C6> {dž}, <U01C9> {lj} and <U01CC> {nj}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 <U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
 <U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_cjk_variants";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
 
 include "translit_combining";""
 include "translit_hangul";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
 translit_start
 
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % German umlauts
 % LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
 
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
 %
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if t/scomma is not available, try first t/scedilla
 <U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
 <U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
new file mode 100644
index 0000000000..253f5c9618
--- /dev/null
+++ b/localedata/locales/translit_cyrillic
@@ -0,0 +1,378 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transcription of Cyrillic letters to ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000 System B.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9:1995.
+% Check https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from the GOST 7.79 Russian table using open
+% sources of transliteration mappings and the "h/`" diacritics logic.
+% Capital Cyrillic letters that are transcribed as a combination of two
+% ASCII letters get both ASCII letters capitalized to avoid collisions.
+
+
+% Usage examples:
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bug's doclet.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> "<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> "<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> "<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> "<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> "<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> <U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> "<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> "<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> "<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0041><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> "<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> "<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> "<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> "<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> "<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> "<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> "<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> "<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> "<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> "<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> "<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> "<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> "<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> "<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> "<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> "<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> "<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> "<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> "<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> "<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> "<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> "<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> "<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> "<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> "<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> "<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> "<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> "<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> "<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> "<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> "<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> "<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> "<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> "<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> "<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> "<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> "<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> "<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> "<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> "<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> "<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> "<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> "<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> "<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> "<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> "<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> "<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> "<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> "<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U0027>
+
+translit_end
+
+END LC_CTYPE
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % those two lettes are not in cp1256...
 
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
 translit_start
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % A-bole -> A-circonflecse -> AU
 <U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
+include "translit_cyrillic";""
 
 translit_end
 
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 
 % if digraphs are not available (this is the case with iso-8859-8)
 % then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
-- 
2.17.1
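
For readers unfamiliar with the locale source format: each table entry
above maps one <Uxxxx> code point either to a single code point or to a
quoted sequence of them, and alternatives (as in the nb_NO context
lines) are separated by ';'. The following is only a rough Python
sketch of how such entries can be read and applied. It is not how
localedef or iconv process the file, and the names SAMPLE_TABLE,
parse_table and transliterate are made up for illustration.

```python
import re

# A few entries copied from the translit_cyrillic table in the patch.
SAMPLE_TABLE = '''
% CYRILLIC CAPITAL LETTER ZHE
<U0416> "<U005A><U0048>"
% CYRILLIC SMALL LETTER ZHE
<U0436> "<U007A><U0068>"
% CYRILLIC SMALL LETTER U
<U0443> <U0075>
% CYRILLIC SMALL LETTER KA
<U043A> <U006B>
'''

UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(field):
    """Turn a field such as '<U005A><U0048>' into the string 'ZH'."""
    return "".join(chr(int(h, 16)) for h in UCS.findall(field))


def parse_table(text, comment_char="%"):
    """Build a {source character: replacement} dict from table lines."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(comment_char):
            continue
        source, _, targets = line.partition(" ")
        # Simplification: keep only the first ';'-separated alternative.
        first = targets.split(";")[0].strip().strip('"')
        table[decode(source)] = decode(first)
    return table


def transliterate(text, table):
    """Replace every character found in the table, keep the rest."""
    return "".join(table.get(ch, ch) for ch in text)


TABLE = parse_table(SAMPLE_TABLE)
print(transliterate("жук", TABLE))
```

With the four sample entries this turns the word "жук" into "zhuk".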



* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-19  9:21           ` Egor Kobylkin
@ 2018-11-19 19:35             ` Marko Myllynen
  0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-11-19 19:35 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales

Hi,

On 19/11/2018 11.21, Egor Kobylkin wrote:
> On 19.11.18 08:13, Marko Myllynen wrote:
>> On 17/11/2018 20.34, Egor Kobylkin wrote:
> 
> Your example only covers _transliteration_ to Latin diacritics
> iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
> | iconv -f ISO-8859-15 -t UTF-8
> 
> while BZ #2872 is about _transcription_ to ASCII
> iconv -f UTF-8 -t ASCII//TRANSLIT

AFAICS v9 (unlike v10) supported both of the above cases.

> The glibc wiki explicitly lists this use case (ASCII) as the test
> example https://sourceware.org/glibc/wiki/Locales#Testing_Locales

I wrote that section and I certainly wasn't considering Cyrillic aspects
at that time (IIRC it was written even before Mike did the major update
for transliteration rules at the end of 2015). The context back then was
mostly about handling Latin letters like Å, Ä, Ö, Ø, etc.

> So again, you are asking to have ISO 9:1995 System A, but the bug is
> about ISO 9:1995 System B (GOST 7.79-2000)

We can certainly decide here what the best course of action is; we do not
have to slavishly follow some old bug report when deciding the direction
for the implementation. But I think I've made my position clear by now
so I'm not going to repeat it anymore.

In any case, once your patch lands I'm going to submit a follow-up patch
for fi_FI to make it compliant with the applicable national standard
(SFS 4900), which defines how to do Cyrillic transliteration /
transcription in the context of Finnish.

Thanks,

-- 
Marko Myllynen


* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-19  7:13         ` Marko Myllynen
  2018-11-19  9:21           ` Egor Kobylkin
@ 2018-12-01 22:07           ` Rafal Luzynski
  2018-12-01 22:53             ` Egor Kobylkin
  2018-12-03 22:19             ` Egor Kobylkin
  1 sibling, 2 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-01 22:07 UTC (permalink / raw)
  To: Marko Myllynen, Egor Kobylkin, libc-alpha, libc-locales

19.11.2018 08:13 Marko Myllynen <myllynen@redhat.com> wrote:
> [...]
> Given the number of questions above I think the way forward is to try
> to follow the relevant standards as closely as possible and also check
> what the other implementations (i.e., uconv(1)) do. For example,
> checking the earlier mentioned case may or may not give some hints:
> 
> $ echo Шема  | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Šema
> $ echo Схема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Shema
> $ uconv -V
> uconv v2.1  ICU 50.1.2

I've played a little with uconv and unfortunately it does not look good
to me.

It does not have any fallback transliteration to plain ASCII.  When it says
that 'Ш' is transliterated to 'Š' then it always uses 'Š', and if the target
charset does not have this character then the conversion fails:

$ echo Шема  | uconv -f UTF-8 -t ASCII -x cyrillic-latin
Conversion from Unicode to codepage failed at output byte position 0.
Unicode: 0160 Error: Invalid character found
$ echo Шема  | uconv -f UTF-8 -t ISO-8859-1 -x cyrillic-latin
Conversion from Unicode to codepage failed at output byte position 0.
Unicode: 0160 Error: Invalid character found
$ echo Шема  | uconv -f UTF-8 -t ISO-8859-2 -x cyrillic-latin
�ema
$ echo Шема  | uconv -f UTF-8 -t ISO-8859-2 -x cyrillic-latin \
    | uconv -f ISO-8859-2 -t UTF-8
Šema

It seems to follow ISO 9 (GOST 7.79) System A.  However, the transliteration
of the hard sign is rather strange:

$ echo нъе  | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
nʺe

The above was correct but:

$ echo НЪЕ  | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin          
Nʺ̱E
$ echo Ъ  | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
ʺ̱
$ echo Ъ  | uconv -f UTF-8 -t UTF-16 -x cyrillic-latin| hexdump -x
0000000    feff    02ba    0331    000a                                
0000008

So this generates:
02BA  MODIFIER LETTER DOUBLE PRIME
0331  COMBINING MACRON BELOW
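
The same identification can be cross-checked without hexdump. This is only a small sketch using Python's standard `unicodedata` module; `describe` is a hypothetical helper name, not part of uconv or glibc.

```python
import unicodedata

def describe(text):
    """Return one 'XXXX  NAME' line per code point in text."""
    return [f"{ord(ch):04X}  {unicodedata.name(ch)}" for ch in text]

# The two code points that uconv emitted for the uppercase hard sign.
for line in describe("\u02ba\u0331"):
    print(line)
```

Feeding it the transliterated output of any other letter shows in the same way whether the result contains combining marks.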

There are more transliteration methods, for example Russian-Latin/BGN:

$ echo Шема  | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
Shema
$ echo Схема  | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
Skhema

Converting 'х' to 'kh' seems to be common in English transliteration but
it does not follow any ISO standard.

$ echo ХА ха | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
KHA kha

This means that the choice whether a digraph in the output should be
all uppercase or maybe upper+lower is context based, something which we
probably cannot implement.  But definitely a good thing.

Two more tests:

$ echo Ещё | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
Yeshchë
$ echo Ещё | uconv -f UTF-8 -t ASCII -x Russian-Latin/BGN
Conversion from Unicode to codepage failed at output byte position 6.
Unicode: 00eb Error: Invalid character found

So the output is not plain ASCII.

$ echo е же ле не | uconv -f UTF-8 -t ASCII -x Russian-Latin/BGN
ye zhe le ne

Again this means that transliteration of 'е' is context based:
it is 'ye' in the beginning of a word and 'e' otherwise.
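
To make the context dependence concrete, here is a rough sketch of the rule as stated above ('ye' at the beginning of a word, 'e' otherwise). It is not the ICU implementation; `TABLE` and `translit` are hypothetical and cover only the letters of the example.

```python
import re

# Hypothetical, deliberately tiny table: only the letters used in the example.
TABLE = {"е": "e", "ж": "zh", "л": "l", "н": "n"}

def translit(text):
    out = []
    # Split into runs of word characters and runs of everything else,
    # so that the position of a letter inside its word is known.
    for match in re.finditer(r"\w+|\W+", text):
        chunk = match.group()
        for i, ch in enumerate(chunk):
            if ch == "е" and i == 0:
                out.append("ye")   # word-initial 'е'
            else:
                out.append(TABLE.get(ch, ch))
    return "".join(out)

print(translit("е же ле не"))
```

The point is that the rule needs to know where a letter sits in its word, which a plain character-to-string table cannot express.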

The version which I've tested:

$ uconv -V
uconv v2.1  ICU 60.2

It seems that uconv will not be a good hint about transliterating
to plain ASCII.

Also, the difference between uconv and iconv is that we can provide
multiple transliterations for any source character but we can't group
them into standards so we can't tell iconv to use this or another
system.  It will just choose the best fitting the current output
character set and the only thing we can choose is the locale.

This makes me think: should we add a locale like ru_RU@SystemA or
ru_RU@SystemB?

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-01 22:07           ` Rafal Luzynski
@ 2018-12-01 22:53             ` Egor Kobylkin
  2018-12-03 22:19             ` Egor Kobylkin
  1 sibling, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-01 22:53 UTC (permalink / raw)
  To: Marko Myllynen, libc-alpha, libc-locales

[-- Attachment #1: Type: text/plain, Size: 3084 bytes --]

On 01.12.18 23:07, Rafal Luzynski wrote:
> 
> Also, the difference between uconv and iconv is that we can provide
> multiple transliterations for any source character but we can't group
> them into standards so we can't tell iconv to use this or another
> system.  It will just choose the best fitting the current output
> character set and the only thing we can choose is the locale.
> 
> This makes me think: should we add a locale like ru_RU@SystemA or
> ru_RU@SystemB?

Wouldn't that require creating three versions of every locale that
includes the translit_cyrillic file? I.e. en_US + en_US@SystemA,
en_US@SystemB etc.?

This in turn will make two of them optional (as cyrillic fonts are at
the moment). The highest value is in having the default locale being
able to transliterate, isn't it? So putting the transliteration to
optional locales kind of defeats the purpose.

An example from my experience as a user - a networked device or host
would often have the en_US as the default (only?) locale with no viable
way to change it or install cyrillic fonts. Anyway, this is the most
dire situation where the ASCII transliteration certainly helps most.
Having en_US@SystemA or en_US@SystemB theoretically available but not
compiled by the distributor wouldn't help here, would it?

So the only useful scenario here would be to ship your locales with the
transliteration already included by default in en_US. This way the
distributor won't have to get active to include transliteration as
en_US@SystemA or en_US@SystemB.

From my (however limited) point of view it is better to get System B
in first, then see whether some code needs to be changed to accommodate
the System A/System B problem. Again, System B is _transcription_ to
ASCII and System A is _transliteration_ to Latin, with different use cases.

It's insightful to see your comparison of the uconv vs. iconv!
Similar to your checks this is what I was using to see whether any
locale fails the transliteration for any cyrillic letter:

echo
"ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ
ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"|
LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
iconv -f UTF-8 -t ASCII//TRANSLIT

should give (can be asserted with bash string comparison):

AaOoUussYODJG`YeZ`IYiJL`N`TSHK`U`DhABVGDEZHZIJKLMNOPRSTUUFHCCHSHSHHA`Y`E`YUYAabvgdezhzijklmnoprstuufhcchshshh``y`e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FhfhYhyhE`e`
G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'

And I am attaching another file that has the Unicode code points next to
the letters for easier identification of failures (like "U0401-Ё
U0402-Ђ U0403-Ѓ" etc.). I hope it will be helpful in creating the tests.

Best regards,
Egor Kobylkin

[-- Attachment #2: translit-test-input.txt --]
[-- Type: text/plain, Size: 2249 bytes --]

CYRILLIC RUSSIAN Съешь ещё этих мягких французских булок, да выпей же чаю. СЪЕШЬ ЕЩЁ ЭТИХ МЯГКИХ ФРАНЦУЗСКИХ БУЛОК? ДА ВЫПЕЙ ЖЕ ЧАЮ!
CYRILLIC COMPLETE U0401-Ё U0402-Ђ U0403-Ѓ U0404-Є U0405-Ѕ U0406-І U0407-Ї U0408-Ј U0409-Љ U040A-Њ U040B-Ћ U040C-Ќ U040E-Ў U040F-Џ U0410-А U0411-Б U0412-В U0413-Г U0414-Д U0415-Е U0416-Ж U0417-З U0418-И U0419-Й U041A-К U041B-Л U041C-М U041D-Н U041E-О U041F-П U0420-Р U0421-С U0422-Т U0423-У U0423 0301-У́ U0424-Ф U0425-Х U0426-Ц U0427-Ч U0428-Ш U0429-Щ U042A-ъ U042B-Ы U042C-ь U042D-Э U042E-Ю U042F-Я U0430-а U0431-б U0432-в U0433-г U0434-д U0435-е U0436-ж U0437-з U0438-и U0439-й U043A-к U043B-л U043C-м U043D-н U043E-о U043F-п U0440-р U0441-с U0442-т U0443-у U0443 0301-у́ U0444-ф U0445-х U0446-ц U0447-ч U0448-ш U0449-щ U044A-Ъ U044B-ы U044C-Ь U044D-э U044E-ю U044F-я U0451-ё U0452-ђ U0453-ѓ U0454-є U0455-ѕ U0456-і U0457-ї U0458-ј U0459-љ U045A-њ U045B-ћ U045C-ќ U045E-ў U045F-џ U046A-Ѫ U046B-ѫ U0472-Ѳ U0473-ѳ U0474-Ѵ U0475-ѵ U048C-Ҍ U048D-ҍ U0490-Ґ U0491-ґ U0492-Ғ U0493-ғ U0494-Ҕ U0495-ҕ U0496-Җ U0497-җ U049A-Қ U049B-қ U049E-Ҟ U049F-ҟ U04A2-Ң U04A3-ң U04A4-Ҥ U04A5-ҥ U04A6-Ҧ U04A7-ҧ U04A8-Ҩ U04A9-ҩ U04AA-Ҫ U04AB-ҫ U04AC-Ҭ U04AD-ҭ U04AE-Ү U04AF-ү U04B2-Ҳ U04B3-ҳ U04B4-Ҵ U04B5-ҵ U04BA-Һ U04BB-һ U04BC-Ҽ U04BD-ҽ U04BE-Ҿ U04BF-ҿ U04C0-Ӏ U04C1-Ӂ U04C2-ӂ U04CB-Ӌ U04CC-ӌ U04D0-Ӑ U04D1-ӑ U04D2-Ӓ U04D3-ӓ U04D6-Ӗ U04D7-ӗ U04D8-Ә U04D9-ә U04DC-Ӝ U04DD-ӝ U04DE-Ӟ U04DF-ӟ U04E0-Ӡ U04E1-ӡ U04E4-Ӥ U04E5-ӥ U04E6-Ӧ U04E7-ӧ U04E8-Ө U04E9-ө U04F0-Ӱ U04F1-ӱ U04F2-Ӳ U04F3-ӳ U04F4-Ӵ U04F5-ӵ U04F8-Ӹ U04F9-ӹ U2019-’
GREEK Ελληνικό Ίδρυμα Ευρωπαϊκής και Εξωτερικής.
GERMAN Zwölf Boxkämpfer jagen Victor quer über den großen Sylter Deich.
FRENCH Dès Noël où un zéphyr haï me vêt de glaçons würmiens je dîne d’exquis rôtis de bœuf au kir à l’aÿ d’âge mûr \& cætera.
SPANISH El veloz murciélago hindú comía feliz cardillo y kiwi, la cigüeña tocaba el saxofón detrás del palenque de paja.
END


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-01 22:07           ` Rafal Luzynski
  2018-12-01 22:53             ` Egor Kobylkin
@ 2018-12-03 22:19             ` Egor Kobylkin
  2018-12-08  1:15               ` Rafal Luzynski
  1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-03 22:19 UTC (permalink / raw)
  To: libc-alpha, libc-locales; +Cc: Marko Myllynen

Rafal,

Just to touch base on this, what is the best way forward? Did you get
any input/feedback on your questions below? Are you expecting input from
anyone but myself?

On the blocking issue #2: I really don’t see the connection to the uk_UA
locale that has its transliteration table inline and is explicitly
excluded from my patch. It may be revealing  another issue you have with
glibc but wouldn’t that be better addressed in a new bug?
Again, in the v10 of my patch I have removed multicharacter source
graphemes, so that issue is moot there.

If you’d like to overhaul the glibc translit system, wouldn’t it be
better to commit the simple text file with the Cyrillic
translit (transcription) table first, fix the bug from the year 2006 and
then proceed from there with all due diligence?

The same with having both System A and System B.  Initially I went along
with the suggestion to include the system A but it is clear now that it
doesn’t make fixing [BZ #2872] more straightforward. So I’d also propose
to set it aside for the moment and use the v10 without the system A.
That is the whole reason I have submitted it, to be superclear on that.

Now you saw that uconv is transcribing «ХА» as KHA (cap/cap/cap) that
should mitigate your concern about that issue too (somewhat, anyway).
Making it context based would also be about adding new code, see above.

Let me know if there’s anything I can do to help move the decision
forward.

Bests,
Egor


On 16.11.18 23:17, Rafal Luzynski wrote:

> 2. I made few tests in the command line and it seems to me that the 
> transliteration from "З" to "Z" (+ lowercase as well) in uk_UA does 
> not work and has not been working for some time already because I've
> checked some older systems as well and the result is always the same.
> I think that the reason is that uk_UA defines multiple 
> transliteration rules for "З" depending on what is the letter
> following it.  It does not seem to work.  AFAIK the reason is that
> the syntax of transliteration rules says that a single non-Latin
> character may map one or more Latin strings, each consisting of one
> or more characters. There cannot be a rule transliterating multiple
> source characters into one or multiple destination characters.  Is it
> a bug in transliteration implementation?  Or maybe in the
> specification, including POSIX standard?
> The definition of transliteration says that it is one-to-one mapping 
> of graphemes while a grapheme may be one or multiple characters. It
> does not have to be always mapping one-to-one character.  Should we 
> fix this bug first, make uk_UA transliteration work, and only then 
> add a generic Cyrillic transliteration?  Egor's patch already
> contains transliteration of "У" + combining acute accent to "Ú" which
> most probably will not work.
> 
> I still think that in the longer term all existing custom
> transliterations of Cyrillic alphabets should be ported to a
> modification of your patch.


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-11-19 11:10   ` [PATCH v10] " Egor Kobylkin
@ 2018-12-07 23:35     ` Rafal Luzynski
  2018-12-08 21:51       ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-07 23:35 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales

19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872

I'm in favor of implementing System A and dropping System B instead.
If I understand correctly, System A is actually ISO 9, therefore it is
international, universal, and neutral, while System B is a GOST standard
and therefore used mainly in Russia (although it has been adopted in
several other countries as well).

It's true that we can't handle both System A and System B.  What we
would like to have is:


             System A
          /============> OUTPUT: Latin with diacritics
  INPUT <    System B
          \============> OUTPUT: Plain ASCII (fallback)

That means: use one system but if the output can't handle it then switch
to another system.

But what we can actually have is either:

          System A                                    Fallback
  INPUT ============> OUTPUT: Latin with diacritics ============> Plain ASCII

or:

          System B
  INPUT ============> OUTPUT: Plain ASCII

That means, we can only provide a fallback for individual characters,
we can't provide a fallback algorithm (that is, we can't switch to
transliterating 'Х' as 'X' instead of 'H' just because we can't
transliterate 'Ш' as 'Š' and switch to 'SH' instead).
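
A rough model of this per-character fallback, as a sketch only: `RULES`, `encodable` and `translit` are hypothetical names, the table holds just the letters needed here, and '?' for an unmappable character is an assumption modelled on the question marks seen in the bug report.

```python
# Hypothetical rule table: each source character lists its candidates in
# order of preference, System A first and an ASCII fallback second.
RULES = {
    "Ш": ["Š", "SH"],
    "Х": ["H"],
    "е": ["e"], "м": ["m"], "а": ["a"],
}

def encodable(s, charset):
    """True if s can be represented in the target charset."""
    try:
        s.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

def translit(text, charset):
    out = []
    for ch in text:
        candidates = RULES.get(ch, [ch])
        # The choice is made per character; nothing here can see which
        # candidate was picked for any other character.
        out.append(next((c for c in candidates if encodable(c, charset)), "?"))
    return "".join(out)

print(translit("ШХ", "iso-8859-2"))
print(translit("ШХ", "ascii"))
```

With ASCII as the target, 'Ш' falls back to 'SH' while 'Х' still comes out as 'H'; there is no place in such a table to say "if 'Ш' had to fall back, use 'X' for 'Х'".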

Wouldn't it be better to implement ISO 9 (System A) instead and provide
a fallback ASCII transliteration which could be similar but not identical
to System B?  Is it necessary to provide plain ASCII transliteration
conforming to System B even if that means that we would have to give up
implementing System A?  If yes, would it be correct to provide System B
for ru_RU (and maybe few more locales) but include System A in all other
locales (except few which we exclude already)?

> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics

OK, thank you, I like this change.

> [...]
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B represents what is actually called transcription (preserving
> phonemes), while System A is the transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes converting Cyrillic to
> ASCII and thus the System B is chosen. [11]

I'm not sure it should be actually called transcription.  IIUC,
transcription reflects pronunciation, something we can't easily implement
in glibc.
As long as we convert letters to letters (or group of letters to group
of letters) without taking pronunciation into account it should be
called transliteration.  OTOH, I agree that it is rather uncommon in
Russian language to find an example where pronunciation is not perfectly
reflected in spelling.

> +% Generated from UnicodeData.txt with a spreadsheet referenced
> +% in that bugs doclet

The previous versions of your patch had "in that bug's doclet" here
which I think is correct.

I like version 9 of your patch more so I'm going to write a more
thorough review of it.

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-03 22:19             ` Egor Kobylkin
@ 2018-12-08  1:15               ` Rafal Luzynski
  2018-12-10 21:20                 ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-08  1:15 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales; +Cc: Marko Myllynen

17.11.2018 19:34 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Looks like we have three issues:
> 1. lack of explicit control which transformation to use (System A or
> System B) via //TRANSLIT
> 2. possibility of collision for System B if used CAP/low transcription
> for capital letters
> 3. Cyrillic 'Х'/'х' (ha) never transcribes to 'H'/'h' as it should per
> System B because its equivalent 'X'/'x' from System A is always present
> and takes precedence.

True.

> As a solution shouldn't we only keep System B in a new file
> transcribe_cyrillic and put it in place as the explicit ASCII
> transcription for targeted locales (as opposed to transliteration)?
> 
> We would keep System A as translit_cyrillic but won't include it into
> this patch. Once you have resolved an issue of having two conflicting
> rule-sets but only one key //TRANSLIT you could add the System A back.

Sounds like a good idea to provide those two files:

* translit_cyrillic_system_a,
* translit_cyrillic_system_b,

(or any other pair of names) and let the individual locales choose whether
they want to include System A or System B.  For optimization, system_b
file could include system_a and modify it.

> The SH/Sh can be decided on either way - seems like an easy change any
> way.

I'm in favor of "Sh" because it will work fine for titlecased words
(where only the first letter is uppercase) but I'm aware it would be
a problem for uppercased words.  Unfortunately, I think we are unable
to satisfy both cases.

> On 16.11.18 23:17, Rafal Luzynski wrote:
> 
> > Egor, while at this I was thinking about your idea to transliterate
> > letters like "Ш" (uppercase) to "SH" (always uppercase) in order to
> > distinguish between "Шема" (-> "SHema") and "Схема" (-> "Shema" or
> > "Sxema").
> 
> to clarify, this SH/Sh collision issue relates only to iconv -f UTF-8 -t
> ASCII//TRANSLIT (i.e. System B transcription).

True.

> But it's not only SH/Sh, there are following combinations used to
> transcribe capital letters:
> 
> YO, DJ, YE, TSH, DH, ZH, CZ, CH, SH, SHH, YU, YA, FH, YH, GH, NG, TCZ

Absolutely true.  I skipped the whole list only for brevity: if we
find a solution for one letter, the same solution will work fine for
all the others.

> [...]
> With transcription we are basically stripping information from the data,
> mapping it into a smaller character set. The idea to keep them in
> CAP/CAP is to try to preserve as much information as possible.

I'm only afraid that things like "TWo CApitals" or "CamelCase" are
common among us computer geeks while they do not look great when
working with natural language and when displaying them to regular users
and even non-computer people.

> [...]
> So in fact we have two rules for each letter in the same file (System A
> and System B), where System A takes precedence.
> 
> I have a question then: isn't this more like a hack than a right thing
> to do?
> 
> Shouldn't we have two explicit rules for transcription and
> transliteration not dependent on a destination character set?

It's impossible with the current API of iconv.  Maybe it will become
possible in the future, but that is a greater amount of work than what
we are doing here now.  Again, for now a different set of rules = a
different locale.

I have another question: is it really a job of transliteration to preserve
all original information, to ensure no collisions and have the ability to
restore the original text?  I'm afraid that as long as plain ASCII is the
destination charset whatever system we provide it will always be possible
to provide a malicious combination of the Cyrillic characters proving that
the system generates collisions.
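
The "Шема" / "Схема" case mentioned earlier in the thread shows such a collision. A minimal sketch, assuming a hypothetical per-character table with upper+lower digraphs; `TABLE`, `translit` and `collisions` are illustrative names only.

```python
# Hypothetical table: 'Ш' becomes the digraph "Sh", 'х' becomes 'h'.
TABLE = {"Ш": "Sh", "С": "S", "х": "h", "е": "e", "м": "m", "а": "a"}

def translit(word):
    return "".join(TABLE.get(ch, ch) for ch in word)

def collisions(words):
    """Group source words by their ASCII output; keep only the clashes."""
    seen = {}
    for w in words:
        seen.setdefault(translit(w), []).append(w)
    return {out: src for out, src in seen.items() if len(src) > 1}

print(collisions(["Шема", "Схема"]))
```

Both words come out as "Shema", so the original text cannot be restored from the output alone.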

> > I still don't like the idea to
> > put two uppercase letters in a beginning of a word in titlecase only
> > to indicate that there was originally a single letter.  What if we:
> > 
> > * drop the rule of transliterating "Х" to "H" and transliterate
> > always to "X",
> This would contradict ISO 9.1995. (System A).

Yes, it would.  I'm trying to find solution here since I think we have
proved that we can't implement a system which will handle System A,
System B, and ensure no collisions at the same time.  At least one
requirement must be dropped (at least partially).

> System A was added on Marko's request (so setting him on TO:) I am
> neutral on keeping it or dropping it, just to be clear.

I don't think I saw Marko's request, but I'm in favor of keeping
System A, too.

Marko, it would be good to hear your opinion about System A vs. System B
again.

> [...]
> On the other hand, for my personal needs I care less about standards but
> about current functionality and data loss because of missing
> transcription altogether due to the BZ #2872.

I read this that you are open to a solution which is inspired by some
standards but does not implement them fully due to our technical
limitations.


19.11.2018 10:21 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Marko,
> 
> Your example only covers _transliteration_ to Latin Diacritics
> [...]
> while BZ #2872 is about _transcription_ to ASCII
> [...]
> 
> So again, you are asking to have ISO 9.1995. System A but the bug is
> about ISO 9.1995. System B (GOST 7.79-2000)

It's hard to say what the original bug reporter meant but I think that the
problem is that there is no transliteration from Cyrillic to any variant of
Latin, except in a few locales.  If System A were implemented but System B
were not, then at least some characters would be handled correctly.  Currently no
Cyrillic characters are handled.


19.11.2018 20:35 Marko Myllynen <myllynen@redhat.com> wrote:
> [...]
> In any case once your patch lands I'm going to submit a follow-up patch
> for fi_FI to make it compliant with the applicable national standard
> (SFS 4900) which defines how to do Cyrillic transliteration /
> transcription in the context of Finnish.

I totally agree.  As far as I can see, SFS 4900 is more similar to
System A (ISO 9) rather than System B, that is, it transliterates to Latin
characters with diacritics rather than plain ASCII.  Marko, what is your
opinion about possible implementation of SFS 4900 in these cases:

* When the destination charset does not contain required Latin diacritic
  characters (e.g., it is plain ASCII)?
* When the output is ambiguous, that means, when two different Cyrillic
  strings produce the same Latin (or ASCII) output?

I am asking not out of curiosity about SFS 4900 itself, but because we are
facing the same problems now with ISO 9 and GOST 7.79.


1.12.2018 23:07 Rafal Luzynski <digitalfreak@lingonborough.com> wrote:
> [...]
> $ echo ХА ха | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
> KHA kha
> 
> This means that the choice whether a digraph in the output should be
> all uppercase or maybe upper+lower is context based, something which we
> probably cannot implement.  But definitely a good thing.

I forgot to include this test which is really interesting:

$ echo ХА Ха ха | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN    
KHA Kha kha

which again confirms that the choice of all uppercase or just the first
letter uppercased is context based, a thing which we can't implement now.
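
For illustration, a context-based casing rule of this kind might be sketched as below. This is not how uconv is implemented; `DIGRAPHS`, `SINGLE` and `translit` are hypothetical, and the neighbour test is my own guess at a rule that reproduces the output above.

```python
# Hypothetical tables covering only the letters of the example.
DIGRAPHS = {"х": "kh"}
SINGLE = {"а": "a"}

def translit(text):
    out = []
    for i, ch in enumerate(text):
        low = ch.lower()
        latin = DIGRAPHS.get(low) or SINGLE.get(low, ch)
        if ch.isupper():
            nxt = text[i + 1] if i + 1 < len(text) else ""
            prev = text[i - 1] if i > 0 else ""
            # All caps when the neighbouring letters are uppercase too,
            # otherwise only the first Latin letter is capitalised.
            if nxt.isupper() or (not nxt.isalpha() and prev.isupper()):
                latin = latin.upper()
            else:
                latin = latin.capitalize()
        out.append(latin)
    return "".join(out)

print(translit("ХА Ха ха"))
```

The lookup has to inspect the characters around the one being converted, which is exactly what a one-character-at-a-time rule table does not allow.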


1.12.2018 23:53 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> On 01.12.18 23:07, Rafal Luzynski wrote:
> > 
> > [...]
> > This makes me think: should we add a locale like ru_RU@SystemA or
> > ru_RU@SystemB?
> 
> Wouldn't it require to create 3 versions of every locale that would
> include the translit_cyrillic file then? I.e. en_US + en_US@SystemA,
> en_US@SystemB etc.?

OK, please read this as another brainstorming idea and let's just
forget it.

> [...]
> An example from my experience as a user - a networked device or host
> would often have the en_US as the default (only?) locale with no viable
> way to change it or install cyrillic fonts. Anyway, this is the most
> dire situation where the ASCII transliteration certainly helps most.
> Having en_US@SystemA or en_US@SystemB theoretically available but not
> compiled by the distributor wouldn't help here, would it?
> 
> So the only useful scenario here would be to ship your locales with the
> transliteration already included by default in en_US. This way the
> distributor won't have to get active to include transliteration as
> en_US@SystemA or en_US@SystemB.

Having the idea of "@SystemA" and "@SystemB" dropped I don't think
implementing any solution in glibc would be helpful for your use case.
Three reasons:

1. I believe that sooner or later someone will develop a transliteration
   system for en_US which will follow English transliteration of Russian
   instead of any standard we are discussing here.  That means, it would
   transliterate 'Х' as 'Kh' rather than 'H' or 'X'.
2. Currently there is a trend not to install even en_US locales and leave
   only C which is hardcoded into glibc binaries.  OTOH, I wouldn't mind
   if ISO 9 was hardcoded into C as well.
3. This goes beyond the Russian language, but transliteration according to Serbian
   or Bulgarian or Ukrainian or Kazakh rules still requires installing their
   proper locales.  I think that requiring ru_RU to be installed could be
   reasonable especially if we end up with ru_RU somehow differing from
   the default "translit_cyrillic".

BTW you don't need Cyrillic fonts to be installed on your server in order
to process the Cyrillic text correctly unless your server renders the text.


3.12.2018 23:19 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> Rafal,
> 
> Just to touch base on this, what is the best way forward? Did you get
> any input/feedback on your questions below? Are you expecting input from
> anyone but myself?

Yes, I expected some input from more experienced maintainers about whether
and how to write the tests but I'd rather start another thread about it
because this one is too long already.

> On the blocking issue #2: I really don’t see the connection to the uk_UA
> locale that has its transliteration table inline and is explicitly
> excluded from my patch. It may be revealing  another issue you have with
> glibc but wouldn’t that be better addressed in a new bug?

OK, I was not precise enough (I'm sorry about it) so I'd like to explain
here:

1. As a long-term goal I would like to convert those excluded locales
   to use your translit_cyrillic as well.
2. In order to ensure that the change is not destructive for them I will
   need automatic tests to prove that their transliteration rules work
   equally well before and after the change.
3. It does not matter that converting those other locales is in the distant
   future because we need the same tests for the Russian language now.
4. Even though I have not started writing any tests I can see they
   will be failing for uk_UA.  The reason is that glibc transliteration
   rules can handle transliterating single characters into single
   characters and single characters into multiple characters, but not
   multiple characters into multiple (or even single) characters.
5. We can ignore uk_UA but we will face the same case in ru_RU where
   you had a case of 'У́ ' ('У' + 'COMBINING ACUTE ACCENT').
6. So the question was: how (and whether) to write the tests if we
   already know they would be failing?  Skip them?  Resolve the other
   issue first?  Mark them as XFAIL?
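
To illustrate points 4 and 5, here is a small sketch of why a per-character table cannot cover 'У' with the acute accent. `TABLE` and `translit` are hypothetical, and the '?' for an unmapped code point is an assumption modelled on the question marks in the bug report.

```python
import unicodedata

seq = "\u0423\u0301"   # CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT

# Hypothetical single-code-point table, the only kind of rule that works.
TABLE = {"\u0423": "U"}

def translit(text):
    # Every code point is looked up on its own; a two-code-point
    # sequence can never be matched as a unit.
    return "".join(TABLE.get(ch, "?") for ch in text)

# NFC normalisation does not help: the pair stays two code points.
print(len(unicodedata.normalize("NFC", seq)))
print(translit(seq))
```

The base letter is converted, but the combining accent is treated as a separate, unmapped character, so a test expecting a single accented Latin letter would fail.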

In the meantime, you have removed the controversial conversion rule
of 'У' with the acute accent:

> Again, in the v10 of my patch I have removed multicharacter source
> graphemes, so that issue is moot there.

so we can move to the next step.

> If you’d like to overhaul the glibc translit system wouldn’t it be
> better to commit the simple text file with the Cyrillic
> translit(transcription) table first, fix the bug from the year 2006 and
> then proceed from there all due diligence?

I agree and we are now one step forward.

> The same with having both System A and System B.  Initially I went along
> with the suggestion to include the system A but it is clear now that it
> doesn’t make fixing [BZ #2872] more straightforward. So I’d also propose
> to set it aside for the moment and use the v10 without the system A.
> That is the whole reason I have submitted it, to be superclear on that.

OK, I think that now I understand your reason to drop System A better.
But still I'd like to rethink implementing System A somehow and drop
(or rather: implement only partially) System B.

> Now you saw that uconv is transcribing «ХА» as KHA (cap/cap/cap) that
> should mitigate your concern about that issue too (somewhat, anyway).
> Making it context based would also be about adding new code, see above.

It would also require the changes in the syntax of the source code
of locale data and possibly breaking the POSIX compatibility which
I think would be unacceptable.

> Let me know if there’s anything I can help with getting more progress
> with the decision

I'm afraid you can't help more.  I'd like to hear some feedback from other
people.  Due to some minor obstacles, the two of us cannot resolve this
issue alone.

Regards,

Rafal

* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-07 23:35     ` Rafal Luzynski
@ 2018-12-08 21:51       ` Egor Kobylkin
  2018-12-19 22:41         ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-08 21:51 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Dmitry V. Levin, Marko Myllynen,
	mfabian

Rafal, Dmitry, Marko, Mike

On 08.12.18 00:35, Rafal Luzynski wrote:
> 19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
>> 
>> Changelog v10: * Removed ISO 9.1995 GOST 7.79-2000 System A
>> (transliteration to Latin with diacritics) as conflicting with
>> System B within glibc mechanics and not solving BZ #2872
> 
> I'm in favor of implementing System A and dropping System B instead.

The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
fails". The ISO 9 System A does not map to ASCII so it is not a solution
to BZ #2872 at all.

I was scratching my head as to how we can avoid the explosion of the
scope for this patch. And then it occurred to me that it was wrong to
target all the present locales for the ASCII translit. This seems to be
the root cause of these prolonged A vs. B discussions. The proper target
for my table is actually the C locale translit file
(locale/C-translit.h.in). I will submit a proper patch shortly.

If anyone wants to keep working on the implementation of the Latin
Diacritics transliteration of the Cyrillic letters (System A) you are
welcome to use the tables I have submitted before (v9). That would be a
new feature for glibc as per my understanding. Let's just make the
distinction super clear between System A (Latin with diacritics,
non-ASCII) and the ASCII translit mentioned in BZ #2872 (System B).

My focus is super sharp on helping with Cyrillic -> ASCII translit
availability for a default installation with glibc.

Hope this helps,
Egor

* [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (10 preceding siblings ...)
  2018-11-19 11:10   ` [PATCH v10] " Egor Kobylkin
@ 2018-12-08 22:28   ` Egor Kobylkin
  2018-12-19 23:16     ` Egor Kobylkin
  2019-01-02 18:38   ` [PATCH v12] " Egor Kobylkin
  2019-03-19 10:39   ` ping " Egor Kobylkin
  13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-08 22:28 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Marko Myllynen, mfabian,
	Dmitry V. Levin

[-- Attachment #1: Type: text/plain, Size: 5665 bytes --]

Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.

Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics

Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch

Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.

Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration rows to locale/C-translit.h.in.

The patch is attached.
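
As a rough illustration of the row format used in the attached patch,
the sketch below (Python, not part of the patch) parses a few rows copied
from it into a code point -> replacement mapping.  The regular expression
and the helper name parse_rows are my own assumptions and only cover the
simple single-code-point rows shown here.

```python
import re

# Rows in the style of locale/C-translit.h.in, copied from the patch.
# The real file separates fields with tabs; whitespace is matched loosely.
ROWS = [
    r'"\x0401" "YO" /* <U0401> CYRILLIC CAPITAL LETTER IO */',
    r'"\x0416" "ZH" /* <U0416> CYRILLIC CAPITAL LETTER ZHE */',
    r'"\x0449" "shh" /* <U0449> CYRILLIC SMALL LETTER SHCHA */',
]

# Quoted hex code point, whitespace, quoted replacement string.
ROW_RE = re.compile(r'^"\\x([0-9a-fA-F]+)"\s+"([^"]*)"')

def parse_rows(rows):
    """Return {character: ASCII replacement} for every row that matches."""
    table = {}
    for row in rows:
        match = ROW_RE.match(row)
        if match:
            table[chr(int(match.group(1), 16))] = match.group(2)
    return table

TABLE = parse_rows(ROWS)
for char, replacement in TABLE.items():
    print(f"U+{ord(char):04X} -> {replacement}")
```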


Current bug effect:

The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:

iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- it produces a string of question marks and spaces.

This is what it should produce and it does so after the patch applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
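
A small Python sketch (not glibc code) that applies the System B
replacements to the first part of the test sentence.  The dictionary holds
only the letters needed for this fragment, with values matching the
expected output above; the helper name and the '?' fallback for unmapped
characters are assumptions that merely imitate the behaviour described.

```python
import unicodedata

# Subset of the System B replacements needed for the sample below.
SYSTEM_B = {
    "С": "S", "ъ": "``", "е": "e", "ш": "sh", "ь": "`", "щ": "shh",
    "ё": "yo", "э": "e`", "т": "t", "и": "i", "х": "x", "м": "m",
    "я": "ya", "г": "g", "к": "k", "ф": "f", "р": "r", "а": "a",
    "н": "n", "ц": "cz", "у": "u", "з": "z", "с": "s", "б": "b",
    "л": "l", "о": "o",
}

def translit(text, table=SYSTEM_B):
    """Replace mapped letters, keep ASCII, turn anything else into '?'."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in table:
            out.append(table[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            out.append("?")
    return "".join(out)

sample = "Съешь ещё этих мягких французских булок"
result = translit(sample)
print(result)  # S``esh` eshhyo e`tix myagkix franczuzskix bulok
```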


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here.


COMMIT MESSAGE:
This Cyrillic transliteration table enables conversion (e.g. with iconv) from
UTF-8 encoded text based on a Cyrillic alphabet to ASCII//TRANSLIT text.

Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription.

While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.

The patch content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].

The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes converting Cyrillic to
ASCII and thus System B is chosen [11]. To be super clear, System A
has nothing to do with this bug, regardless of it being a transliteration.

Those interested in implementing System A for transliteration of
Cyrillic to Latin with Diacritic as a new feature are welcome to use the
spreadsheet in [6] as a starting point.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3

Best regards,
Egor Kobylkin



[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 11581 bytes --]

From b9cd550028ecf7c875c9d7250c8598433b1fc474 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Sat, 8 Dec 2018 22:08:59 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 170 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index e27f39e8fe..bd64edc609 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -2,6 +2,7 @@
    Copyright (C) 2000-2018 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper <drepper@redhat.com>, 2000.
+   0401-04f9 contributed by Egor Kobylkin <Egor@Kobylkin.com>, 2018.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -56,6 +57,175 @@
 "\x02cd"	"_"	/* <U02CD> MODIFIER LETTER LOW MACRON */
 "\x02d0"	":"	/* <U02D0> MODIFIER LETTER TRIANGULAR COLON */
 "\x02dc"	"~"	/* <U02DC> SMALL TILDE */
+"\x0401"	"YO"	/* <U0401> CYRILLIC CAPITAL LETTER IO */
+"\x0402"	"DJ"	/* <U0402> CYRILLIC CAPITAL LETTER DJE */
+"\x0403"	"G`"	/* <U0403> CYRILLIC CAPITAL LETTER GJE */
+"\x0404"	"YE"	/* <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE */
+"\x0405"	"Z`"	/* <U0405> CYRILLIC CAPITAL LETTER DZE */
+"\x0406"	"I"	/* <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0407"	"YI"	/* <U0407> CYRILLIC CAPITAL LETTER YI */
+"\x0408"	"J"	/* <U0408> CYRILLIC CAPITAL LETTER JE */
+"\x0409"	"L`"	/* <U0409> CYRILLIC CAPITAL LETTER LJE */
+"\x040a"	"N`"	/* <U040A> CYRILLIC CAPITAL LETTER NJE */
+"\x040b"	"TSH"	/* <U040B> CYRILLIC CAPITAL LETTER TSHE */
+"\x040c"	"K`"	/* <U040C> CYRILLIC CAPITAL LETTER KJE */
+"\x040e"	"U`"	/* <U040E> CYRILLIC CAPITAL LETTER SHORT U */
+"\x040f"	"DH"	/* <U040F> CYRILLIC CAPITAL LETTER DZHE */
+"\x0410"	"A"	/* <U0410> CYRILLIC CAPITAL LETTER A */
+"\x0411"	"B"	/* <U0411> CYRILLIC CAPITAL LETTER BE */
+"\x0412"	"V"	/* <U0412> CYRILLIC CAPITAL LETTER VE */
+"\x0413"	"G"	/* <U0413> CYRILLIC CAPITAL LETTER GHE */
+"\x0414"	"D"	/* <U0414> CYRILLIC CAPITAL LETTER DE */
+"\x0415"	"E"	/* <U0415> CYRILLIC CAPITAL LETTER IE */
+"\x0416"	"ZH"	/* <U0416> CYRILLIC CAPITAL LETTER ZHE */
+"\x0417"	"Z"	/* <U0417> CYRILLIC CAPITAL LETTER ZE */
+"\x0418"	"I"	/* <U0418> CYRILLIC CAPITAL LETTER I */
+"\x0419"	"J"	/* <U0419> CYRILLIC CAPITAL LETTER SHORT I */
+"\x041a"	"K"	/* <U041A> CYRILLIC CAPITAL LETTER KA */
+"\x041b"	"L"	/* <U041B> CYRILLIC CAPITAL LETTER EL */
+"\x041c"	"M"	/* <U041C> CYRILLIC CAPITAL LETTER EM */
+"\x041d"	"N"	/* <U041D> CYRILLIC CAPITAL LETTER EN */
+"\x041e"	"O"	/* <U041E> CYRILLIC CAPITAL LETTER O */
+"\x041f"	"P"	/* <U041F> CYRILLIC CAPITAL LETTER PE */
+"\x0420"	"R"	/* <U0420> CYRILLIC CAPITAL LETTER ER */
+"\x0421"	"S"	/* <U0421> CYRILLIC CAPITAL LETTER ES */
+"\x0422"	"T"	/* <U0422> CYRILLIC CAPITAL LETTER TE */
+"\x0423"	"U"	/* <U0423> CYRILLIC CAPITAL LETTER U */
+"\x0424"	"F"	/* <U0424> CYRILLIC CAPITAL LETTER EF */
+"\x0425"	"X"	/* <U0425> CYRILLIC CAPITAL LETTER HA */
+"\x0426"	"CZ"	/* <U0426> CYRILLIC CAPITAL LETTER TSE */
+"\x0427"	"CH"	/* <U0427> CYRILLIC CAPITAL LETTER CHE */
+"\x0428"	"SH"	/* <U0428> CYRILLIC CAPITAL LETTER SHA */
+"\x0429"	"SHH"	/* <U0429> CYRILLIC CAPITAL LETTER SHCHA */
+"\x042a"	"``"	/* <U042A> CYRILLIC CAPITAL LETTER HARD SIGN */
+"\x042b"	"Y`"	/* <U042B> CYRILLIC CAPITAL LETTER YERU */
+"\x042c"	"`"	/* <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN */
+"\x042d"	"E`"	/* <U042D> CYRILLIC CAPITAL LETTER E */
+"\x042e"	"YU"	/* <U042E> CYRILLIC CAPITAL LETTER YU */
+"\x042f"	"YA"	/* <U042F> CYRILLIC CAPITAL LETTER YA */
+"\x0430"	"a"	/* <U0430> CYRILLIC SMALL LETTER A */
+"\x0431"	"b"	/* <U0431> CYRILLIC SMALL LETTER BE */
+"\x0432"	"v"	/* <U0432> CYRILLIC SMALL LETTER VE */
+"\x0433"	"g"	/* <U0433> CYRILLIC SMALL LETTER GHE */
+"\x0434"	"d"	/* <U0434> CYRILLIC SMALL LETTER DE */
+"\x0435"	"e"	/* <U0435> CYRILLIC SMALL LETTER IE */
+"\x0436"	"zh"	/* <U0436> CYRILLIC SMALL LETTER ZHE */
+"\x0437"	"z"	/* <U0437> CYRILLIC SMALL LETTER ZE */
+"\x0438"	"i"	/* <U0438> CYRILLIC SMALL LETTER I */
+"\x0439"	"j"	/* <U0439> CYRILLIC SMALL LETTER SHORT I */
+"\x043a"	"k"	/* <U043A> CYRILLIC SMALL LETTER KA */
+"\x043b"	"l"	/* <U043B> CYRILLIC SMALL LETTER EL */
+"\x043c"	"m"	/* <U043C> CYRILLIC SMALL LETTER EM */
+"\x043d"	"n"	/* <U043D> CYRILLIC SMALL LETTER EN */
+"\x043e"	"o"	/* <U043E> CYRILLIC SMALL LETTER O */
+"\x043f"	"p"	/* <U043F> CYRILLIC SMALL LETTER PE */
+"\x0440"	"r"	/* <U0440> CYRILLIC SMALL LETTER ER */
+"\x0441"	"s"	/* <U0441> CYRILLIC SMALL LETTER ES */
+"\x0442"	"t"	/* <U0442> CYRILLIC SMALL LETTER TE */
+"\x0443"	"u"	/* <U0443> CYRILLIC SMALL LETTER U */
+"\x0444"	"f"	/* <U0444> CYRILLIC SMALL LETTER EF */
+"\x0445"	"x"	/* <U0445> CYRILLIC SMALL LETTER HA */
+"\x0446"	"cz"	/* <U0446> CYRILLIC SMALL LETTER TSE */
+"\x0447"	"ch"	/* <U0447> CYRILLIC SMALL LETTER CHE */
+"\x0448"	"sh"	/* <U0448> CYRILLIC SMALL LETTER SHA */
+"\x0449"	"shh"	/* <U0449> CYRILLIC SMALL LETTER SHCHA */
+"\x044a"	"``"	/* <U044A> CYRILLIC SMALL LETTER HARD SIGN */
+"\x044b"	"y`"	/* <U044B> CYRILLIC SMALL LETTER YERU */
+"\x044c"	"`"	/* <U044C> CYRILLIC SMALL LETTER SOFT SIGN */
+"\x044d"	"e`"	/* <U044D> CYRILLIC SMALL LETTER E */
+"\x044e"	"yu"	/* <U044E> CYRILLIC SMALL LETTER YU */
+"\x044f"	"ya"	/* <U044F> CYRILLIC SMALL LETTER YA */
+"\x0451"	"yo"	/* <U0451> CYRILLIC SMALL LETTER IO */
+"\x0452"	"dj"	/* <U0452> CYRILLIC SMALL LETTER DJE */
+"\x0453"	"g`"	/* <U0453> CYRILLIC SMALL LETTER GJE */
+"\x0454"	"ye"	/* <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE */
+"\x0455"	"z`"	/* <U0455> CYRILLIC SMALL LETTER DZE */
+"\x0456"	"i"	/* <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0457"	"yi"	/* <U0457> CYRILLIC SMALL LETTER YI */
+"\x0458"	"j"	/* <U0458> CYRILLIC SMALL LETTER JE */
+"\x0459"	"l`"	/* <U0459> CYRILLIC SMALL LETTER LJE */
+"\x045a"	"n`"	/* <U045A> CYRILLIC SMALL LETTER NJE */
+"\x045b"	"tsh"	/* <U045B> CYRILLIC SMALL LETTER TSHE */
+"\x045c"	"k`"	/* <U045C> CYRILLIC SMALL LETTER KJE */
+"\x045e"	"u`"	/* <U045E> CYRILLIC SMALL LETTER SHORT U */
+"\x045f"	"dh"	/* <U045F> CYRILLIC SMALL LETTER DZHE */
+"\x046a"	"O`"	/* <U046A> CYRILLIC CAPITAL LETTER BIG YUS */
+"\x046b"	"o`"	/* <U046B> CYRILLIC SMALL LETTER BIG YUS */
+"\x0472"	"FH"	/* <U0472> CYRILLIC CAPITAL LETTER FITA */
+"\x0473"	"fh"	/* <U0473> CYRILLIC SMALL LETTER FITA */
+"\x0474"	"YH"	/* <U0474> CYRILLIC CAPITAL LETTER IZHITSA */
+"\x0475"	"yh"	/* <U0475> CYRILLIC SMALL LETTER IZHITSA */
+"\x048c"	"E`"	/* <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN */
+"\x048d"	"e`"	/* <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN */
+"\x0490"	"G`"	/* <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
+"\x0491"	"g`"	/* <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN */
+"\x0492"	"GH"	/* <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE */
+"\x0493"	"gh"	/* <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE */
+"\x0494"	"GH"	/* <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK */
+"\x0495"	"gh"	/* <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK */
+"\x0496"	"ZH`"	/* <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER */
+"\x0497"	"zh`"	/* <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER */
+"\x049a"	"K`"	/* <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER */
+"\x049b"	"k`"	/* <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER */
+"\x049e"	"K`"	/* <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE */
+"\x049f"	"k`"	/* <U049F> CYRILLIC SMALL LETTER KA WITH STROKE */
+"\x04a2"	"N`"	/* <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER */
+"\x04a3"	"n`"	/* <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER */
+"\x04a4"	"NG"	/* <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE */
+"\x04a5"	"ng"	/* <U04A5> CYRILLIC SMALL LIGATURE EN GHE */
+"\x04a6"	"P`"	/* <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK */
+"\x04a7"	"p`"	/* <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK */
+"\x04a8"	"O`"	/* <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA */
+"\x04a9"	"o`"	/* <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA */
+"\x04aa"	"C`"	/* <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER */
+"\x04ab"	"c`"	/* <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER */
+"\x04ac"	"T`"	/* <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER */
+"\x04ad"	"t`"	/* <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER */
+"\x04ae"	"U"	/* <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U */
+"\x04af"	"u"	/* <U04AF> CYRILLIC SMALL LETTER STRAIGHT U */
+"\x04b2"	"H`"	/* <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER */
+"\x04b3"	"h`"	/* <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER */
+"\x04b4"	"TCZ"	/* <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE */
+"\x04b5"	"tcz"	/* <U04B5> CYRILLIC SMALL LIGATURE TE TSE */
+"\x04ba"	"SH`"	/* <U04BA> CYRILLIC CAPITAL LETTER SHHA */
+"\x04bb"	"sh`"	/* <U04BB> CYRILLIC SMALL LETTER SHHA */
+"\x04bc"	"CH`"	/* <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE */
+"\x04bd"	"ch`"	/* <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE */
+"\x04be"	"CH`"	/* <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04bf"	"ch`"	/* <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04c0"	"i"	/* <U04C0> CYRILLIC LETTER PALOCHKA */
+"\x04c1"	"ZH`"	/* <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE */
+"\x04c2"	"zh`"	/* <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE */
+"\x04cb"	"CH`"	/* <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE */
+"\x04cc"	"ch`"	/* <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE */
+"\x04d0"	"A`"	/* <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE */
+"\x04d1"	"a`"	/* <U04D1> CYRILLIC SMALL LETTER A WITH BREVE */
+"\x04d2"	"A`"	/* <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS */
+"\x04d3"	"a`"	/* <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS */
+"\x04d6"	"E`"	/* <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE */
+"\x04d7"	"e`"	/* <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE */
+"\x04d8"	"A`"	/* <U04D8> CYRILLIC CAPITAL LETTER SCHWA */
+"\x04d9"	"a`"	/* <U04D9> CYRILLIC SMALL LETTER SCHWA */
+"\x04dc"	"ZH`"	/* <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS */
+"\x04dd"	"zh`"	/* <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS */
+"\x04de"	"Z`"	/* <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS */
+"\x04df"	"z`"	/* <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS */
+"\x04e0"	"Z`"	/* <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE */
+"\x04e1"	"z`"	/* <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE */
+"\x04e4"	"I`"	/* <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS */
+"\x04e5"	"i`"	/* <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS */
+"\x04e6"	"O`"	/* <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS */
+"\x04e7"	"o`"	/* <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS */
+"\x04e8"	"O`"	/* <U04E8> CYRILLIC CAPITAL LETTER BARRED O */
+"\x04e9"	"o`"	/* <U04E9> CYRILLIC SMALL LETTER BARRED O */
+"\x04f0"	"U`"	/* <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS */
+"\x04f1"	"u`"	/* <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS */
+"\x04f2"	"U`"	/* <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE */
+"\x04f3"	"u`"	/* <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE */
+"\x04f4"	"CH`"	/* <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS */
+"\x04f5"	"ch`"	/* <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS */
+"\x04f8"	"Y`"	/* <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS */
+"\x04f9"	"y`"	/* <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS */
 "\x2002"	" "	/* <U2002> EN SPACE */
 "\x2003"	" "	/* <U2003> EM SPACE */
 "\x2004"	" "	/* <U2004> THREE-PER-EM SPACE */
-- 
2.17.1


* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-08  1:15               ` Rafal Luzynski
@ 2018-12-10 21:20                 ` Marko Myllynen
  2018-12-19 22:25                   ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-12-10 21:20 UTC (permalink / raw)
  To: Rafal Luzynski, Egor Kobylkin, libc-alpha, libc-locales
  Cc: Mike Fabian, Carlos O'Donell

Hi,

On 08/12/2018 03.15, Rafal Luzynski wrote:
> 17.11.2018 19:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> The SH/Sh can be decided on either way - seems like an easy change any
>> way.
> 
> I'm in favor of "Sh" because it will work fine for titlecased words
> (where only the first letter is uppercase) but I'm aware it would be
> a problem for uppercased words.  Unfortunately, I think we are unable
> to satisfy both cases.

I think I'm in favor of "Sh" as well, although not perfect I'd assume
it's probably going to be correct in more cases than SH.
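
To make the trade-off concrete, here is a tiny Python sketch (not glibc
code) that transliterates the same word in title case and in upper case
with both candidate mappings for 'Ш'.  The word and the helper are my
own illustration; the other letter mappings follow System B.

```python
def translit(text, table):
    """Per-character lookup; unmapped characters pass through unchanged."""
    return "".join(table.get(ch, ch) for ch in text)

BASE = {
    "А": "A", "П": "P", "К": "K",
    "а": "a", "п": "p", "к": "k", "ш": "sh",
}
UPPER = {**BASE, "Ш": "SH"}   # capital letter -> all-uppercase digraph
TITLE = {**BASE, "Ш": "Sh"}   # capital letter -> titlecase digraph

for word in ("Шапка", "ШАПКА"):
    print(word, translit(word, UPPER), translit(word, TITLE))
# Шапка SHapka Shapka
# ШАПКА SHAPKA ShAPKA
```

A context-free table has to pick one of the two, so either the titlecased
or the uppercased word comes out looking odd.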

>> System A was added on Marko's request (so setting him on TO:) I am
>> neutral on keeping it or dropping it, just to be clear.
> 
> I think I didn't see this Marko's request but I'm in favor of keeping
> System A, too.
> 
> Marko, it would be good to hear your opinion about System A vs. System B
> again.

I think System A is a better option as it should be the same as ISO 9
and perhaps also produces results in some cases which are more expected
than with System B (if the Wikipedia ISO 9 article is to be believed).

Wrt BZ #2872 I think it's good to keep it in mind but IMHO we can also
deviate from it if needed, however with System A + ASCII fallback
definitions the RFE should be satisfied as well?

> 19.11.2018 20:35 Marko Myllynen <myllynen@redhat.com> wrote:
>> [...]
>> In any case once your patch lands I'm going to submit a follow-up patch
>> for fi_FI to make it compliant with the applicable national standard
>> (SFS 4900) which defines how to do Cyrillic transliteration /
>> transcription in the context of Finnish.
> 
> I totally agree.  As far as I can see, SFS 4900 is more similar to
> System A (ISO 9) rather than System B, that is, it transliterates to Latin
> characters with diacritics rather than plain ASCII.  Marko, what is your
> opinion about possible implementation of SFS 4900 in these cases:
> 
> * When the destination charset does not contain required Latin diacritic
>   characters (e.g., it is plain ASCII)?

This would be according to http://jkorpela.fi/iso9.html8 so for example
instead of ž -> zh and instead of štš -> shtsh.

> * When the output is ambiguous, that means, when two different Cyrillic
>   strings produce the same Latin (or ASCII) output?

This is a good point and one I haven't considered, but I'm not sure
there is anything we can do about this (at least without major work on
the locale system internals). Do you have any rough idea how frequently
this could happen, or is this more of a theoretical issue? (Sorry if I've
missed earlier comments about this, it's been a long thread.)

>> The same with having both System A and System B.  Initially I went along
>> with the suggestion to include the system A but it is clear now that it
>> doesn’t make fixing [BZ #2872] more straightforward. So I’d also propose
>> to set it aside for the moment and use the v10 without the system A.
>> That is the whole reason I have submitted it, to be superclear on that.
> 
> OK, I think that now I understand your reason to drop System A better.
> But still I'd like to rethink implementing System A somehow and drop
> (or rather: implement only partially) System B.

Yes, I also think System A AKA ISO 9 would be a better choice but I'll
leave the final decision for you two (and others who might weigh in).

Thanks,

-- 
Marko Myllynen

* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-10 21:20                 ` Marko Myllynen
@ 2018-12-19 22:25                   ` Rafal Luzynski
  2018-12-19 22:48                     ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-19 22:25 UTC (permalink / raw)
  To: Marko Myllynen, Egor Kobylkin, libc-alpha, libc-locales

10.12.2018 22:20 Marko Myllynen <myllynen@redhat.com> wrote:
> 
> Hi,
> 
> On 08/12/2018 03.15, Rafal Luzynski wrote:
> > [...]
> > Marko, it would be good to hear your opinion about System A vs. System B
> > again.
> 
> I think System A is a better option as it should be the same as ISO 9
> and perhaps also produces results in some cases which are more expected
> than with System B (if the Wikipedia ISO 9 article is to be believed).
> 
> Wrt BZ #2872 I think it's good to keep it in mind but IMHO we can also
> deviate from it if needed, however with System A + ASCII fallback
> definitions the RFE should be satisfied as well?

That's exactly what I meant (sorry if it was not clear before).

> > [...]  Marko, what is your
> > opinion about possible implementation of SFS 4900 in these cases:
> > 
> > * When the destination charset does not contain required Latin diacritic
> >   characters (e.g., it is plain ASCII)?
> 
> This would be according to http://jkorpela.fi/iso9.html8 so for example
> instead of ž -> zh and instead of štš -> shtsh.

Agree.

> > * When the output is ambiguous, that means, when two different Cyrillic
> >   strings produce the same Latin (or ASCII) output?
> 
> This is a good point and one I haven't considered but I'm not sure is
> there anything we can do about this (at least without major locale
> system internals work)?

I agree with the suggestion that we can't do much about it.  I mean,
there are possibly solutions (like using more punctuation characters)
but they don't look natural to me.

> Do you have any rough idea how frequently this
> could happen or is this more a theoretical issue? (Sorry if I've missed
> earlier comments about this, it's been a long thread.)

Yes, Egor provided this example many times:

"схема" -> "shema" (if "с" -> "s" and "х" -> "h")
"шема"  -> "shema" (if "ш" -> "sh")
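
The collision can be reproduced with a few lines of Python (a sketch, not
glibc code; the helper and table names are made up).  It also shows that
mapping "х" to "x", as System B does, keeps the two words apart.

```python
def translit(text, table):
    """Per-character lookup; unmapped characters pass through unchanged."""
    return "".join(table.get(ch, ch) for ch in text)

COMMON = {"с": "s", "е": "e", "м": "m", "а": "a", "ш": "sh"}
WITH_H = {**COMMON, "х": "h"}   # the ambiguous choice from the example
WITH_X = {**COMMON, "х": "x"}   # the System B choice

for table in (WITH_H, WITH_X):
    print(translit("схема", table), translit("шема", table))
# shema shema
# sxema shema
```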

I don't think it matters how frequent these cases are.  I think the
question is whether ambiguity is a bug: if it is, then even one corner
case proves that the solution is wrong.

> [...]
> Yes, I also think System A AKA ISO 9 would be a better choice but I'll
> leave the final decision for you two (and others who might weigh in).

Egor is a native speaker so I respect his opinion even if I'm not fully
convinced for technical reasons.  Sadly, nobody else has offered an opinion
that could carry weight.  I am going to write a separate email about it.

Regards,

Rafal

* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-08 21:51       ` Egor Kobylkin
@ 2018-12-19 22:41         ` Rafal Luzynski
  2018-12-19 23:02           ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-19 22:41 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Dmitry V. Levin,
	Marko Myllynen, mfabian

8.12.2018 22:51 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> Rafal, Dmitry, Marko, Mike
> 
> On 08.12.18 00:35, Rafal Luzynski wrote:
> > 19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
> >> 
> >> Changelog v10: * Removed ISO 9.1995 GOST 7.79-2000 System A
> >> (transliteration to Latin with diacritics) as conflicting with
> >> System B within glibc mechanics and not solving BZ #2872
> > 
> > I'm in favor of implementing System A and dropping System B instead.
> 
> The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
> fails". The ISO 9 System A does not map to ASCII so it is not a solution
> to BZ #2872 at all.

I did not mean implementing System A and nothing more.  I meant implementing
System A plus a fallback for ASCII which can be similar to System B, but
we wouldn't be able to call it "System B" because it would differ in
a few cases.
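
A rough Python sketch of that idea (not glibc code): transliterate to the
System A letter first and use an ASCII replacement only when the target
charset cannot encode it.  The two tables are tiny hypothetical excerpts
and the fallback strings are placeholders, not a proposal.

```python
# Hypothetical excerpts: Cyrillic -> System A (Latin with diacritics),
# then System A letter -> ASCII fallback.
SYSTEM_A = {"Ж": "\u017d", "ж": "\u017e", "Ш": "\u0160", "ш": "\u0161"}
ASCII_FALLBACK = {
    "\u017d": "Zh", "\u017e": "zh", "\u0160": "Sh", "\u0161": "sh",
}

def translit(text, charset):
    """Prefer the System A letter; fall back if charset can't encode it."""
    out = []
    for ch in text:
        latin = SYSTEM_A.get(ch, ch)
        try:
            latin.encode(charset)
        except UnicodeEncodeError:
            latin = ASCII_FALLBACK.get(latin, "?")
        out.append(latin)
    return "".join(out)

print(translit("Жш", "iso-8859-2"))  # Žš
print(translit("Жш", "ascii"))       # Zhsh
```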

> I was scratching my head as to how can we avoid the explosion of the
> scope for this patch. And then it appeared to me that it was wrong to
> target all the present locales for the ASCII translit. This seems to be
> the root cause for this prolonged A vs. B discussions. The proper target
> for my table is actually the C locale translit file
> (locale/C-translit.h.in). I will submit a proper patch shortly.

I saw your patch v11 and now I must say I'm sorry for making noise, because
it was me who said that I didn't mind adding Cyrillic -> ASCII
transliteration to the C locale.  I said so before taking a look at the
current contents of the transliteration in the C locale.  When I looked at
it I realized that it does not support any national characters, not even
those from modified Latin alphabets (like those used in most western
European languages).  It only contains mathematical, physical, commercial,
diacritical etc. characters.  So I'm no longer sure it should support
Cyrillic -> ASCII.  But maybe I'm wrong again; maybe it should support it
and just nobody has implemented it yet.

> If anyone wants to keep working on the implementation of the Latin
> Diacritics transliteration of the Cyrillic letters (System A) you are
> welcome to use the tables I have submitted before (v9). That would be a
> new feature for glibc as per my understanding. Let's just make super
> clear the distinction of the System A (Latin with Diacritics, non-ASCII)
> to the ASCII translit as mentioned in BZ #2872 (System B).

I liked your v9 patch more.  I really appreciate your work and I'm not
going to ask you to provide more patches because I think that so far you
have provided all possible versions.  I hope that your work will not be
lost.

> My focus is super sharp on helping with Cyrillic -> ASCII translit
> availability for a default installation with glibc.

I understand your aim and I agree to support ASCII.  Our disagreements are:

* whether to support conversion Cyrillic -> extended Latin as well,
* which standard to implement,
* what to do if the standard is ambiguous or if some details cannot be
  implemented for technical reasons.

Regards,

Rafal

* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-19 22:25                   ` Rafal Luzynski
@ 2018-12-19 22:48                     ` Egor Kobylkin
  2018-12-19 23:50                       ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-19 22:48 UTC (permalink / raw)
  To: Marko Myllynen, libc-alpha, libc-locales

On 19.12.18 23:25, Rafal Luzynski wrote:
> 10.12.2018 22:20 Marko Myllynen <myllynen@redhat.com> wrote:
> 
>> [...]
>> Yes, I also think System A AKA ISO 9 would be a better choice but I'll
>> leave the final decision for you two (and others who might weigh in).
> 
> Egor is a native speaker so I respect his opinion even if I'm not fully
> convinced for technical reasons.  Sadly, nobody else provides any opinion
> which could weigh.  I am going to write a separate email about it.
> 
> Regards,
> 
> Rafal
> 
It's not about which letter should be used for a particular
transliteration. I couldn't care less about that, just to be clear.

Maybe I am missing something; could you tell me exactly how you want to
fit System A to ASCII?

Let's take the very first example from the table:
CyrillicUnicode  CyrillicLetter  CyrillicUnicodeName         LatinUnicode  System A Latin Letter  System B ASCII Letter
0401             Ё               CYRILLIC CAPITAL LETTER IO  00CB          Ë                      YO

so:
Cyrillic Ё U0401
System A - Ë U00CB -  _not_ ASCII
System B - YO (or Yo) "<U0059><U004F>" - ASCII

Could you explain how we can make the System A "Ë" be displayed or
processed somehow in a C locale? Or in a locale or program that doesn't
have "Ë" U00CB?
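
The same point as a quick Python check (illustration only; the helper
name fits is made up): the System A letter cannot be encoded as ASCII,
while the System B string can.

```python
def fits(text, charset):
    """Return True if every character of text is encodable in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

system_a = "\u00cb"   # LATIN CAPITAL LETTER E WITH DIAERESIS
system_b = "YO"

print(fits(system_a, "ascii"))    # False
print(fits(system_b, "ascii"))    # True
print(fits(system_a, "latin-1"))  # True
```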

Bests,
Egor


* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-19 22:41         ` Rafal Luzynski
@ 2018-12-19 23:02           ` Egor Kobylkin
  2018-12-20  0:05             ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-19 23:02 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Dmitry V. Levin, Marko Myllynen,
	mfabian

On 19.12.18 23:41, Rafal Luzynski wrote:
> 8.12.2018 22:51 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> Rafal, Dmitry, Marko, Mike
>>
>> On 08.12.18 00:35, Rafal Luzynski wrote:
>>> 19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> Changelog v10: * Removed ISO 9.1995 GOST 7.79-2000 System A
>>>> (transliteration to Latin with diacritics) as conflicting with
>>>> System B within glibc mechanics and not solving BZ #2872
>>>
>>> I'm in favor of implementing System A and dropping System B instead.
>>
>> The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
>> fails". The ISO 9 System A does not map to ASCII so it is not a solution
>> to BZ #2872 at all.
> 
> I did not mean implementing System A and nothing more.  I meant implementing
> System A and a fallback for ASCII which can be similar to System B but
> we wouldn't be able to call it "System B" because it would differ in
> few cases.
Just for the record, I have no objection to that (using A as a basis
for ASCII as well).

But I'm no longer sure that inserting a translit table into every
locale is the right solution for the ASCII problem, especially because
distributions may not include any locale but C.

> 
>> I was scratching my head as to how we can avoid the explosion of the
>> scope of this patch. And then it occurred to me that it was wrong to
>> target all the present locales for the ASCII translit. This seems to be
>> the root cause of these prolonged A vs. B discussions. The proper target
>> for my table is actually the C locale translit file
>> (locale/C-translit.h.in). I will submit a proper patch shortly.
> 
> I saw your patch v11 and now I must say I'm sorry for making noise because
> it was me who said that I didn't mind adding Cyrillic -> ASCII
> transliteration to the C locale.  I said so before taking a look at the
> current contents of transliteration in the C locale.  When I looked at it
> I realized that it does not support any national characters, even from
> modified Latin alphabets (like those used in most Western European
> languages).  It only contains mathematical, physical, commercial,
> diacritical etc. characters.  So I'm no longer sure it should support
> Cyrillic -> ASCII.  But maybe again I'm wrong, maybe it should support it
> and just nobody has implemented it yet.

Actually there are quite a few letters already transliterated in
locale/C-translit.h.in. (Note the CAPCAP transliteration style for the
capitals, i.e. LATIN CAPITAL LETTER AE is mapped to AE, not to Ae.)

"\x00c6"	"AE"	/* <U00C6> LATIN CAPITAL LETTER AE */
"\x00d7"	"x"	/* <U00D7> MULTIPLICATION SIGN */
"\x00df"	"ss"	/* <U00DF> LATIN SMALL LETTER SHARP S */
"\x00e6"	"ae"	/* <U00E6> LATIN SMALL LETTER AE */
"\x0132"	"IJ"	/* <U0132> LATIN CAPITAL LIGATURE IJ */
"\x0133"	"ij"	/* <U0133> LATIN SMALL LIGATURE IJ */
"\x0149"	"'n"	/* <U0149> LATIN SMALL LETTER N PRECEDED BY APOSTROPHE */
"\x0152"	"OE"	/* <U0152> LATIN CAPITAL LIGATURE OE */
"\x0153"	"oe"	/* <U0153> LATIN SMALL LIGATURE OE */
"\x017f"	"s"	/* <U017F> LATIN SMALL LETTER LONG S */
"\x01c7"	"LJ"	/* <U01C7> LATIN CAPITAL LETTER LJ */
"\x01c8"	"Lj"	/* <U01C8> LATIN CAPITAL LETTER L WITH SMALL LETTER J */
"\x01c9"	"lj"	/* <U01C9> LATIN SMALL LETTER LJ */
"\x01ca"	"NJ"	/* <U01CA> LATIN CAPITAL LETTER NJ */
"\x01cb"	"Nj"	/* <U01CB> LATIN CAPITAL LETTER N WITH SMALL LETTER J */
"\x01cc"	"nj"	/* <U01CC> LATIN SMALL LETTER NJ */
"\x01f1"	"DZ"	/* <U01F1> LATIN CAPITAL LETTER DZ */
"\x01f2"	"Dz"	/* <U01F2> LATIN CAPITAL LETTER D WITH SMALL LETTER Z */
"\x01f3"	"dz"	/* <U01F3> LATIN SMALL LETTER DZ */


>> My focus is super sharp on helping with Cyrillic -> ASCII translit
>> availability for a default installation with glibc.
> 
> I understand your aim and I agree to support ASCII.  Our disagreements are:
> 
> * whether to support conversion Cyrillic -> extended Latin as well,
no contest on my side
> * which standard to implement,
no contest on my side
> * what to do if the standard is ambiguous or if some details cannot be
>   implemented for technical reasons.
no contest on my side either

I just think we can sidestep all those decisions with a smaller, pure
ASCII patch first (which is also more useful if it covers the C locale).


* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2018-12-08 22:28   ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
@ 2018-12-19 23:16     ` Egor Kobylkin
  2018-12-26 10:07       ` Siddhesh Poyarekar
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-19 23:16 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Marko Myllynen, Carlos O'Donell

[-- Attachment #1: Type: text/plain, Size: 6415 bytes --]

Freeze ping.

I'd like to ping the list on this patch and to have some discussion on
moving ASCII transliteration to locale/C-translit.h.in before the freeze.

The wiki page for 2.29 [12] is set as "immutable" for newly registered
users; I am not sure whether that is intended. I could not add this patch
there as "desired".
I have added the 2.29 keyword to the bug entry.

Bests,
Egor Kobylkin


[12] https://sourceware.org/glibc/wiki/Release/2.29

On 08.12.18 23:28, Egor Kobylkin wrote:
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
> 
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872
> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics
> 
> Changelog v9:
> * Fixed formatting (trailing spaces etc.)
> * Put commit summary in the patch file, now it is generated completely
> by git format-patch
> 
> Changelog v8:
> * Re-added missing translit_cyrillic in patch v7 (due to missing "git
> add" in the script).
> 
> Changelog v7:
> * Generated against git://sourceware.org/git/glibc.git master with git
> format-patch.
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)
> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)
> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> “copy "tr_TR"”.
> 
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
> * Consistently transliterate single uppercase Cyrillic letters
>   to sequences of all uppercase Latin letters in all languages (whenever
>   a Cyrillic letter is transliterated to more than one Latin letter),
>   for example "Ї" is now transliterated as "YI" rather than "Yi".
> 
> Dear locale maintainers,
> 
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> 
> add the Cyrillic transliteration rows to locale/C-translit.h.in.
> 
> The patch is attached.
> 
> 
> Current bug effect:
> 
> The glibc wiki explicitly lists this use case as the test example and
> currently it fails on Cyrillic texts [1] [8] [9]:
> 
> iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
> 
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> 
> - it produces a string of question marks and spaces.
> 
> This is what it should produce and it does so after the patch is applied:
> 
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> chayu.
> 
> 
> The root problem and the fix:
> 
> The root problem is the missing transliteration table that I am
> supplying here.
> 
> 
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) of
> UTF-8 encoded text in a Cyrillic alphabet to ASCII//TRANSLIT text.
> 
> Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
> compatible transcription.
> 
> While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
> a transliteration/transcription has only Latin/ASCII codes but still can
> be read by a native speaker. Among other things it is useful for
> processing the Cyrillic texts and filenames by programs or on systems
> that are not specifically prepared to work with Cyrillic, don't have
> corresponding fonts installed or can't handle UTF-8.
> 
> The patch content (mapping) is based on ISO 9.1995 standard [10] and its
> derivative GOST 7.79-2000 System B official source (Federal Agency on
> Technical Regulating and Metrology Of Russian Federation [2]).
> Technically an independent but mostly identical source [3] was used and
> prepared in a spreadsheet [6].
> 
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B represents what is actually called transcription (preserving
> phonemes), while System A is the transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes converting Cyrillic to
> ASCII and thus System B is chosen [11]. To be clear, System A has
> nothing to do with this bug, even though it is a transliteration.
> 
> Those interested in implementing System A for transliteration of
> Cyrillic to Latin with diacritics as a new feature are welcome to use the
> spreadsheet in [6] as a starting point.
> 
> Links:
> 
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=11304
> [10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
> [11]
> https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
> 
> Best regards,
> Egor Kobylkin
> 
> 


[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 11581 bytes --]

From b9cd550028ecf7c875c9d7250c8598433b1fc474 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Sat, 8 Dec 2018 22:08:59 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 170 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index e27f39e8fe..bd64edc609 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -2,6 +2,7 @@
    Copyright (C) 2000-2018 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper <drepper@redhat.com>, 2000.
+   0401-04f9 contributed by Egor Kobylkin <Egor@Kobylkin.com>, 2018.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -56,6 +57,175 @@
 "\x02cd"	"_"	/* <U02CD> MODIFIER LETTER LOW MACRON */
 "\x02d0"	":"	/* <U02D0> MODIFIER LETTER TRIANGULAR COLON */
 "\x02dc"	"~"	/* <U02DC> SMALL TILDE */
+"\x0401"	"YO"	/* <U0401> CYRILLIC CAPITAL LETTER IO */
+"\x0402"	"DJ"	/* <U0402> CYRILLIC CAPITAL LETTER DJE */
+"\x0403"	"G`"	/* <U0403> CYRILLIC CAPITAL LETTER GJE */
+"\x0404"	"YE"	/* <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE */
+"\x0405"	"Z`"	/* <U0405> CYRILLIC CAPITAL LETTER DZE */
+"\x0406"	"I"	/* <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0407"	"YI"	/* <U0407> CYRILLIC CAPITAL LETTER YI */
+"\x0408"	"J"	/* <U0408> CYRILLIC CAPITAL LETTER JE */
+"\x0409"	"L`"	/* <U0409> CYRILLIC CAPITAL LETTER LJE */
+"\x040a"	"N`"	/* <U040A> CYRILLIC CAPITAL LETTER NJE */
+"\x040b"	"TSH"	/* <U040B> CYRILLIC CAPITAL LETTER TSHE */
+"\x040c"	"K`"	/* <U040C> CYRILLIC CAPITAL LETTER KJE */
+"\x040e"	"U`"	/* <U040E> CYRILLIC CAPITAL LETTER SHORT U */
+"\x040f"	"DH"	/* <U040F> CYRILLIC CAPITAL LETTER DZHE */
+"\x0410"	"A"	/* <U0410> CYRILLIC CAPITAL LETTER A */
+"\x0411"	"B"	/* <U0411> CYRILLIC CAPITAL LETTER BE */
+"\x0412"	"V"	/* <U0412> CYRILLIC CAPITAL LETTER VE */
+"\x0413"	"G"	/* <U0413> CYRILLIC CAPITAL LETTER GHE */
+"\x0414"	"D"	/* <U0414> CYRILLIC CAPITAL LETTER DE */
+"\x0415"	"E"	/* <U0415> CYRILLIC CAPITAL LETTER IE */
+"\x0416"	"ZH"	/* <U0416> CYRILLIC CAPITAL LETTER ZHE */
+"\x0417"	"Z"	/* <U0417> CYRILLIC CAPITAL LETTER ZE */
+"\x0418"	"I"	/* <U0418> CYRILLIC CAPITAL LETTER I */
+"\x0419"	"J"	/* <U0419> CYRILLIC CAPITAL LETTER SHORT I */
+"\x041a"	"K"	/* <U041A> CYRILLIC CAPITAL LETTER KA */
+"\x041b"	"L"	/* <U041B> CYRILLIC CAPITAL LETTER EL */
+"\x041c"	"M"	/* <U041C> CYRILLIC CAPITAL LETTER EM */
+"\x041d"	"N"	/* <U041D> CYRILLIC CAPITAL LETTER EN */
+"\x041e"	"O"	/* <U041E> CYRILLIC CAPITAL LETTER O */
+"\x041f"	"P"	/* <U041F> CYRILLIC CAPITAL LETTER PE */
+"\x0420"	"R"	/* <U0420> CYRILLIC CAPITAL LETTER ER */
+"\x0421"	"S"	/* <U0421> CYRILLIC CAPITAL LETTER ES */
+"\x0422"	"T"	/* <U0422> CYRILLIC CAPITAL LETTER TE */
+"\x0423"	"U"	/* <U0423> CYRILLIC CAPITAL LETTER U */
+"\x0424"	"F"	/* <U0424> CYRILLIC CAPITAL LETTER EF */
+"\x0425"	"X"	/* <U0425> CYRILLIC CAPITAL LETTER HA */
+"\x0426"	"CZ"	/* <U0426> CYRILLIC CAPITAL LETTER TSE */
+"\x0427"	"CH"	/* <U0427> CYRILLIC CAPITAL LETTER CHE */
+"\x0428"	"SH"	/* <U0428> CYRILLIC CAPITAL LETTER SHA */
+"\x0429"	"SHH"	/* <U0429> CYRILLIC CAPITAL LETTER SHCHA */
+"\x042a"	"``"	/* <U042A> CYRILLIC CAPITAL LETTER HARD SIGN */
+"\x042b"	"Y`"	/* <U042B> CYRILLIC CAPITAL LETTER YERU */
+"\x042c"	"`"	/* <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN */
+"\x042d"	"E`"	/* <U042D> CYRILLIC CAPITAL LETTER E */
+"\x042e"	"YU"	/* <U042E> CYRILLIC CAPITAL LETTER YU */
+"\x042f"	"YA"	/* <U042F> CYRILLIC CAPITAL LETTER YA */
+"\x0430"	"a"	/* <U0430> CYRILLIC SMALL LETTER A */
+"\x0431"	"b"	/* <U0431> CYRILLIC SMALL LETTER BE */
+"\x0432"	"v"	/* <U0432> CYRILLIC SMALL LETTER VE */
+"\x0433"	"g"	/* <U0433> CYRILLIC SMALL LETTER GHE */
+"\x0434"	"d"	/* <U0434> CYRILLIC SMALL LETTER DE */
+"\x0435"	"e"	/* <U0435> CYRILLIC SMALL LETTER IE */
+"\x0436"	"zh"	/* <U0436> CYRILLIC SMALL LETTER ZHE */
+"\x0437"	"z"	/* <U0437> CYRILLIC SMALL LETTER ZE */
+"\x0438"	"i"	/* <U0438> CYRILLIC SMALL LETTER I */
+"\x0439"	"j"	/* <U0439> CYRILLIC SMALL LETTER SHORT I */
+"\x043a"	"k"	/* <U043A> CYRILLIC SMALL LETTER KA */
+"\x043b"	"l"	/* <U043B> CYRILLIC SMALL LETTER EL */
+"\x043c"	"m"	/* <U043C> CYRILLIC SMALL LETTER EM */
+"\x043d"	"n"	/* <U043D> CYRILLIC SMALL LETTER EN */
+"\x043e"	"o"	/* <U043E> CYRILLIC SMALL LETTER O */
+"\x043f"	"p"	/* <U043F> CYRILLIC SMALL LETTER PE */
+"\x0440"	"r"	/* <U0440> CYRILLIC SMALL LETTER ER */
+"\x0441"	"s"	/* <U0441> CYRILLIC SMALL LETTER ES */
+"\x0442"	"t"	/* <U0442> CYRILLIC SMALL LETTER TE */
+"\x0443"	"u"	/* <U0443> CYRILLIC SMALL LETTER U */
+"\x0444"	"f"	/* <U0444> CYRILLIC SMALL LETTER EF */
+"\x0445"	"x"	/* <U0445> CYRILLIC SMALL LETTER HA */
+"\x0446"	"cz"	/* <U0446> CYRILLIC SMALL LETTER TSE */
+"\x0447"	"ch"	/* <U0447> CYRILLIC SMALL LETTER CHE */
+"\x0448"	"sh"	/* <U0448> CYRILLIC SMALL LETTER SHA */
+"\x0449"	"shh"	/* <U0449> CYRILLIC SMALL LETTER SHCHA */
+"\x044a"	"``"	/* <U044A> CYRILLIC SMALL LETTER HARD SIGN */
+"\x044b"	"y`"	/* <U044B> CYRILLIC SMALL LETTER YERU */
+"\x044c"	"`"	/* <U044C> CYRILLIC SMALL LETTER SOFT SIGN */
+"\x044d"	"e`"	/* <U044D> CYRILLIC SMALL LETTER E */
+"\x044e"	"yu"	/* <U044E> CYRILLIC SMALL LETTER YU */
+"\x044f"	"ya"	/* <U044F> CYRILLIC SMALL LETTER YA */
+"\x0451"	"yo"	/* <U0451> CYRILLIC SMALL LETTER IO */
+"\x0452"	"dj"	/* <U0452> CYRILLIC SMALL LETTER DJE */
+"\x0453"	"g`"	/* <U0453> CYRILLIC SMALL LETTER GJE */
+"\x0454"	"ye"	/* <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE */
+"\x0455"	"z`"	/* <U0455> CYRILLIC SMALL LETTER DZE */
+"\x0456"	"i"	/* <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0457"	"yi"	/* <U0457> CYRILLIC SMALL LETTER YI */
+"\x0458"	"j"	/* <U0458> CYRILLIC SMALL LETTER JE */
+"\x0459"	"l`"	/* <U0459> CYRILLIC SMALL LETTER LJE */
+"\x045a"	"n`"	/* <U045A> CYRILLIC SMALL LETTER NJE */
+"\x045b"	"tsh"	/* <U045B> CYRILLIC SMALL LETTER TSHE */
+"\x045c"	"k`"	/* <U045C> CYRILLIC SMALL LETTER KJE */
+"\x045e"	"u`"	/* <U045E> CYRILLIC SMALL LETTER SHORT U */
+"\x045f"	"dh"	/* <U045F> CYRILLIC SMALL LETTER DZHE */
+"\x046a"	"O`"	/* <U046A> CYRILLIC CAPITAL LETTER BIG YUS */
+"\x046b"	"o`"	/* <U046B> CYRILLIC SMALL LETTER BIG YUS */
+"\x0472"	"FH"	/* <U0472> CYRILLIC CAPITAL LETTER FITA */
+"\x0473"	"fh"	/* <U0473> CYRILLIC SMALL LETTER FITA */
+"\x0474"	"YH"	/* <U0474> CYRILLIC CAPITAL LETTER IZHITSA */
+"\x0475"	"yh"	/* <U0475> CYRILLIC SMALL LETTER IZHITSA */
+"\x048c"	"E`"	/* <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN */
+"\x048d"	"e`"	/* <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN */
+"\x0490"	"G`"	/* <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
+"\x0491"	"g`"	/* <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN */
+"\x0492"	"GH"	/* <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE */
+"\x0493"	"gh"	/* <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE */
+"\x0494"	"GH"	/* <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK */
+"\x0495"	"gh"	/* <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK */
+"\x0496"	"ZH`"	/* <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER */
+"\x0497"	"zh`"	/* <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER */
+"\x049a"	"K`"	/* <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER */
+"\x049b"	"k`"	/* <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER */
+"\x049e"	"K`"	/* <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE */
+"\x049f"	"k`"	/* <U049F> CYRILLIC SMALL LETTER KA WITH STROKE */
+"\x04a2"	"N`"	/* <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER */
+"\x04a3"	"n`"	/* <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER */
+"\x04a4"	"NG"	/* <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE */
+"\x04a5"	"ng"	/* <U04A5> CYRILLIC SMALL LIGATURE EN GHE */
+"\x04a6"	"P`"	/* <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK */
+"\x04a7"	"p`"	/* <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK */
+"\x04a8"	"O`"	/* <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA */
+"\x04a9"	"o`"	/* <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA */
+"\x04aa"	"C`"	/* <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER */
+"\x04ab"	"c`"	/* <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER */
+"\x04ac"	"T`"	/* <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER */
+"\x04ad"	"t`"	/* <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER */
+"\x04ae"	"U"	/* <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U */
+"\x04af"	"u"	/* <U04AF> CYRILLIC SMALL LETTER STRAIGHT U */
+"\x04b2"	"H`"	/* <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER */
+"\x04b3"	"h`"	/* <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER */
+"\x04b4"	"TCZ"	/* <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE */
+"\x04b5"	"tcz"	/* <U04B5> CYRILLIC SMALL LIGATURE TE TSE */
+"\x04ba"	"SH`"	/* <U04BA> CYRILLIC CAPITAL LETTER SHHA */
+"\x04bb"	"sh`"	/* <U04BB> CYRILLIC SMALL LETTER SHHA */
+"\x04bc"	"CH`"	/* <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE */
+"\x04bd"	"ch`"	/* <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE */
+"\x04be"	"CH`"	/* <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04bf"	"ch`"	/* <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04c0"	"i"	/* <U04C0> CYRILLIC LETTER PALOCHKA */
+"\x04c1"	"ZH`"	/* <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE */
+"\x04c2"	"zh`"	/* <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE */
+"\x04cb"	"CH`"	/* <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE */
+"\x04cc"	"ch`"	/* <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE */
+"\x04d0"	"A`"	/* <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE */
+"\x04d1"	"a`"	/* <U04D1> CYRILLIC SMALL LETTER A WITH BREVE */
+"\x04d2"	"A`"	/* <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS */
+"\x04d3"	"a`"	/* <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS */
+"\x04d6"	"E`"	/* <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE */
+"\x04d7"	"e`"	/* <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE */
+"\x04d8"	"A`"	/* <U04D8> CYRILLIC CAPITAL LETTER SCHWA */
+"\x04d9"	"a`"	/* <U04D9> CYRILLIC SMALL LETTER SCHWA */
+"\x04dc"	"ZH`"	/* <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS */
+"\x04dd"	"zh`"	/* <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS */
+"\x04de"	"Z`"	/* <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS */
+"\x04df"	"z`"	/* <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS */
+"\x04e0"	"Z`"	/* <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE */
+"\x04e1"	"z`"	/* <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE */
+"\x04e4"	"I`"	/* <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS */
+"\x04e5"	"i`"	/* <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS */
+"\x04e6"	"O`"	/* <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS */
+"\x04e7"	"o`"	/* <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS */
+"\x04e8"	"O`"	/* <U04E8> CYRILLIC CAPITAL LETTER BARRED O */
+"\x04e9"	"o`"	/* <U04E9> CYRILLIC SMALL LETTER BARRED O */
+"\x04f0"	"U`"	/* <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS */
+"\x04f1"	"u`"	/* <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS */
+"\x04f2"	"U`"	/* <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE */
+"\x04f3"	"u`"	/* <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE */
+"\x04f4"	"CH`"	/* <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS */
+"\x04f5"	"ch`"	/* <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS */
+"\x04f8"	"Y`"	/* <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS */
+"\x04f9"	"y`"	/* <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS */
 "\x2002"	" "	/* <U2002> EN SPACE */
 "\x2003"	" "	/* <U2003> EM SPACE */
 "\x2004"	" "	/* <U2004> THREE-PER-EM SPACE */
-- 
2.17.1
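
For readers who want to see the effect of such rows without rebuilding
glibc, the following Python sketch parses a handful of rows in the
"\xNNNN" "replacement" format used by the patch and applies them to the
first two words of the test sentence. It is only an illustration: the
names ROWS, parse_rows and translit are made up for this example, it is not
how glibc processes C-translit.h.in, and the '?' fallback merely mimics the
question marks shown in the bug description.

```python
import re

# A few rows copied from the patch above (tabs replaced by spaces for display).
ROWS = r'''
"\x0421"  "S"    /* <U0421> CYRILLIC CAPITAL LETTER ES */
"\x0435"  "e"    /* <U0435> CYRILLIC SMALL LETTER IE */
"\x0448"  "sh"   /* <U0448> CYRILLIC SMALL LETTER SHA */
"\x0449"  "shh"  /* <U0449> CYRILLIC SMALL LETTER SHCHA */
"\x044a"  "``"   /* <U044A> CYRILLIC SMALL LETTER HARD SIGN */
"\x044c"  "`"    /* <U044C> CYRILLIC SMALL LETTER SOFT SIGN */
"\x0451"  "yo"   /* <U0451> CYRILLIC SMALL LETTER IO */
'''

# "\xNNNN" <whitespace> "replacement"; the trailing comment is ignored.
ROW = re.compile(r'^"\\x([0-9a-fA-F]{4,6})"\s+"([^"]*)"')


def parse_rows(text):
    """Build {source character: ASCII replacement} from rows in the format above."""
    table = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            table[chr(int(match.group(1), 16))] = match.group(2)
    return table


def translit(text, table):
    """Replace mapped characters, keep ASCII as is, and emit '?' for the rest."""
    return "".join(
        table[ch] if ch in table else ch if ord(ch) < 128 else "?"
        for ch in text
    )


TABLE = parse_rows(ROWS)
# First two words of the Cyrillic test sentence, written as escapes.
WORDS = "\u0421\u044a\u0435\u0448\u044c \u0435\u0449\u0451"

print(translit(WORDS, {}))     # without any rows
print(translit(WORDS, TABLE))  # with the rows above
```

With an empty table every Cyrillic letter becomes a question mark; with
the rows above the two words come out in the System B spelling.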



* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-19 22:48                     ` Egor Kobylkin
@ 2018-12-19 23:50                       ` Rafal Luzynski
  0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-19 23:50 UTC (permalink / raw)
  To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales

19.12.2018 23:48 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Maybe I am missing something: could you tell me how exactly you want to
> fit System A into ASCII?
> 
> Let's take the very first example from the table:
> CyrillicUnicode	CyrillicLetter	CyrillicUnicodeName	LatinUnicode	System A Latin Letter	System B ASCII Letter
> 0401	Ё	CYRILLIC CAPITAL LETTER IO	00CB	Ë	YO
> 
> so:
> Cyrillic Ё U0401
> System A - Ë U00CB -  _not_ ASCII
> System B - YO (or Yo) "<U0059><U004F>" - ASCII
> 
> Could you explain how we can make the System A "Ë" be displayed or
> processed in a C locale? Or in a locale or program that doesn't
> have "Ë" U00CB?

It should be "YO" (or "Yo").  Exactly as you provided in your previous
patches.

I am afraid that my description "Cyrillic -> Latin -> ASCII" was too
ambiguous; I am sorry about that.  It is actually a preference list which
says: convert Cyrillic "Ё" into Latin "Ë" if possible, otherwise into "YO"
("Yo").  We may stop using the "Cyrillic -> Latin -> ASCII" picture as too
ambiguous and invent a better one.
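
The "if possible, otherwise" rule can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
CANDIDATES table and the translit_char helper are hypothetical, and "is
possible" is approximated by "can be encoded in the target character set".
It does not claim to reproduce the glibc implementation.

```python
# Hypothetical one-entry table: replacements are listed in order of preference.
CANDIDATES = {"\u0401": ["\u00cb", "YO"]}  # Cyrillic IO -> E with diaeresis, else "YO"


def translit_char(ch, encoding):
    """Return ch if it is encodable, else the first encodable candidate, else '?'."""
    for candidate in [ch] + CANDIDATES.get(ch, []):
        try:
            candidate.encode(encoding)
        except UnicodeEncodeError:
            continue  # not representable in the target charset, try the next one
        return candidate
    return "?"


for target in ("utf-8", "latin-1", "ascii"):
    print(target, ascii(translit_char("\u0401", target)))
```

With a Latin-1 target the first candidate is chosen; with an ASCII target
the function falls through to "YO", which is the behaviour described above.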

Regards,

Rafal


* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
  2018-12-19 23:02           ` Egor Kobylkin
@ 2018-12-20  0:05             ` Rafal Luzynski
  0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-20  0:05 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Dmitry V. Levin,
	Marko Myllynen, mfabian

20.12.2018 00:02 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> But I'm no longer sure that inserting a translit table into every
> locale is the right solution for the ASCII problem, especially because
> distributions may not include any locale but C.

My question (and my doubt) is whether they want to support Cyrillic
transliteration in that case.  If yes, then maybe they want more
transliterations as well.  I'm not saying we will include them now;
I just wonder why they have not been included in C yet.

> [...]
> Actually there are quite a few letters already transliterated in
> locale/C-translit.h.in.

Sure, my list was not complete and I did not mean there are no Latin
characters supported.  But there is nothing from the long list of
á, à, ä, ã, ǎ, å, ā, ą, ạ, ȧ, ć, ĉ, ç, é, è, ë, ...

> (Note the CAPCAP transliteration style for the
> capitals, i.e. LATIN CAPITAL LETTER AE is mapped to AE, not to Ae.)

Sure, because they are ligatures: "A" + "E", not "A" + "e".  Note that
where three variants of ligatures exist, like "LJ", "Lj", "lj" then
all three are supported.
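
As a side illustration (using Python's unicodedata module, which is an
assumption of this example and unrelated to how the glibc table was
written), the three case variants are separate code points with their own
compatibility decompositions, while LATIN CAPITAL LETTER AE has no
decomposition at all and therefore always needs an explicit table row:

```python
import unicodedata


def compat(ch):
    """Return the Unicode compatibility decomposition (NFKD) of ch."""
    return unicodedata.normalize("NFKD", ch)


# U+01C7..U+01C9 are the LJ / Lj / lj variants; U+00C6 is the AE letter.
for ch in ("\u01c7", "\u01c8", "\u01c9", "\u00c6"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", ascii(compat(ch)))
```

The first three lines show "LJ", "Lj" and "lj"; the last one returns the
character unchanged.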

> [ cut the list ]
> 
> > [...]
> > I understand your aim and I agree to support ASCII.  Our disagreements
> > are:
> > 
> > * whether to support conversion Cyrillic -> extended Latin as well,
> no contest on my side
> > * which standard to implement,
> no contest on my side
> > * what to do if the standard is ambiguous or if some details cannot be
> >   implemented for technical reasons.
> no contest on my side either

Good, three steps forward.

Regards,

Rafal


* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2018-12-19 23:16     ` Egor Kobylkin
@ 2018-12-26 10:07       ` Siddhesh Poyarekar
  2018-12-26 12:13         ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Siddhesh Poyarekar @ 2018-12-26 10:07 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Marko Myllynen,
	Carlos O'Donell, digitalfreak

On 20/12/18 4:46 AM, Egor Kobylkin wrote:
> Freeze ping.
> 
> I'd like to ping the list on this patch and to have some discussion on
> moving ASCII transliteration to locale/C-translit.h.in before the freeze.
> 
> The wiki page for 2.29 [12] is set as "immutable" for newly registered
> users, not sure it is so desired. I could not add this patch there as
> "desired".
> I have added 2.29 keyword to the bug entry.
> 
> Bests,
> Egor Kobylkin
> 
> 
> [12] https://sourceware.org/glibc/wiki/Release/2.29

cc'd Rafal since I am not equipped to review this.  Only nit I can point 
out is that you need to remove the "Contributed by" line that you added; 
we don't do that any more.  You can remove the earlier contributed by 
line too since it's no longer part of our process.

Also, if you'd like edit access to the wiki then please tell me your 
username (assuming you've created an account on the wiki, please do if 
you haven't) and I'll add you to the editor group.  It's a measure we 
added to counter the high amounts of spam we faced on the wiki.

Thanks,
Siddhesh


* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2018-12-26 10:07       ` Siddhesh Poyarekar
@ 2018-12-26 12:13         ` Egor Kobylkin
  2018-12-27  1:30           ` Siddhesh Poyarekar
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-26 12:13 UTC (permalink / raw)
  To: Siddhesh Poyarekar, libc-alpha, libc-locales, Marko Myllynen,
	Carlos O'Donell, digitalfreak

On 26.12.18 11:07, Siddhesh Poyarekar wrote:
> On 20/12/18 4:46 AM, Egor Kobylkin wrote:
>> Freeze ping.
>>
>> I'd like to ping the list on this patch and to have some discussion on
>> moving ASCII transliteration to locale/C-translit.h.in before the freeze.
>>
>> The wiki page for 2.29 [12] is set as "immutable" for newly registered
>> users, not sure it is so desired. I could not add this patch there as
>> "desired".
>> I have added 2.29 keyword to the bug entry.
>>
>> Bests,
>> Egor Kobylkin
>>
>>
>> [12] https://sourceware.org/glibc/wiki/Release/2.29
> 
> cc'd Rafal since I am not equipped to review this.  Only nit I can point
> out is that you need to remove the "Contributed by" line that you added;
> we don't do that any more.  You can remove the earlier contributed by
> line too since it's no longer part of our process.
> 
> Also, if you'd like edit access to the wiki then please tell me your
> username (assuming you've created an account on the wiki, please do if
> you haven't) and I'll add you to the editor group.  It's a measure we
> added to counter the high amounts of spam we faced on the wiki.
> 
> Thanks,
> Siddhesh

Thanks, Siddhesh, yes, please could you add my username EgorKobylkin to
the editors group.

Rafal has requested help and guidance about this patch in another email
to this list [1]. I hope other members would chime in on that in time
for 2.29. I understand we need input from those involved in C locale
that is compiled into the libc binaries (as opposed to the rest of
locales that are shipped in plain text, not compiled).

@Rafal - I know you have asked to drop your email from To: as you are
getting them through your list subscription and so twice. But I guess
To: is still helpful to see who is involved. I am not subscribed to the
list myself, so I would like my email to be kept on To: or CC: for this.

Bests,
Egor

[1] https://sourceware.org/ml/libc-alpha/2018-12/msg00787.html


* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2018-12-26 12:13         ` Egor Kobylkin
@ 2018-12-27  1:30           ` Siddhesh Poyarekar
  2018-12-27 11:28             ` Rafal Luzynski
  0 siblings, 1 reply; 111+ messages in thread
From: Siddhesh Poyarekar @ 2018-12-27  1:30 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Marko Myllynen,
	Carlos O'Donell, digitalfreak

On 26/12/18 5:43 PM, Egor Kobylkin wrote:
> Thanks, Siddhesh, yes, please could you add my username EgorKobylkin to
> the editors group.

Done.  Here's a weird statistic: you're the first user on that wiki with 
name starting with E!

> Rafal has requested help and guidance about this patch in another email
> to this list [1]. I hope other members would chime in on that in time
> for 2.29. I understand we need input from those involved in C locale
> that is compiled into the libc binaries (as opposed to the rest of
> locales that are shipped in plain text, not compiled).

Ah OK, I missed that email.  It'll have to wait for more inputs though 
because like I said, I don't have enough experience in locales to make 
an intelligent comment, definitely not for Cyrillic.

> @Rafal - I know you have asked to drop your email from To: as you are
> getting them through your list subscription and so twice. But I guess
> To: is still helpful to see who is involved. I am not subscribed to the
> list myself, so I would like my email to be kept on To: or CC: for this.

I added @Rafal because it's kinda standard practice to do that to get an 
individual's attention since otherwise an email could get lost in the 
traffic.  @Rafal, I'll remove it if you object.

Siddhesh

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2018-12-27  1:30           ` Siddhesh Poyarekar
@ 2018-12-27 11:28             ` Rafal Luzynski
  0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-27 11:28 UTC (permalink / raw)
  To: Siddhesh Poyarekar, Egor Kobylkin, libc-alpha, libc-locales,
	Marko Myllynen, Carlos O'Donell

27.12.2018 02:30 Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 26/12/18 5:43 PM, Egor Kobylkin wrote:
> [...]
> > Rafal has requested help and guidance about this patch in another email
> > to this list [1]. I hope other members would chime in on that in time
> > for 2.29. I understand we need input from those involved in C locale
> > that is compiled into the libc binaries (as opposed to the rest of
> > locales that are shipped in plain text, not compiled).
> 
> Ah OK, I missed that email.  It'll have to wait for more inputs though 
> because like I said, I don't have enough experience in locales to make 
> an intelligent comment, definitely not for Cyrillic.

My email is here:

https://sourceware.org/ml/libc-alpha/2018-12/msg00787.html

My questions are not specific to Cyrillic; they are about how
transliteration should be implemented in general.  You may replace
"Cyrillic" with any other script you know and ask yourself "how would I
implement transliteration from Foo Alphabet to ASCII".

I think that so far there was no transliteration common for all
locales except translit_combine which just removes the combining
diacritic characters.

Can we have any live meeting, like on IRC?  I think that we could have
more questions answered in direct conversation.  By email we can have
little more than one question and answer per day.

> > @Rafal - I know you have asked to drop your email from To: as you are
> > getting them through your list subscription and so twice. But I guess
> > To: is still helpful to see who is involved. I am not subscribed to the
> > list myself, so I would like my email to be kept on To: or CC: for this.
> 
> I added @Rafal because it's kinda standard practice to do that to get an 
> individual's attention since otherwise an email could get lost in the 
> traffic.  @Rafal, I'll remove it if you object.

I don't object here.  Previously I was complaining about large patches
which arrive in two copies and tend to exceed my email quota.  Regular
conversation does not cause much problem for me.

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (11 preceding siblings ...)
  2018-12-08 22:28   ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
@ 2019-01-02 18:38   ` Egor Kobylkin
  2019-01-05 14:35     ` Rafal Luzynski
  2019-04-09  1:04     ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] Carlos O'Donell
  2019-03-19 10:39   ` ping " Egor Kobylkin
  13 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2019-01-02 18:38 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Carlos O'Donell, Siddhesh Poyarekar,
	Rafal Luzynski
  Cc: Marko Myllynen, mfabian

[-- Attachment #1: Type: text/plain, Size: 5979 bytes --]

Changelog v12:
* Adjusted to the new comment style suddenly appearing in the target 
file locale/C-translit.h.in (the original file changed on the master 
branch from /* style to # style since v11)
* Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to 
"sh`" instead of erroneous "SH`" in v11

Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.

Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics

Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch

Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.

Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
   to sequences of all uppercase Latin letters in all languages (whenever
   a Cyrillic letter is transliterated to more than one Latin letter),
   for example "Ї" is now transliterated as "YI" rather than "Yi".
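
The case rule from v6 and the kind of typo fixed in v12 can be checked
mechanically. Below is a minimal sketch, assuming the table rows are
available as (code point, replacement) pairs; the helper name
case_mismatches and the sample rows are hypothetical and not part of
the patch. It flags a row when a CAPITAL letter maps to anything but
all-uppercase Latin letters, or a SMALL letter to anything but
all-lowercase ones.

```python
import unicodedata


def case_mismatches(rows):
    """Return the rows whose replacement case disagrees with the letter's case.

    rows: iterable of (code_point, replacement) pairs (hypothetical input form).
    """
    bad = []
    for code_point, replacement in rows:
        name = unicodedata.name(chr(code_point))
        # Only Latin letters matter; the grave accent "`" has no case.
        letters = [c for c in replacement if c.isalpha()]
        if not letters:
            continue
        if "CAPITAL" in name and not all(c.isupper() for c in letters):
            bad.append((code_point, replacement))
        elif "SMALL" in name and not all(c.islower() for c in letters):
            bad.append((code_point, replacement))
    return bad


# Sample rows: three consistent ones plus the v11 typo mentioned above.
ROWS = [
    (0x0407, "YI"),   # CYRILLIC CAPITAL LETTER YI
    (0x0457, "yi"),   # CYRILLIC SMALL LETTER YI
    (0x0429, "SHH"),  # CYRILLIC CAPITAL LETTER SHCHA
    (0x04BB, "SH`"),  # CYRILLIC SMALL LETTER SHHA, erroneous in v11
]

for code_point, replacement in case_mismatches(ROWS):
    print("U+%04X -> %r" % (code_point, replacement))
```

A check like this only looks at letter case; it says nothing about
whether the chosen Latin letters are the right ones.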

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration rows to locale/C-translit.h.in.

The patch is attached.


Current bug effect:

The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:

iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- it produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
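
To illustrate the effect on the first word of the test sentence, here
is a minimal sketch that applies five of the rows from the attached
patch one code point at a time. It only imitates the visible result of
ASCII//TRANSLIT; the function name to_ascii and the "?" placeholder
for unmapped characters are assumptions made for illustration, not
glibc code.

```python
# Subset of the rows from the patch: Cyrillic code point -> ASCII string.
TABLE = {
    "\u0421": "S",    # CYRILLIC CAPITAL LETTER ES
    "\u044a": "``",   # CYRILLIC SMALL LETTER HARD SIGN
    "\u0435": "e",    # CYRILLIC SMALL LETTER IE
    "\u0448": "sh",   # CYRILLIC SMALL LETTER SHA
    "\u044c": "`",    # CYRILLIC SMALL LETTER SOFT SIGN
}


def to_ascii(text, table=TABLE):
    """Replace each non-ASCII character via the table, or by '?' if unmapped."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)            # ASCII passes through unchanged
        else:
            out.append(table.get(ch, "?"))
    return "".join(out)


# The first word of the test sentence, written with escapes.
word = "\u0421\u044a\u0435\u0448\u044c"
print(to_ascii(word))           # S``esh`
print(to_ascii(word, table={}))  # ????? (no table: the current bug effect)
```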


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here.


COMMIT MESSAGE:
This Cyrillic transliteration table enables conversion (e.g. with iconv)
of UTF-8 encoded text in the Cyrillic alphabet to ASCII//TRANSLIT text.

Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an
ASCII-compatible transcription.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters yet
can still be read by a native speaker. Among other things, it is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, don't have the
corresponding fonts installed, or can't handle UTF-8.

The patch content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].

The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B is, strictly speaking, a transcription (preserving phonemes),
while System A is a transliteration (preserving graphemes). There is no
meaningful way to preserve graphemes when converting Cyrillic to ASCII,
and thus System B is chosen [11]. To be clear, System A has nothing to
do with this bug, even though it is a transliteration.

Those interested in implementing System A for transliteration of
Cyrillic to Latin with Diacritic as a new feature are welcome to use the
spreadsheet in [6] as a starting point.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (only
available in low-quality GIF format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3

Best regards,
Egor Kobylkin




[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 10494 bytes --]

From 46e0d0e3d07805ec853fdd72dc3793995cb5593c Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 2 Jan 2019 05:50:13 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 169 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 169 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index d5f00df0f3..758171c394 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -56,6 +56,175 @@
 "\x02cd"	"_"	# <U02CD> MODIFIER LETTER LOW MACRON
 "\x02d0"	":"	# <U02D0> MODIFIER LETTER TRIANGULAR COLON
 "\x02dc"	"~"	# <U02DC> SMALL TILDE
+"\x0401"	"YO"	# <U0401> CYRILLIC CAPITAL LETTER IO
+"\x0402"	"DJ"	# <U0402> CYRILLIC CAPITAL LETTER DJE
+"\x0403"	"G`"	# <U0403> CYRILLIC CAPITAL LETTER GJE
+"\x0404"	"YE"	# <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE
+"\x0405"	"Z`"	# <U0405> CYRILLIC CAPITAL LETTER DZE
+"\x0406"	"I"	# <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0407"	"YI"	# <U0407> CYRILLIC CAPITAL LETTER YI
+"\x0408"	"J"	# <U0408> CYRILLIC CAPITAL LETTER JE
+"\x0409"	"L`"	# <U0409> CYRILLIC CAPITAL LETTER LJE
+"\x040a"	"N`"	# <U040A> CYRILLIC CAPITAL LETTER NJE
+"\x040b"	"TSH"	# <U040B> CYRILLIC CAPITAL LETTER TSHE
+"\x040c"	"K`"	# <U040C> CYRILLIC CAPITAL LETTER KJE
+"\x040e"	"U`"	# <U040E> CYRILLIC CAPITAL LETTER SHORT U
+"\x040f"	"DH"	# <U040F> CYRILLIC CAPITAL LETTER DZHE
+"\x0410"	"A"	# <U0410> CYRILLIC CAPITAL LETTER A
+"\x0411"	"B"	# <U0411> CYRILLIC CAPITAL LETTER BE
+"\x0412"	"V"	# <U0412> CYRILLIC CAPITAL LETTER VE
+"\x0413"	"G"	# <U0413> CYRILLIC CAPITAL LETTER GHE
+"\x0414"	"D"	# <U0414> CYRILLIC CAPITAL LETTER DE
+"\x0415"	"E"	# <U0415> CYRILLIC CAPITAL LETTER IE
+"\x0416"	"ZH"	# <U0416> CYRILLIC CAPITAL LETTER ZHE
+"\x0417"	"Z"	# <U0417> CYRILLIC CAPITAL LETTER ZE
+"\x0418"	"I"	# <U0418> CYRILLIC CAPITAL LETTER I
+"\x0419"	"J"	# <U0419> CYRILLIC CAPITAL LETTER SHORT I
+"\x041a"	"K"	# <U041A> CYRILLIC CAPITAL LETTER KA
+"\x041b"	"L"	# <U041B> CYRILLIC CAPITAL LETTER EL
+"\x041c"	"M"	# <U041C> CYRILLIC CAPITAL LETTER EM
+"\x041d"	"N"	# <U041D> CYRILLIC CAPITAL LETTER EN
+"\x041e"	"O"	# <U041E> CYRILLIC CAPITAL LETTER O
+"\x041f"	"P"	# <U041F> CYRILLIC CAPITAL LETTER PE
+"\x0420"	"R"	# <U0420> CYRILLIC CAPITAL LETTER ER
+"\x0421"	"S"	# <U0421> CYRILLIC CAPITAL LETTER ES
+"\x0422"	"T"	# <U0422> CYRILLIC CAPITAL LETTER TE
+"\x0423"	"U"	# <U0423> CYRILLIC CAPITAL LETTER U
+"\x0424"	"F"	# <U0424> CYRILLIC CAPITAL LETTER EF
+"\x0425"	"X"	# <U0425> CYRILLIC CAPITAL LETTER HA
+"\x0426"	"CZ"	# <U0426> CYRILLIC CAPITAL LETTER TSE
+"\x0427"	"CH"	# <U0427> CYRILLIC CAPITAL LETTER CHE
+"\x0428"	"SH"	# <U0428> CYRILLIC CAPITAL LETTER SHA
+"\x0429"	"SHH"	# <U0429> CYRILLIC CAPITAL LETTER SHCHA
+"\x042a"	"``"	# <U042A> CYRILLIC CAPITAL LETTER HARD SIGN
+"\x042b"	"Y`"	# <U042B> CYRILLIC CAPITAL LETTER YERU
+"\x042c"	"`"	# <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN
+"\x042d"	"E`"	# <U042D> CYRILLIC CAPITAL LETTER E
+"\x042e"	"YU"	# <U042E> CYRILLIC CAPITAL LETTER YU
+"\x042f"	"YA"	# <U042F> CYRILLIC CAPITAL LETTER YA
+"\x0430"	"a"	# <U0430> CYRILLIC SMALL LETTER A
+"\x0431"	"b"	# <U0431> CYRILLIC SMALL LETTER BE
+"\x0432"	"v"	# <U0432> CYRILLIC SMALL LETTER VE
+"\x0433"	"g"	# <U0433> CYRILLIC SMALL LETTER GHE
+"\x0434"	"d"	# <U0434> CYRILLIC SMALL LETTER DE
+"\x0435"	"e"	# <U0435> CYRILLIC SMALL LETTER IE
+"\x0436"	"zh"	# <U0436> CYRILLIC SMALL LETTER ZHE
+"\x0437"	"z"	# <U0437> CYRILLIC SMALL LETTER ZE
+"\x0438"	"i"	# <U0438> CYRILLIC SMALL LETTER I
+"\x0439"	"j"	# <U0439> CYRILLIC SMALL LETTER SHORT I
+"\x043a"	"k"	# <U043A> CYRILLIC SMALL LETTER KA
+"\x043b"	"l"	# <U043B> CYRILLIC SMALL LETTER EL
+"\x043c"	"m"	# <U043C> CYRILLIC SMALL LETTER EM
+"\x043d"	"n"	# <U043D> CYRILLIC SMALL LETTER EN
+"\x043e"	"o"	# <U043E> CYRILLIC SMALL LETTER O
+"\x043f"	"p"	# <U043F> CYRILLIC SMALL LETTER PE
+"\x0440"	"r"	# <U0440> CYRILLIC SMALL LETTER ER
+"\x0441"	"s"	# <U0441> CYRILLIC SMALL LETTER ES
+"\x0442"	"t"	# <U0442> CYRILLIC SMALL LETTER TE
+"\x0443"	"u"	# <U0443> CYRILLIC SMALL LETTER U
+"\x0444"	"f"	# <U0444> CYRILLIC SMALL LETTER EF
+"\x0445"	"x"	# <U0445> CYRILLIC SMALL LETTER HA
+"\x0446"	"cz"	# <U0446> CYRILLIC SMALL LETTER TSE
+"\x0447"	"ch"	# <U0447> CYRILLIC SMALL LETTER CHE
+"\x0448"	"sh"	# <U0448> CYRILLIC SMALL LETTER SHA
+"\x0449"	"shh"	# <U0449> CYRILLIC SMALL LETTER SHCHA
+"\x044a"	"``"	# <U044A> CYRILLIC SMALL LETTER HARD SIGN
+"\x044b"	"y`"	# <U044B> CYRILLIC SMALL LETTER YERU
+"\x044c"	"`"	# <U044C> CYRILLIC SMALL LETTER SOFT SIGN
+"\x044d"	"e`"	# <U044D> CYRILLIC SMALL LETTER E
+"\x044e"	"yu"	# <U044E> CYRILLIC SMALL LETTER YU
+"\x044f"	"ya"	# <U044F> CYRILLIC SMALL LETTER YA
+"\x0451"	"yo"	# <U0451> CYRILLIC SMALL LETTER IO
+"\x0452"	"dj"	# <U0452> CYRILLIC SMALL LETTER DJE
+"\x0453"	"g`"	# <U0453> CYRILLIC SMALL LETTER GJE
+"\x0454"	"ye"	# <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE
+"\x0455"	"z`"	# <U0455> CYRILLIC SMALL LETTER DZE
+"\x0456"	"i"	# <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0457"	"yi"	# <U0457> CYRILLIC SMALL LETTER YI
+"\x0458"	"j"	# <U0458> CYRILLIC SMALL LETTER JE
+"\x0459"	"l`"	# <U0459> CYRILLIC SMALL LETTER LJE
+"\x045a"	"n`"	# <U045A> CYRILLIC SMALL LETTER NJE
+"\x045b"	"tsh"	# <U045B> CYRILLIC SMALL LETTER TSHE
+"\x045c"	"k`"	# <U045C> CYRILLIC SMALL LETTER KJE
+"\x045e"	"u`"	# <U045E> CYRILLIC SMALL LETTER SHORT U
+"\x045f"	"dh"	# <U045F> CYRILLIC SMALL LETTER DZHE
+"\x046a"	"O`"	# <U046A> CYRILLIC CAPITAL LETTER BIG YUS
+"\x046b"	"o`"	# <U046B> CYRILLIC SMALL LETTER BIG YUS
+"\x0472"	"FH"	# <U0472> CYRILLIC CAPITAL LETTER FITA
+"\x0473"	"fh"	# <U0473> CYRILLIC SMALL LETTER FITA
+"\x0474"	"YH"	# <U0474> CYRILLIC CAPITAL LETTER IZHITSA
+"\x0475"	"yh"	# <U0475> CYRILLIC SMALL LETTER IZHITSA
+"\x048c"	"E`"	# <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+"\x048d"	"e`"	# <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN
+"\x0490"	"G`"	# <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+"\x0491"	"g`"	# <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN
+"\x0492"	"GH"	# <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE
+"\x0493"	"gh"	# <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE
+"\x0494"	"GH"	# <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+"\x0495"	"gh"	# <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+"\x0496"	"ZH`"	# <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+"\x0497"	"zh`"	# <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+"\x049a"	"K`"	# <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+"\x049b"	"k`"	# <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER
+"\x049e"	"K`"	# <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE
+"\x049f"	"k`"	# <U049F> CYRILLIC SMALL LETTER KA WITH STROKE
+"\x04a2"	"N`"	# <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+"\x04a3"	"n`"	# <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER
+"\x04a4"	"NG"	# <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE
+"\x04a5"	"ng"	# <U04A5> CYRILLIC SMALL LIGATURE EN GHE
+"\x04a6"	"P`"	# <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+"\x04a7"	"p`"	# <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+"\x04a8"	"O`"	# <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA
+"\x04a9"	"o`"	# <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA
+"\x04aa"	"C`"	# <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+"\x04ab"	"c`"	# <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER
+"\x04ac"	"T`"	# <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+"\x04ad"	"t`"	# <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER
+"\x04ae"	"U"	# <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U
+"\x04af"	"u"	# <U04AF> CYRILLIC SMALL LETTER STRAIGHT U
+"\x04b2"	"H`"	# <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+"\x04b3"	"h`"	# <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER
+"\x04b4"	"TCZ"	# <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE
+"\x04b5"	"tcz"	# <U04B5> CYRILLIC SMALL LIGATURE TE TSE
+"\x04ba"	"SH`"	# <U04BA> CYRILLIC CAPITAL LETTER SHHA
+"\x04bb"	"sh`"	# <U04BB> CYRILLIC SMALL LETTER SHHA
+"\x04bc"	"CH`"	# <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+"\x04bd"	"ch`"	# <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE
+"\x04be"	"CH`"	# <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04bf"	"ch`"	# <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04c0"	"i"	# <U04C0> CYRILLIC LETTER PALOCHKA
+"\x04c1"	"ZH`"	# <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+"\x04c2"	"zh`"	# <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE
+"\x04cb"	"CH`"	# <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+"\x04cc"	"ch`"	# <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE
+"\x04d0"	"A`"	# <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE
+"\x04d1"	"a`"	# <U04D1> CYRILLIC SMALL LETTER A WITH BREVE
+"\x04d2"	"A`"	# <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+"\x04d3"	"a`"	# <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS
+"\x04d6"	"E`"	# <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE
+"\x04d7"	"e`"	# <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE
+"\x04d8"	"A`"	# <U04D8> CYRILLIC CAPITAL LETTER SCHWA
+"\x04d9"	"a`"	# <U04D9> CYRILLIC SMALL LETTER SCHWA
+"\x04dc"	"ZH`"	# <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+"\x04dd"	"zh`"	# <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+"\x04de"	"Z`"	# <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+"\x04df"	"z`"	# <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+"\x04e0"	"Z`"	# <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+"\x04e1"	"z`"	# <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE
+"\x04e4"	"I`"	# <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+"\x04e5"	"i`"	# <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS
+"\x04e6"	"O`"	# <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+"\x04e7"	"o`"	# <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS
+"\x04e8"	"O`"	# <U04E8> CYRILLIC CAPITAL LETTER BARRED O
+"\x04e9"	"o`"	# <U04E9> CYRILLIC SMALL LETTER BARRED O
+"\x04f0"	"U`"	# <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+"\x04f1"	"u`"	# <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS
+"\x04f2"	"U`"	# <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+"\x04f3"	"u`"	# <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+"\x04f4"	"CH`"	# <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+"\x04f5"	"ch`"	# <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+"\x04f8"	"Y`"	# <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+"\x04f9"	"y`"	# <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS
 "\x2002"	" "	# <U2002> EN SPACE
 "\x2003"	" "	# <U2003> EM SPACE
 "\x2004"	" "	# <U2004> THREE-PER-EM SPACE
-- 
2.17.1
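
The rows added by the patch above all share one layout: a quoted \xNNNN
code point, a quoted replacement, and a # comment. A minimal sketch of
reading such rows into a mapping follows, assuming only that
single-code-point layout (other row forms that locale/C-translit.h.in
may contain are deliberately not handled); the names ROW, parse_rows
and SAMPLE are hypothetical.

```python
import re

# Optional diff marker, "\xNNNN", whitespace, "replacement", whitespace, comment.
ROW = re.compile(r'^[+ ]?"\\x([0-9a-fA-F]{4})"\s+"([^"]*)"\s+#\s*(.*)$')


def parse_rows(lines):
    """Map each source character to its replacement string."""
    table = {}
    for line in lines:
        match = ROW.match(line)
        if match:                      # silently skip anything else
            table[chr(int(match.group(1), 16))] = match.group(2)
    return table


# Three lines copied from the patch (two additions and one context line).
SAMPLE = [
    '+"\\x0416"\t"ZH"\t# <U0416> CYRILLIC CAPITAL LETTER ZHE',
    '+"\\x0436"\t"zh"\t# <U0436> CYRILLIC SMALL LETTER ZHE',
    ' "\\x2002"\t" "\t# <U2002> EN SPACE',
]

table = parse_rows(SAMPLE)
# Every replacement in an ASCII fallback table must itself be ASCII.
all_ascii = all(ord(c) < 128 for repl in table.values() for c in repl)
print(len(table), all_ascii)
```

Feeding the whole patch through such a parser is one way to confirm that
no replacement string strays outside ASCII before the table is compiled
into libc.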


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-01-02 18:38   ` [PATCH v12] " Egor Kobylkin
@ 2019-01-05 14:35     ` Rafal Luzynski
  2019-01-05 21:12       ` Egor Kobylkin
  2019-04-09  1:04     ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] Carlos O'Donell
  1 sibling, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2019-01-05 14:35 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell,
	Siddhesh Poyarekar
  Cc: Marko Myllynen, mfabian

2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> Changelog v12:
> [...]
> 
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
> [...]

I have tested this and, unfortunately, now this transliteration
works *only* in C locale, that is, only when no locale is set or when
it is explicitly set to C (C.UTF8, POSIX).  It does not work when locale
is set to anything different, including en_US, ru_RU, etc.

I'm sorry for confusing you.  I think that either we should revert back
to the older versions of your patch to make all locales supported or
merge those two versions to make the transliteration work both in
C and in all (almost all) other locales.  Unfortunately, C locale is
not a base for all other locales and is not included, it is only a fallback
when a locale does not provide its own data (that is, when it does not
provide any transliteration table at all).
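
The behaviour described above can be modelled as a toy lookup. This is
a sketch of the description only, not of how glibc is implemented: the
table contents, the locale entry "xx_XX" and the helper name
transliterate_char are all hypothetical.

```python
# Built-in C/POSIX table, as extended by the v12 patch (tiny excerpt).
BUILTIN_C_TABLE = {"\u0436": "zh"}   # CYRILLIC SMALL LETTER ZHE

# Hypothetical per-locale tables. None means "ships no table at all".
LOCALE_TABLES = {
    "ru_RU": {"\u00ab": "<<"},       # has its own table, without Cyrillic rows
    "xx_XX": None,                   # made-up locale without any table
}


def transliterate_char(ch, locale_name):
    """Look a character up the way the paragraph above describes."""
    if locale_name in ("C", "POSIX"):
        table = BUILTIN_C_TABLE
    else:
        table = LOCALE_TABLES.get(locale_name)
        if table is None:
            # Fallback applies only when the locale provides nothing.
            table = BUILTIN_C_TABLE
    # A locale's own table is never merged with the built-in one.
    return table.get(ch, "?")


for name in ("C", "ru_RU", "xx_XX"):
    print(name, transliterate_char("\u0436", name))
```

In this model a locale that ships any table of its own never sees the C
rows, which is why the v12 change alone is invisible under en_US or
ru_RU.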

Regards,

Rafal

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-01-05 14:35     ` Rafal Luzynski
@ 2019-01-05 21:12       ` Egor Kobylkin
  2019-01-07 20:37         ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-01-05 21:12 UTC (permalink / raw)
  To: Rafal Luzynski, libc-alpha, libc-locales, Carlos O'Donell,
	Siddhesh Poyarekar
  Cc: Marko Myllynen, mfabian

On 05.01.19 15:35, Rafal Luzynski wrote:
> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> Changelog v12:
>> [...]
>>
>> Changelog v11:
>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>> file for the ASCII translit table.
>> * Correspondingly the patch now only contains the additional
>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>> The 'include "translit_cyrillic";""' directives are not necessary in the
>> locale files and they are now all left intact.
>> * Also the file translit_cyrillic is no longer needed and is omitted.
>> * Edited below email, commit message.
>> [...]
> 
> I have tested this and, unfortunately, now this transliteration
> works *only* in C locale, that is, only when no locale is set or when
> it is explicitly set to C (C.UTF8, POSIX).  It does not work when locale
> is set to anything different, including en_US, ru_RU, etc.
> 
> I'm sorry for confusing you.  I think that either we should revert back
> to the older versions of your patch to make all locales supported or
> merge those two versions to make the transliteration work both in
> C and in all (almost all) other locales.  Unfortunately, C locale is
> not a base for all other locales and is not included, it is only a fallback
> when a locale does not provide its own data (that is, when it does not
> provide any transliteration table at all).

Good catch! Should we maybe split this into two patches, one for C and 
the other for "country" locales? They have different codes and 
functionality so it looks like it would be easier to keep focus.

My understanding is that locale/C-translit.h.in is still the proper
location for the sole ASCII translit table. It is also the only solution
for the many use cases where no locale is available (not compiled or
not set).
"Country" locales in localedata/locales/ can then have the exact same 
translit table included or they can have any other flavor - I don't see 
a problem here.

Best regards,
Egor

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-01-05 21:12       ` Egor Kobylkin
@ 2019-01-07 20:37         ` Marko Myllynen
  2019-01-09  0:46           ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2019-01-07 20:37 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar
  Cc: Mike Fabian

Hi,

On 05/01/2019 23.12, Egor Kobylkin wrote:
> On 05.01.19 15:35, Rafal Luzynski wrote:
>> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> Changelog v12:
>>> [...]
>>>
>>> Changelog v11:
>>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>>> file for the ASCII translit table.
>>> * Correspondingly the patch now only contains the additional
>>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>>> The 'include "translit_cyrillic";""' directives are not necessary in the
>>> locale files and they are now all left intact.
>>> * Also the file translit_cyrillic is no longer needed and is omitted.
>>> * Edited below email, commit message.
>>> [...]
>>
>> I have tested this and, unfortunately, now this transliteration
>> works *only* in C locale, that is, only when no locale is set or when
>> it is explicitly set to C (C.UTF8, POSIX).  It does not work when locale
>> is set to anything different, including en_US, ru_RU, etc.
> 
> Good catch! Should we maybe split this into two patches, one for C and
> the other for "country" locales? They have different codes and
> functionality so it looks like it would be easier to keep focus.

That would probably make sense, the standard C/POSIX locale won't
support System A so it also narrows down solution alternatives with it.

(If the C.UTF-8 locale (see
https://sourceware.org/bugzilla/show_bug.cgi?id=17318) materializes one
day, I'm not sure whether transliteration would be applicable in that
context.)

> My understanding is that locale/C-translit.h.in is still the proper
> location for the sole ASCII translit table. It is also the only solution
> for the many use cases where no locale is available (not compiled or
> not set).

Correct, as Siddhesh mentioned, those rules will end up in the built-in
C/POSIX locale, which is ASCII and will be used if no other locales are
available or set properly. The translit_* files won't affect it.

> "Country" locales in localedata/locales/ can then have the exact same
> translit table included or they can have any other flavor - I don't see
> a problem here.

Indeed, and since those files are not limited to ASCII, perhaps we could
now reconsider the v9 approach for them, i.e., prefer System A if
possible, otherwise use System B / ASCII (just need to make sure that
the ASCII fall-back for them will match the built-in C ASCII rule)?

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-01-07 20:37         ` Marko Myllynen
@ 2019-01-09  0:46           ` Egor Kobylkin
  2019-01-09 20:03             ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-01-09  0:46 UTC (permalink / raw)
  To: Marko Myllynen, Rafal Luzynski, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar
  Cc: Mike Fabian

On 07.01.19 21:37, Marko Myllynen wrote:
> Hi,
> 
> On 05/01/2019 23.12, Egor Kobylkin wrote:
>> On 05.01.19 15:35, Rafal Luzynski wrote:
>>> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> Changelog v12:
>>>> [...]
>>>>
>>>> Changelog v11:
>>>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>>>> file for the ASCII translit table.
>>>> * Correspondingly the patch now only contains the additional
>>>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>>>> The 'include "translit_cyrillic";""' directives are not necessary in the
>>>> locale files and they are now all left intact.
>>>> * Also the file translit_cyrillic is no longer needed and is omitted.
>>>> * Edited below email, commit message.
>>>> [...]
>>>
>>> I have tested this and, unfortunately, now this transliteration
>>> works *only* in C locale, that is, only when no locale is set or when
>>> it is explicitly set to C (C.UTF8, POSIX).  It does not work when locale
>>> is set to anything different, including en_US, ru_RU, etc.
>>
>> Good catch! Should we maybe split this into two patches, one for C and
>> the other for "country" locales? They have different codes and
>> functionality so it looks like it would be easier to keep focus.
> 
> That would probably make sense, the standard C/POSIX locale won't
> support System A so it also narrows down solution alternatives with it.
> 

[SNIP]

>> "Country" locales in localedata/locales/ can then have the exact same
>> translit table included or they can have any other flavor - I don't see
>> a problem here.
> 
> Indeed, and since those files are not limited to ASCII, perhaps we could
> now reconsider the v9 approach for them, i.e., prefer System A if
> possible, otherwise use System B / ASCII (just need to make sure that
> the ASCII fall-back for them will match the built-in C ASCII rule)?
> 

Happy to hear the split seems to be a clear cut one.
How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]... 
C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report 
(number) and title for clarity in communication?

The bug report for [PATCH v9] ("Countries" locales) should then ideally
contain your (and others') explicit requirements as to the GOST System
A/B fall-back, which countries to include, etc. Again, I myself have no
requirement here other than to have _any_ translit in place.

This way it would probably be easier to have the decision making process 
tied up for both patches (separately). We may want to get the v12 POSIX 
out of the door in 2.30 then and can take all the time we need to set up 
the rules for "Countries" locales as you need them to be.

Bests,
Egor


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-01-09  0:46           ` Egor Kobylkin
@ 2019-01-09 20:03             ` Marko Myllynen
  2019-02-04  7:14               ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30 Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2019-01-09 20:03 UTC (permalink / raw)
  To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar
  Cc: Mike Fabian

Hi,

On 09/01/2019 02.46, Egor Kobylkin wrote:
> On 07.01.19 21:37, Marko Myllynen wrote:
>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>
>>> Good catch! Should we maybe split this into two patches, one for C and
>>> the other for "country" locales? They have different codes and
>>> functionality so it looks like it would be easier to keep focus.
>>
>> That would probably make sense, the standard C/POSIX locale won't
>> support System A so it also narrows down solution alternatives with it.
>>
>>> "Country" locales in localedata/locales/ can then have the exact same
>>> translit table included or they can have any other flavor - I don't see
>>> a problem here.
>>
>> Indeed, and since those files are not limited to ASCII, perhaps we could
>> now reconsider the v9 approach for them, i.e., prefer System A if
>> possible, otherwise use System B / ASCII (just need to make sure that
>> the ASCII fall-back for them will match the built-in C ASCII rule)?
> 
> Happy to hear the split seems to be a clear cut one.
> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
> (number) and title for clarity in communication?

I'm not sure a new BZ is really needed for such an addition; perhaps a
NEWS entry might be more appropriate (with the full details explained in
the commit messages, of course), but I'll leave this to others to decide.

> This way it would probably be easier to have the decision making process
> tied up for both patches (separately). We may want to get the v12 POSIX
> out of the door in 2.30 then and can take all the time we need to set up
> the rules for "Countries" locales as you need them to be.

Perhaps Rafal or Carlos have better suggestions but I would think we
could have a patch series where the patch 1/3 adds the C/POSIX locale
part (that would be what you posted as v12), then patch 2/3 adds
translit_cyrillic (based on your v9 so supports ISO 9.1995 / GOST 7.79
System A and GOST 7.79 System B as a fall-back (which would match the
C/POSIX rules)), and finally the patch 3/3 updates locales to use
translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
alternative suggestions so it might be best to wait for their feedback
before doing anything yet (it's unfortunate you've had to do so many
iterations around this already but I think we've all learned something
during the process and the end result will be more correct than any of
the earlier versions).

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
  2019-01-09 20:03             ` Marko Myllynen
@ 2019-02-04  7:14               ` Egor Kobylkin
  2019-02-14 16:48                 ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-02-04  7:14 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Carlos O'Donell
  Cc: Marko Myllynen, Rafal Luzynski, Siddhesh Poyarekar, Mike Fabian

Carlos,
are you comfortable picking this up again this month?

I would really love to have a reliable action plan to get this committed 
for 2.30. Maybe cut out a subset that is undisputed and commit only that 
first. It looks kinda like an eternal moving target otherwise.

For your reference:
https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html

Bests,
Egor Kobylkin

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
  2019-02-04  7:14               ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30 Egor Kobylkin
@ 2019-02-14 16:48                 ` Marko Myllynen
  2019-03-04 22:11                   ` Egor Kobylkin
  2019-04-19 22:24                   ` Rafal Luzynski
  0 siblings, 2 replies; 111+ messages in thread
From: Marko Myllynen @ 2019-02-14 16:48 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell
  Cc: Rafal Luzynski, Siddhesh Poyarekar, Mike Fabian

Hi Carlos, Mike, Rafal,

It seems clear that you all are currently too busy to have a look at
this but would you have any estimate when you might be able to review
this so that we could consider merging?

FWIW, I chatted with Egor off-list and we're on the same page wrt the
following; hopefully this gives you a bit of a jump start on this
subject when you have time to dig deeper:

1) Built-in C locale doesn't read/use any translit_* files and it can't
have any fallback mechanisms and it only supports ASCII so using GOST
7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
be the appropriate way to implement Cyrillic transliteration for the
built-in C locale (it adds some 8KB to the binary).

2) Other locales read/use translit_* files and with them fallbacks and
non-ASCII are possible so it would seem preferable to first try ISO 9 /
GOST 7.79 System A and only if that fails then use GOST 7.79 System B
(in which case the end result should match with the built-in C locale).
For this the translit_cyrillic file should be added (as per patch v9 +
changes mentioned in patches v10 and v12).

3) Individual locale files can then be updated to use translit_cyrillic
as appropriate (see patch v9), and language- or nation-specific conventions
(e.g., SFS 4900 for fi_FI) can be applied on a per-locale basis.
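
As a rough illustration of points 1) and 2), the preference order could
be sketched as follows. This is a minimal Python sketch, not glibc code:
the tables are a four-letter subset, and the names SYSTEM_A, SYSTEM_B,
can_encode, translit_char and translit are hypothetical. It assumes a
candidate is acceptable whenever the target charset can encode it.

```python
# Illustrative four-letter subset only; not the tables from any patch version.
# ISO 9 / GOST 7.79 System A: Latin letters with diacritics.
SYSTEM_A = {
    "ж": "\u017e",  # z with caron
    "ч": "\u010d",  # c with caron
    "ш": "\u0161",  # s with caron
    "щ": "\u015d",  # s with circumflex
}
# GOST 7.79 System B: plain ASCII sequences.
SYSTEM_B = {"ж": "zh", "ч": "ch", "ш": "sh", "щ": "shh"}


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit_char(ch, charset):
    """Prefer System A, fall back to System B, else '?' if unencodable."""
    for table in (SYSTEM_A, SYSTEM_B):
        candidate = table.get(ch)
        if candidate is not None and can_encode(candidate, charset):
            return candidate
    return ch if can_encode(ch, charset) else "?"


def translit(text, charset):
    """Transliterate text character by character for the target charset."""
    return "".join(translit_char(ch, charset) for ch in text)
```

With an ASCII target only the System B strings survive, which is the
case where the result should match the built-in C locale.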

Thanks,

-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
  2019-02-14 16:48                 ` Marko Myllynen
@ 2019-03-04 22:11                   ` Egor Kobylkin
  2019-03-11 13:59                     ` PING " Egor Kobylkin
  2019-04-19 22:24                   ` Rafal Luzynski
  1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-04 22:11 UTC (permalink / raw)
  To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
	Rafal Luzynski, Mike Fabian
  Cc: Siddhesh Poyarekar, Dmitry V. Levin

ping

^ permalink raw reply	[flat|nested] 111+ messages in thread

* PING Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
  2019-03-04 22:11                   ` Egor Kobylkin
@ 2019-03-11 13:59                     ` Egor Kobylkin
  2019-03-14 19:48                       ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-11 13:59 UTC (permalink / raw)
  To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
	Rafal Luzynski, Mike Fabian
  Cc: Siddhesh Poyarekar, Dmitry V. Levin



On 04.03.19 23:11, Egor Kobylkin wrote:
> ping
> 

^ permalink raw reply	[flat|nested] 111+ messages in thread

* PING Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
  2019-03-11 13:59                     ` PING " Egor Kobylkin
@ 2019-03-14 19:48                       ` Egor Kobylkin
  0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-14 19:48 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Carlos O'Donell
  Cc: Marko Myllynen, Rafal Luzynski, Mike Fabian, Siddhesh Poyarekar,
	Dmitry V. Levin



On 11.03.19 14:59, Egor Kobylkin wrote:
> 
> 
> On 04.03.19 23:11, Egor Kobylkin wrote:
>> ping
>>

^ permalink raw reply	[flat|nested] 111+ messages in thread

* ping [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
       [not found] ` <20180412224352.GB2911@altlinux.org>
                     ` (12 preceding siblings ...)
  2019-01-02 18:38   ` [PATCH v12] " Egor Kobylkin
@ 2019-03-19 10:39   ` Egor Kobylkin
  2019-03-28 16:20     ` [PING^4][PATCH " Marko Myllynen
                       ` (2 more replies)
  13 siblings, 3 replies; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-19 10:39 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Carlos O'Donell, Siddhesh Poyarekar,
	Rafal Luzynski
  Cc: Marko Myllynen, mfabian

[-- Attachment #1: Type: text/plain, Size: 5983 bytes --]

Changelog v12:
* Adjusted to the new comment style suddenly appearing in the target 
file locale/C-translit.h.in (the original file changed on the master 
branch from /* style to # style since v11)
* Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to 
"sh`" instead of erroneous "SH`" in v11

Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also, the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.

Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics
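
To illustrate why <U0423><U0301> is a special case: at the Unicode level
the pair has no precomposed form, so it stays a two-code-point sequence
even after NFC normalization. The Python sketch below only demonstrates
that Unicode property with the standard unicodedata module; it says
nothing about glibc internals, and the helper name nfc_code_points is
hypothetical.

```python
import unicodedata


def nfc_code_points(text):
    """Return the code points of text after NFC normalization."""
    composed = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in composed]


# CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT: no precomposed form,
# so a one-code-point lookup table cannot cover it.
u_acute = nfc_code_points("\u0423\u0301")

# For contrast, CYRILLIC CAPITAL LETTER I + COMBINING BREVE composes to the
# single code point U+0419 (CYRILLIC CAPITAL LETTER SHORT I).
short_i = nfc_code_points("\u0418\u0306")
```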

Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch

Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.

Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
    to sequences of all uppercase Latin letters in all languages (whenever
    a Cyrillic letter is transliterated to more than one Latin letter),
    for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

to fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails",

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

please add the Cyrillic transliteration rows to locale/C-translit.h.in.

The patch is attached.


Current bug effect:

The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:

iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- it produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
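
The expected line above can be reproduced with a small Python sketch of
the System B mapping. This is an illustration, not the patch: the table
covers only the 33 Russian letters, the names SYSTEM_B_LOWER,
translit_system_b and PANGRAM are hypothetical, and the input sentence
is an assumption reconstructed from the expected output (the contents of
translit-test-input.txt are not shown in this mail).

```python
import unicodedata

# GOST 7.79-2000 System B for the Russian letters. The values mirror the
# expected output quoted above; they are not copied from the patch.
SYSTEM_B_LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "\u0451": "yo",  # CYRILLIC SMALL LETTER IO
    "ж": "zh", "з": "z", "и": "i",
    "\u0439": "j",   # CYRILLIC SMALL LETTER SHORT I
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x",
    "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh", "ъ": "``",
    "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}


def translit_system_b(text):
    """Map Cyrillic letters to ASCII; pass everything else through."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in SYSTEM_B_LOWER:
            out.append(SYSTEM_B_LOWER[ch])
        elif ch.lower() in SYSTEM_B_LOWER:
            # Uppercase letters become all-uppercase Latin sequences.
            out.append(SYSTEM_B_LOWER[ch.lower()].upper())
        else:
            out.append(ch)  # spaces, punctuation and ASCII are unchanged
    return "".join(out)


# Assumed test sentence (a common Russian pangram).
PANGRAM = "Съешь ещё этих мягких французских булок, да выпей же чаю."
```

Calling translit_system_b(PANGRAM) yields the sentence part of the
expected line, without the "CYRILLIC " prefix.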


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here.


COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in the Cyrillic alphabet to ASCII//TRANSLIT text.

Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an
ASCII-compatible transcription.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII codes but can
still be read by a native speaker. Among other things, it is useful for
processing Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
the corresponding fonts installed, or can't handle UTF-8.

The patch content (mapping) is based on the ISO 9.1995 standard [10] and
its derivative GOST 7.79-2000 System B, whose official source is the
Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]. Technically, an independent but mostly identical
source [3] was used and prepared in a spreadsheet [6].

The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes converting Cyrillic to
ASCII and thus System B is chosen [11]. To be clear, System A has
nothing to do with this bug, regardless of it being a transliteration.
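
To make the distinction concrete, the following Python sketch compares a few
letters under both systems. The System B values are the ones used in this
patch; the System A values are recalled from ISO 9:1995 and are an
assumption that should be checked against the standard. SYSTEM_A and
SYSTEM_B are names made up for this example.

```python
# A few lowercase letters under both GOST 7.79-2000 systems.
# System A: recalled from ISO 9:1995 (assumption, verify before relying on it).
# System B: taken from the rows of the patch.
SYSTEM_A = {"ж": "ž", "ч": "č", "ш": "š", "ю": "û", "я": "â"}
SYSTEM_B = {"ж": "zh", "ч": "ch", "ш": "sh", "ю": "yu", "я": "ya"}

for letter in SYSTEM_A:
    a, b = SYSTEM_A[letter], SYSTEM_B[letter]
    # str.isascii() requires Python 3.7 or later.
    print(f"{letter}: System A {a!r} ascii={a.isascii()}, "
          f"System B {b!r} ascii={b.isascii()}")

# Summaries: does each system stay within ASCII for these letters?
all_b_ascii = all(v.isascii() for v in SYSTEM_B.values())
any_a_ascii = any(v.isascii() for v in SYSTEM_A.values())
```

For these letters every System B replacement is plain ASCII, while each
System A replacement needs a Latin letter with a diacritic, which is why
System A cannot serve as a target for ASCII//TRANSLIT.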

Those interested in implementing System A for transliteration of
Cyrillic to Latin with Diacritic as a new feature are welcome to use the
spreadsheet in [6] as a starting point.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3

Best regards,
Egor Kobylkin





[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 10495 bytes --]

From 46e0d0e3d07805ec853fdd72dc3793995cb5593c Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 2 Jan 2019 05:50:13 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 169 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 169 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index d5f00df0f3..758171c394 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -56,6 +56,175 @@
 "\x02cd"	"_"	# <U02CD> MODIFIER LETTER LOW MACRON
 "\x02d0"	":"	# <U02D0> MODIFIER LETTER TRIANGULAR COLON
 "\x02dc"	"~"	# <U02DC> SMALL TILDE
+"\x0401"	"YO"	# <U0401> CYRILLIC CAPITAL LETTER IO
+"\x0402"	"DJ"	# <U0402> CYRILLIC CAPITAL LETTER DJE
+"\x0403"	"G`"	# <U0403> CYRILLIC CAPITAL LETTER GJE
+"\x0404"	"YE"	# <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE
+"\x0405"	"Z`"	# <U0405> CYRILLIC CAPITAL LETTER DZE
+"\x0406"	"I"	# <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0407"	"YI"	# <U0407> CYRILLIC CAPITAL LETTER YI
+"\x0408"	"J"	# <U0408> CYRILLIC CAPITAL LETTER JE
+"\x0409"	"L`"	# <U0409> CYRILLIC CAPITAL LETTER LJE
+"\x040a"	"N`"	# <U040A> CYRILLIC CAPITAL LETTER NJE
+"\x040b"	"TSH"	# <U040B> CYRILLIC CAPITAL LETTER TSHE
+"\x040c"	"K`"	# <U040C> CYRILLIC CAPITAL LETTER KJE
+"\x040e"	"U`"	# <U040E> CYRILLIC CAPITAL LETTER SHORT U
+"\x040f"	"DH"	# <U040F> CYRILLIC CAPITAL LETTER DZHE
+"\x0410"	"A"	# <U0410> CYRILLIC CAPITAL LETTER A
+"\x0411"	"B"	# <U0411> CYRILLIC CAPITAL LETTER BE
+"\x0412"	"V"	# <U0412> CYRILLIC CAPITAL LETTER VE
+"\x0413"	"G"	# <U0413> CYRILLIC CAPITAL LETTER GHE
+"\x0414"	"D"	# <U0414> CYRILLIC CAPITAL LETTER DE
+"\x0415"	"E"	# <U0415> CYRILLIC CAPITAL LETTER IE
+"\x0416"	"ZH"	# <U0416> CYRILLIC CAPITAL LETTER ZHE
+"\x0417"	"Z"	# <U0417> CYRILLIC CAPITAL LETTER ZE
+"\x0418"	"I"	# <U0418> CYRILLIC CAPITAL LETTER I
+"\x0419"	"J"	# <U0419> CYRILLIC CAPITAL LETTER SHORT I
+"\x041a"	"K"	# <U041A> CYRILLIC CAPITAL LETTER KA
+"\x041b"	"L"	# <U041B> CYRILLIC CAPITAL LETTER EL
+"\x041c"	"M"	# <U041C> CYRILLIC CAPITAL LETTER EM
+"\x041d"	"N"	# <U041D> CYRILLIC CAPITAL LETTER EN
+"\x041e"	"O"	# <U041E> CYRILLIC CAPITAL LETTER O
+"\x041f"	"P"	# <U041F> CYRILLIC CAPITAL LETTER PE
+"\x0420"	"R"	# <U0420> CYRILLIC CAPITAL LETTER ER
+"\x0421"	"S"	# <U0421> CYRILLIC CAPITAL LETTER ES
+"\x0422"	"T"	# <U0422> CYRILLIC CAPITAL LETTER TE
+"\x0423"	"U"	# <U0423> CYRILLIC CAPITAL LETTER U
+"\x0424"	"F"	# <U0424> CYRILLIC CAPITAL LETTER EF
+"\x0425"	"X"	# <U0425> CYRILLIC CAPITAL LETTER HA
+"\x0426"	"CZ"	# <U0426> CYRILLIC CAPITAL LETTER TSE
+"\x0427"	"CH"	# <U0427> CYRILLIC CAPITAL LETTER CHE
+"\x0428"	"SH"	# <U0428> CYRILLIC CAPITAL LETTER SHA
+"\x0429"	"SHH"	# <U0429> CYRILLIC CAPITAL LETTER SHCHA
+"\x042a"	"``"	# <U042A> CYRILLIC CAPITAL LETTER HARD SIGN
+"\x042b"	"Y`"	# <U042B> CYRILLIC CAPITAL LETTER YERU
+"\x042c"	"`"	# <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN
+"\x042d"	"E`"	# <U042D> CYRILLIC CAPITAL LETTER E
+"\x042e"	"YU"	# <U042E> CYRILLIC CAPITAL LETTER YU
+"\x042f"	"YA"	# <U042F> CYRILLIC CAPITAL LETTER YA
+"\x0430"	"a"	# <U0430> CYRILLIC SMALL LETTER A
+"\x0431"	"b"	# <U0431> CYRILLIC SMALL LETTER BE
+"\x0432"	"v"	# <U0432> CYRILLIC SMALL LETTER VE
+"\x0433"	"g"	# <U0433> CYRILLIC SMALL LETTER GHE
+"\x0434"	"d"	# <U0434> CYRILLIC SMALL LETTER DE
+"\x0435"	"e"	# <U0435> CYRILLIC SMALL LETTER IE
+"\x0436"	"zh"	# <U0436> CYRILLIC SMALL LETTER ZHE
+"\x0437"	"z"	# <U0437> CYRILLIC SMALL LETTER ZE
+"\x0438"	"i"	# <U0438> CYRILLIC SMALL LETTER I
+"\x0439"	"j"	# <U0439> CYRILLIC SMALL LETTER SHORT I
+"\x043a"	"k"	# <U043A> CYRILLIC SMALL LETTER KA
+"\x043b"	"l"	# <U043B> CYRILLIC SMALL LETTER EL
+"\x043c"	"m"	# <U043C> CYRILLIC SMALL LETTER EM
+"\x043d"	"n"	# <U043D> CYRILLIC SMALL LETTER EN
+"\x043e"	"o"	# <U043E> CYRILLIC SMALL LETTER O
+"\x043f"	"p"	# <U043F> CYRILLIC SMALL LETTER PE
+"\x0440"	"r"	# <U0440> CYRILLIC SMALL LETTER ER
+"\x0441"	"s"	# <U0441> CYRILLIC SMALL LETTER ES
+"\x0442"	"t"	# <U0442> CYRILLIC SMALL LETTER TE
+"\x0443"	"u"	# <U0443> CYRILLIC SMALL LETTER U
+"\x0444"	"f"	# <U0444> CYRILLIC SMALL LETTER EF
+"\x0445"	"x"	# <U0445> CYRILLIC SMALL LETTER HA
+"\x0446"	"cz"	# <U0446> CYRILLIC SMALL LETTER TSE
+"\x0447"	"ch"	# <U0447> CYRILLIC SMALL LETTER CHE
+"\x0448"	"sh"	# <U0448> CYRILLIC SMALL LETTER SHA
+"\x0449"	"shh"	# <U0449> CYRILLIC SMALL LETTER SHCHA
+"\x044a"	"``"	# <U044A> CYRILLIC SMALL LETTER HARD SIGN
+"\x044b"	"y`"	# <U044B> CYRILLIC SMALL LETTER YERU
+"\x044c"	"`"	# <U044C> CYRILLIC SMALL LETTER SOFT SIGN
+"\x044d"	"e`"	# <U044D> CYRILLIC SMALL LETTER E
+"\x044e"	"yu"	# <U044E> CYRILLIC SMALL LETTER YU
+"\x044f"	"ya"	# <U044F> CYRILLIC SMALL LETTER YA
+"\x0451"	"yo"	# <U0451> CYRILLIC SMALL LETTER IO
+"\x0452"	"dj"	# <U0452> CYRILLIC SMALL LETTER DJE
+"\x0453"	"g`"	# <U0453> CYRILLIC SMALL LETTER GJE
+"\x0454"	"ye"	# <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE
+"\x0455"	"z`"	# <U0455> CYRILLIC SMALL LETTER DZE
+"\x0456"	"i"	# <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0457"	"yi"	# <U0457> CYRILLIC SMALL LETTER YI
+"\x0458"	"j"	# <U0458> CYRILLIC SMALL LETTER JE
+"\x0459"	"l`"	# <U0459> CYRILLIC SMALL LETTER LJE
+"\x045a"	"n`"	# <U045A> CYRILLIC SMALL LETTER NJE
+"\x045b"	"tsh"	# <U045B> CYRILLIC SMALL LETTER TSHE
+"\x045c"	"k`"	# <U045C> CYRILLIC SMALL LETTER KJE
+"\x045e"	"u`"	# <U045E> CYRILLIC SMALL LETTER SHORT U
+"\x045f"	"dh"	# <U045F> CYRILLIC SMALL LETTER DZHE
+"\x046a"	"O`"	# <U046A> CYRILLIC CAPITAL LETTER BIG YUS
+"\x046b"	"o`"	# <U046B> CYRILLIC SMALL LETTER BIG YUS
+"\x0472"	"FH"	# <U0472> CYRILLIC CAPITAL LETTER FITA
+"\x0473"	"fh"	# <U0473> CYRILLIC SMALL LETTER FITA
+"\x0474"	"YH"	# <U0474> CYRILLIC CAPITAL LETTER IZHITSA
+"\x0475"	"yh"	# <U0475> CYRILLIC SMALL LETTER IZHITSA
+"\x048c"	"E`"	# <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+"\x048d"	"e`"	# <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN
+"\x0490"	"G`"	# <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+"\x0491"	"g`"	# <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN
+"\x0492"	"GH"	# <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE
+"\x0493"	"gh"	# <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE
+"\x0494"	"GH"	# <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+"\x0495"	"gh"	# <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+"\x0496"	"ZH`"	# <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+"\x0497"	"zh`"	# <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+"\x049a"	"K`"	# <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+"\x049b"	"k`"	# <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER
+"\x049e"	"K`"	# <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE
+"\x049f"	"k`"	# <U049F> CYRILLIC SMALL LETTER KA WITH STROKE
+"\x04a2"	"N`"	# <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+"\x04a3"	"n`"	# <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER
+"\x04a4"	"NG"	# <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE
+"\x04a5"	"ng"	# <U04A5> CYRILLIC SMALL LIGATURE EN GHE
+"\x04a6"	"P`"	# <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+"\x04a7"	"p`"	# <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+"\x04a8"	"O`"	# <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA
+"\x04a9"	"o`"	# <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA
+"\x04aa"	"C`"	# <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+"\x04ab"	"c`"	# <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER
+"\x04ac"	"T`"	# <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+"\x04ad"	"t`"	# <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER
+"\x04ae"	"U"	# <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U
+"\x04af"	"u"	# <U04AF> CYRILLIC SMALL LETTER STRAIGHT U
+"\x04b2"	"H`"	# <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+"\x04b3"	"h`"	# <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER
+"\x04b4"	"TCZ"	# <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE
+"\x04b5"	"tcz"	# <U04B5> CYRILLIC SMALL LIGATURE TE TSE
+"\x04ba"	"SH`"	# <U04BA> CYRILLIC CAPITAL LETTER SHHA
+"\x04bb"	"sh`"	# <U04BB> CYRILLIC SMALL LETTER SHHA
+"\x04bc"	"CH`"	# <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+"\x04bd"	"ch`"	# <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE
+"\x04be"	"CH`"	# <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04bf"	"ch`"	# <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04c0"	"i"	# <U04C0> CYRILLIC LETTER PALOCHKA
+"\x04c1"	"ZH`"	# <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+"\x04c2"	"zh`"	# <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE
+"\x04cb"	"CH`"	# <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+"\x04cc"	"ch`"	# <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE
+"\x04d0"	"A`"	# <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE
+"\x04d1"	"a`"	# <U04D1> CYRILLIC SMALL LETTER A WITH BREVE
+"\x04d2"	"A`"	# <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+"\x04d3"	"a`"	# <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS
+"\x04d6"	"E`"	# <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE
+"\x04d7"	"e`"	# <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE
+"\x04d8"	"A`"	# <U04D8> CYRILLIC CAPITAL LETTER SCHWA
+"\x04d9"	"a`"	# <U04D9> CYRILLIC SMALL LETTER SCHWA
+"\x04dc"	"ZH`"	# <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+"\x04dd"	"zh`"	# <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+"\x04de"	"Z`"	# <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+"\x04df"	"z`"	# <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+"\x04e0"	"Z`"	# <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+"\x04e1"	"z`"	# <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE
+"\x04e4"	"I`"	# <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+"\x04e5"	"i`"	# <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS
+"\x04e6"	"O`"	# <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+"\x04e7"	"o`"	# <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS
+"\x04e8"	"O`"	# <U04E8> CYRILLIC CAPITAL LETTER BARRED O
+"\x04e9"	"o`"	# <U04E9> CYRILLIC SMALL LETTER BARRED O
+"\x04f0"	"U`"	# <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+"\x04f1"	"u`"	# <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS
+"\x04f2"	"U`"	# <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+"\x04f3"	"u`"	# <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+"\x04f4"	"CH`"	# <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+"\x04f5"	"ch`"	# <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+"\x04f8"	"Y`"	# <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+"\x04f9"	"y`"	# <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS
 "\x2002"	" "	# <U2002> EN SPACE
 "\x2003"	" "	# <U2003> EM SPACE
 "\x2004"	" "	# <U2004> THREE-PER-EM SPACE
-- 
2.17.1
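
A quick consistency check for a table of this shape can be sketched in
Python. The sketch assumes each row has the form
"\xNNNN"<TAB>"replacement"<TAB># comment, as in the hunk above. It is not
the generator glibc uses to process locale/C-translit.h.in, and parse_row
and case_mismatches are hypothetical helper names. It flags capital letters
whose small form maps to something other than the lowercased replacement,
which follows from the rule that capitals map to all-uppercase sequences.

```python
import re

# Matches rows such as:  "\x0416"<TAB>"ZH"<TAB># <U0416> ...
ROW_RE = re.compile(r'^"\\x([0-9a-fA-F]{4})"\s+"([^"]*)"')


def parse_row(line):
    """Return (character, replacement), or None if the line is not a row."""
    # Drop the leading '+' or ' ' that a unified diff puts before each line.
    m = ROW_RE.match(line.lstrip("+ "))
    if m is None:
        return None
    return chr(int(m.group(1), 16)), m.group(2)


def case_mismatches(table):
    """List capitals whose small form does not map to replacement.lower()."""
    bad = []
    for ch, repl in table.items():
        low = ch.lower()
        if ch != low and low in table and table[low] != repl.lower():
            bad.append(ch)
    return bad


rows = [
    '+"\\x0416"\t"ZH"\t# <U0416> CYRILLIC CAPITAL LETTER ZHE',
    '+"\\x0436"\t"zh"\t# <U0436> CYRILLIC SMALL LETTER ZHE',
    '+"\\x0407"\t"YI"\t# <U0407> CYRILLIC CAPITAL LETTER YI',
    '+"\\x0457"\t"yi"\t# <U0457> CYRILLIC SMALL LETTER YI',
]
table = dict(p for p in map(parse_row, rows) if p is not None)
print(case_mismatches(table))
# prints: []
```

Feeding the '+' lines of the whole hunk through parse_row in the same way
would check every capital/small pair at once.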



^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PING^4][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-03-19 10:39   ` ping " Egor Kobylkin
@ 2019-03-28 16:20     ` Marko Myllynen
  2019-04-04 19:44     ` [PING^5][PATCH " Egor Kobylkin
  2019-04-16  7:15     ` [PING^6][PATCH " Marko Myllynen
  2 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2019-03-28 16:20 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell,
	Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian

Ping?

On 19/03/2019 12.39, Egor Kobylkin wrote:
> Changelog v12:
> * Adjusted to the new comment style suddenly appearing in the target
> file locale/C-translit.h.in (the original file changed on the master
> branch from /* style to # style since v11)
> * Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to
> "sh`" instead of erroneous "SH`" in v11
> 
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
> 
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872
> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics
> 
> Changelog v9:
> * Fixed formatting (trailing spaces etc.)
> * Put commit summary in the patch file, now it is generated completely
> by git format-patch
> 
> Changelog v8:
> * Re-added missing translit_cyrillic in patch v7 (due to missing "git
> add" in the script).
> 
> Changelog v7:
> * Generated against git://sourceware.org/git/glibc.git master with git
> format-patch.
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)
> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)
> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> “copy "tr_TR"”.
> 
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
> * Consistently transliterate single uppercase Cyrillic letters
>    to sequences of all uppercase Latin letters in all languages (whenever
>    a Cyrillic letter is transliterated to more than one Latin letter),
>    for example "Ї" is now transliterated as "YI" rather than "Yi".
> 
> Dear locale maintainers,
> 
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> 
> add the Cyrillic transliteration rows to locale/C-translit.h.in.
> 
> The patch is attached.
> 
> 
> Current bug effect:
> 
> The glibc wiki explicitly lists this use case as the test example and
> currently it fails on Cyrillic texts [1] [8] [9]:
> 
> iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
> 
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> 
> - it produces a string of question marks and spaces.
> 
> This is what it should produce, and it does so after the patch is applied:
> 
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> chayu.
> 
> 
> The root problem and the fix:
> 
> The root problem is the missing transliteration table that I am
> supplying here.
> 
> 
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) from a
> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
> 
> Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
> compatible transcription.
> 
> While a UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
> a transliteration/transcription has only Latin/ASCII codes but can still
> be read by a native speaker. Among other things it is useful for
> processing the Cyrillic texts and filenames by programs or on systems
> that are not specifically prepared to work with Cyrillic, don't have
> corresponding fonts installed or can't handle UTF-8.
> 
> The patch content (mapping) is based on the ISO 9:1995 standard [10] and
> its derivative GOST 7.79-2000 System B official source (Federal Agency on
> Technical Regulating and Metrology of the Russian Federation [2]).
> Technically an independent but mostly identical source [3] was used and
> prepared in a spreadsheet [6].
> 
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B represents what is actually called transcription (preserving
> phonemes), while System A is the transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes converting Cyrillic to
> ASCII and thus System B is chosen [11]. To be clear, System A has
> nothing to do with this bug, regardless of it being a transliteration.
> 
> Those interested in implementing System A for transliteration of
> Cyrillic to Latin with Diacritic as a new feature are welcome to use the
> spreadsheet in [6] as a starting point.
> 
> Links:
> 
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> 
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
> 
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=11304
> [10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
> [11]
> https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
> 
> 
> Best regards,
> Egor Kobylkin
> 
> 
> 
> 


-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* [PING^5][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-03-19 10:39   ` ping " Egor Kobylkin
  2019-03-28 16:20     ` [PING^4][PATCH " Marko Myllynen
@ 2019-04-04 19:44     ` Egor Kobylkin
  2019-04-06  1:36       ` Siddhesh Poyarekar
  2019-04-16  7:15     ` [PING^6][PATCH " Marko Myllynen
  2 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-04-04 19:44 UTC (permalink / raw)
  To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
	Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian

Ping?


-- 
Marko Myllynen

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PING^5][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-04-04 19:44     ` [PING^5][PATCH " Egor Kobylkin
@ 2019-04-06  1:36       ` Siddhesh Poyarekar
  0 siblings, 0 replies; 111+ messages in thread
From: Siddhesh Poyarekar @ 2019-04-06  1:36 UTC (permalink / raw)
  To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales,
	Carlos O'Donell, Rafal Luzynski
  Cc: Mike Fabian

On 05/04/19 1:14 AM, Egor Kobylkin wrote:
> Ping?
> 

I'm committing to looking at this on Monday if nobody gets to it over
the weekend.

Siddhesh

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-01-02 18:38   ` [PATCH v12] " Egor Kobylkin
  2019-01-05 14:35     ` Rafal Luzynski
@ 2019-04-09  1:04     ` Carlos O'Donell
  1 sibling, 0 replies; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-09  1:04 UTC (permalink / raw)
  To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell,
	Siddhesh Poyarekar, Rafal Luzynski
  Cc: Marko Myllynen, mfabian

On 1/2/19 1:38 PM, Egor Kobylkin wrote:
> Changelog v12:
> * Adjusted to the new comment style suddenly appearing in the target file locale/C-translit.h.in (the original file changed on the master branch from /* style to # style since v11)
> * Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to "sh`" instead of erroneous "SH`" in v11

I have installed this patch and I'm testing some transliterations.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-03-19 10:39   ` ping " Egor Kobylkin
  2019-03-28 16:20     ` [PING^4][PATCH " Marko Myllynen
  2019-04-04 19:44     ` [PING^5][PATCH " Egor Kobylkin
@ 2019-04-16  7:15     ` Marko Myllynen
  2019-04-16 13:17       ` Carlos O'Donell
  2 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2019-04-16  7:15 UTC (permalink / raw)
  To: libc-alpha, libc-locales, Carlos O'Donell, Siddhesh Poyarekar,
	Rafal Luzynski
  Cc: Mike Fabian, Egor Kobylkin

Ping?

On 19/03/2019 12.39, Egor Kobylkin wrote:
> Changelog v12:
> * Adjusted to the new comment style suddenly appearing in the target
> file locale/C-translit.h.in (the original file changed on the master
> branch from /* style to # style since v11)
> * Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to
> "sh`" instead of erroneous "SH`" in v11
> 
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
> 
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872
> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics
> 
> Changelog v9:
> * Fixed formatting (trailing spaces etc.)
> * Put commit summary in the patch file, now it is generated completely
> by git format-patch
> 
> Changelog v8:
> * Re-added missing translit_cyrillic in patch v7 (due to missing "git
> add" in the script).
> 
> Changelog v7:
> * Generated against git://sourceware.org/git/glibc.git master with git
> format-patch.
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)
> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)
> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> “copy "tr_TR"”.
> 
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
> * Consistently transliterate single uppercase Cyrillic letters
>    to sequences of all uppercase Latin letters in all languages (whenever
>    a Cyrillic letter is transliterated to more than one Latin letter),
>    for example "Ї" is now transliterated as "YI" rather than "Yi".
> 
> Dear locale maintainers,
> 
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> 
> add the Cyrillic transliteration rows to locale/C-translit.h.in.
> 
> The patch is attached.
> 
> 
> Current bug effect:
> 
> The glibc wiki explicitly lists this use case as the test example and
> currently it fails on Cyrillic texts [1] [8] [9]:
> 
> iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
> 
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> 
> - it produces a string of question marks and spaces.
> 
> This is what it should produce, and it does so after the patch is applied:
> 
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> chayu.
> 
> 
> The root problem and the fix:
> 
> The root problem is the missing transliteration table that I am
> supplying here.
> 
> 
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) from
> UTF-8 encoded text based on the Cyrillic alphabet to ASCII//TRANSLIT text.
> 
> Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
> compatible transcription.
> 
> While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
> a transliteration/transcription contains only Latin/ASCII codes but can
> still be read by a native speaker. Among other things, it is useful for
> processing Cyrillic texts and filenames with programs or on systems
> that are not specifically prepared to work with Cyrillic, lack the
> corresponding fonts, or cannot handle UTF-8.
> 
> The patch content (mapping) is based on the ISO 9:1995 standard [10] and
> its derivative GOST 7.79-2000 System B official source (Federal Agency on
> Technical Regulating and Metrology of the Russian Federation [2]).
> Technically, an independent but mostly identical source [3] was used, and
> the table was prepared in a spreadsheet [6].
> 
> Converting Cyrillic to ASCII according to GOST 7.79-2000 System B is,
> strictly speaking, transcription (preserving phonemes), while System A
> is transliteration (preserving graphemes). There is no meaningful way
> to preserve graphemes when converting Cyrillic to ASCII, so System B
> was chosen [11]. To be clear, System A has nothing to do with this bug,
> even though it is a transliteration.
> 
> Those interested in implementing System A for transliteration of
> Cyrillic to Latin with Diacritic as a new feature are welcome to use the
> spreadsheet in [6] as a starting point.
> 
> Links:
> 
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> 
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
> 
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=11304
> [10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
> [11]
> https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
> 
> 
> Best regards,
> Egor Kobylkin
> 
> 
> 
> 
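
For readers following the quoted cover letter, the expected pangram line
can be reproduced with a small context-free table. This is only a sketch:
the table below is a hypothetical subset written for illustration, not the
table from the patch, and it assumes "ц" always maps to "cz" and that
uppercase letters map to all-uppercase Latin sequences (as described in the
v6 changelog). The Cyrillic pangram is assumed to be the CYRILLIC line of
translit-test-input.txt.

```python
# Hypothetical subset of a GOST 7.79 System B style table (illustration only;
# not copied from the patch). Keys are lowercase Cyrillic letters.
SYSTEM_B = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "yo", "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "cz",
    "ч": "ch", "ш": "sh", "щ": "shh", "ъ": "``", "ы": "y'",
    "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}


def to_ascii(text):
    """Transliterate text letter by letter, with no context rules."""
    out = []
    for ch in text:
        latin = SYSTEM_B.get(ch.lower())
        if latin is None:
            # Pass ASCII through; mark anything else as untransliterable.
            out.append(ch if ch.isascii() else "?")
        elif ch.isupper():
            # Assumption: uppercase maps to an all-uppercase sequence.
            out.append(latin.upper())
        else:
            out.append(latin)
    return "".join(out)


pangram = "Съешь ещё этих мягких французских булок, да выпей же чаю."
print(to_ascii(pangram))
# S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
```

With an empty table the same function would print only question marks for
the Cyrillic letters, which mirrors the failure described in the bug report.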


-- 
Marko Myllynen

* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-04-16  7:15     ` [PING^6][PATCH " Marko Myllynen
@ 2019-04-16 13:17       ` Carlos O'Donell
  2019-04-16 17:06         ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-16 13:17 UTC (permalink / raw)
  To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
	Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian, Egor Kobylkin

On 4/16/19 3:15 AM, Marko Myllynen wrote:
> Ping?

I have this patch applied locally and I'm working through some
comparisons for the transliteration.


-- 
Cheers,
Carlos.

* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-04-16 13:17       ` Carlos O'Donell
@ 2019-04-16 17:06         ` Egor Kobylkin
  2019-04-16 17:58           ` Carlos O'Donell
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-04-16 17:06 UTC (permalink / raw)
  To: Carlos O'Donell, Marko Myllynen, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian

Just FYI, this is what I was testing: ./testrun.sh /usr/bin/iconv -f UTF-8
-t ASCII//TRANSLIT <<< 
"ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ 
ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"

And this is the expected result ("" added by myself):
"YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU?FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu?fxczchshshh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e` 
G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`sh`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'"

Bests,
Egor Kobylkin


On 16.04.19 15:17, Carlos O'Donell wrote:
> On 4/16/19 3:15 AM, Marko Myllynen wrote:
>> Ping?
> 
> I have this patch applied locally and I'm working through some
> comparisons for the transliteration.
> 
> 

* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-04-16 17:06         ` Egor Kobylkin
@ 2019-04-16 17:58           ` Carlos O'Donell
  2019-04-16 18:41             ` Egor Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-16 17:58 UTC (permalink / raw)
  To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian

On 4/16/19 1:06 PM, Egor Kobylkin wrote:
> Just FYI, this what I was testing: ./testrun.sh /usr/bin/iconv -f UTF-8 -t ASCII//TRANSLIT <<< "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
> 
> And this is the expected result ("" added by myself):
> "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU?FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu?fxczchshshh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e` G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`sh`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'"

Thanks.

I was using CyrTranslit (a Python transliteration library) to review other work done in this area,
but it wasn't very fruitful.

$ python3
Python 3.7.3 (default, Mar 27 2019, 13:36:35)
[GCC 9.0.1 20190227 (Red Hat 9.0.1-0.8)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cyrtranslit
>>> cyrtranslit.supported()
dict_keys(['sr', 'me', 'mk', 'ru'])
>>> cyrtranslit.to_latin("ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’")
'ЁĐЃЄЅІЇJLjNjĆЌЎDžABVGDEŽZIЙKLMNOPRSTUÚFHCČŠЩЪЫЬЭЮЯabvgdežziйklmnoprstuúfhcčšщъыьэюяёđѓєѕіїjljnjćќўdžѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’'
>>> 

"ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
'ЁĐЃЄЅІЇJLjNjĆЌЎDžABVGDEŽZIЙKLMNOPRSTUÚFHCČŠЩЪЫЬЭЮЯabvgdežziйklmnoprstuúfhcčšщъыьэюяёđѓєѕіїjljnjćќўdžѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’'

Which doesn't give a good transliteration.

But the table is better:
https://github.com/opendatakosovo/cyrillic-transliteration/blob/master/cyrtranslit/mapping.py#L138-L155

Ё -> YO.

Which is a good cross-check for me.

-- 
Cheers,
Carlos.

* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-04-16 17:58           ` Carlos O'Donell
@ 2019-04-16 18:41             ` Egor Kobylkin
  2019-04-16 19:06               ` Carlos O'Donell
  0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-04-16 18:41 UTC (permalink / raw)
  To: Carlos O'Donell, Marko Myllynen, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian



On 16.04.19 19:58, Carlos O'Donell wrote:
> On 4/16/19 1:06 PM, Egor Kobylkin wrote:
>> Just FYI, this what I was testing: ./testrun.sh /usr/bin/iconv -f 
>> UTF-8 -t ASCII//TRANSLIT <<< 
>> "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ 
>> ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
>>
>> And this is the expected result ("" added by myself):
>> "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU?FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu?fxczchshshh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e` 
>> G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`sh`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'" 
>>
> 
> Thanks.
> 
> I was using CyrTranslit (python translater) to review other work done in 
> this area,
> but it wasn't very fruitful.
> 
> $ python3
> Python 3.7.3 (default, Mar 27 2019, 13:36:35)
> [GCC 9.0.1 20190227 (Red Hat 9.0.1-0.8)] on linux
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import cyrtranslit
>>>> cyrtranslit.supported()
> dict_keys(['sr', 'me', 'mk', 'ru'])
>>>> cyrtranslit.to_latin("ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’") 
>>>>
> 'ЁĐЃЄЅІЇJLjNjĆЌЎDžABVGDEŽZIЙKLMNOPRSTUÚFHCČŠЩЪЫЬЭЮЯabvgdežziйklmnoprstuúfhcčšщъыьэюяёđѓєѕіїjljnjćќўdžѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’' 
> 
>>>>
> 
> "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’" 
> 
> 'ЁĐЃЄЅІЇJLjNjĆЌЎDžABVGDEŽZIЙKLMNOPRSTUÚFHCČŠЩЪЫЬЭЮЯabvgdežziйklmnoprstuúfhcčšщъыьэюяёđѓєѕіїjljnjćќўdžѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’' 
> 
> 
> Which doesn't give a good transliteration.

I guess the reason is that it uses the first key from your list, 'sr',
which stands for Serbian, and Serbian doesn't have the characters that
were left untransliterated ("Щ", for example).

> But the table is better:
> https://github.com/opendatakosovo/cyrillic-transliteration/blob/master/cyrtranslit/mapping.py#L138-L155 
> 
> 
> Ё -> YO.
> 
> Which is a good cross-check for me.

Yet the closest table in that codebase should be this one:
https://github.com/opendatakosovo/cyrillic-transliteration/blob/master/cyrtranslit/mapping.py#L88

It is exactly the reason we had 12 iterations on this patch - we wanted 
to cover the most complete yet workable standard for the table. What we 
reference in the bug memo is the actual accepted standard. It is 
coalesced with the extended standard for further outdated cyrillic letters.

Bests,
Egor Kobylkin




* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-04-16 18:41             ` Egor Kobylkin
@ 2019-04-16 19:06               ` Carlos O'Donell
  2019-05-10 12:19                 ` Marko Myllynen
  0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-16 19:06 UTC (permalink / raw)
  To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian

On 4/16/19 2:41 PM, Egor Kobylkin wrote:
> It is exactly the reason we had 12 iterations on this patch - we
> wanted to cover the most complete yet workable standard for the
> table. What we reference in the bug memo is the actual accepted
> standard. It is coalesced with the extended standard for further
> outdated cyrillic letters.

I agree, and this is what makes review complicated and time
consuming. I'm relying on you as the expert, and my goal is only
to spot check for any inconsistencies.

-- 
Cheers,
Carlos.

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
  2019-02-14 16:48                 ` Marko Myllynen
  2019-03-04 22:11                   ` Egor Kobylkin
@ 2019-04-19 22:24                   ` Rafal Luzynski
       [not found]                     ` <5ELixS9SQ0DW4mlvswp96ASpLobBabU9KQ6zOTH-Udrb34mABhcqiPERpBZfPWZ9F77s8XNmiLIAq9UWu0AjLFFdjOz_FZVU5_xF-SiQkrw=@kobylkin.com>
  1 sibling, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2019-04-19 22:24 UTC (permalink / raw)
  To: Marko Myllynen, Egor Kobylkin, libc-alpha, libc-locales,
	Carlos O'Donell
  Cc: Siddhesh Poyarekar, Mike Fabian

Thank you, Siddhesh and Carlos, for your involvement in testing this
patch, and I apologize to Egor, Marko, and everyone else who needs this
patch pushed for my limited involvement.  I'd like to reply to
this email from Marko because it summarizes all the issues.  I also hope
to explain the problems that got me stuck.

14.02.2019 17:48 Marko Myllynen <myllynen@redhat.com> wrote:
> [...]
> 1) Built-in C locale doesn't read/use any translit_* files and it can't
> have any fallback mechanisms and it only supports ASCII so using GOST
> 7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
> be the appropriate way to implement Cyrillic transliteration for the
> built-in C locale (it adds some 8KB to the binary).

This sounds like a good idea.

Also, C locale is probably a good way to enforce the plain ASCII
transliteration without any fallback.

> 2) Other locales read/use translit_* files and with them fallbacks and
> non-ASCII are possible so it would seem preferable to first try ISO 9 /
> GOST 7.79 System A

OK, we agree here.

> and only if that fails then use GOST 7.79 System B
> (in which case the end result should match with the built-in C locale).

This is impossible because of the following case.  System A transliterates
the Cyrillic "Х" to Latin "H", while System B transliterates it to Latin "X".
Transliteration as implemented in glibc supports only a simple per-character
fallback (the names here are placeholders): transliterate the letter "X" to
"YY", but if "YY" is not available then to "ZZ".  It can't support the more
complex rule which we need here: transliterate "X" to "YY", but if "Q" cannot
be transliterated to "RR" then transliterate "X" to "ZZ".  In our case we
would like to transliterate "Х" to "X" if "Ш" cannot be transliterated to
"Š".  The only thing we can implement is a fallback transliteration which is
similar to System B but not 100% compatible.

This problem does not arise if we implement only System B in the C locale,
because we already know that "Š" is unavailable, so we always have to
transliterate "Х" to "X".
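
The limitation can be illustrated with a small sketch. It is not glibc
code: the two-entry table, the function names, and the use of Python codecs
to stand in for the target charset are all assumptions made for this
example. Each letter has an ordered candidate list (System A first, System
B second), and the first candidate that the target charset can encode wins,
independently of every other letter.

```python
# Hypothetical table: first candidate follows System A, second System B.
TABLE = {
    "Х": ["H", "X"],
    "Ш": ["Š", "SH"],
}


def encodable(text, charset):
    """Return True if text can be represented in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Per-character fallback: pick the first candidate that encodes."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)
            continue
        for candidate in TABLE.get(ch, []):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            out.append("?")
    return "".join(out)


print(translit("ХШ", "iso-8859-2"))  # HŠ   (pure System A)
print(translit("ХШ", "ascii"))       # HSH  (a mix; System B would give XSH)
```

In the ASCII case "Х" still becomes "H", because "H" is always encodable
and the choice for "Х" cannot depend on whether "Š" was available for "Ш".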

> For this the translit_cyrillic file should be added (as per patch v9 +
> changes mentioned in patches v10 and v12).
> 
> 3) Individual locale files can then be updated to use translit_cyrillic
> as appropriate (see patch v9) and language/national specific conventions
> (e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.

Sometimes I wonder whether any locale other than one for a language which
uses the Cyrillic script would really want a Cyrillic transliteration,
but on the other hand, why not.

Also I'd like to reiterate other disagreements which we have here:

1. How to handle upper/lower case in System B?  Should we transliterate
   "Ш" to "SH" or "Sh"?  Should we maybe implement a smart context based
   casing algorithm first?  I mean the algorithm which would detect if
   an uppercase letter appears as the first letter of otherwise lowercase
   word so should be transliterated as "Sh", or maybe it's in a context
   of a fully uppercase word so should be transliterated as "SH".
   I think that uconv implements this algorithm.
2. How to handle ambiguous transliterations like "Схема" -> "Shema"
   vs. "Шема" -> "Shema"? "SHema"?
3. How to handle the characters which are proper letters in Cyrillic
   and have an upper and lower case like a hard and soft sign but are
   transliterated to punctuation characters (grave accent "`")?
   Should we transliterate upper and lower case to the same character
   or should we mark them somehow?  uconv adds Unicode combining low
   line to the grave accent (so the output is "`̲") if the original
   Cyrillic character was uppercase.  But this is unavailable if
   our target charset is ASCII.
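
As a rough illustration of the context-based casing idea in point 1, here
is a minimal sketch. It is not how glibc or uconv work: the rule used
(title case when the next character is a lowercase letter, all uppercase
otherwise) and the tiny table are assumptions chosen only for this example.

```python
# Hypothetical lowercase table covering only the letters used below.
LOWER = {"ш": "sh", "щ": "shh", "е": "e", "м": "m", "а": "a"}


def translit_cased(text):
    """Transliterate with a one-character lookahead to choose the case."""
    out = []
    for i, ch in enumerate(text):
        low = ch.lower()
        if low not in LOWER:
            out.append(ch)
            continue
        latin = LOWER[low]
        if ch == low:
            out.append(latin)               # lowercase input stays lowercase
        elif i + 1 < len(text) and text[i + 1].islower():
            out.append(latin.capitalize())  # "Ш" before lowercase -> "Sh"
        else:
            out.append(latin.upper())       # otherwise all uppercase -> "SH"
    return "".join(out)


print(translit_cased("Шема"))  # Shema
print(translit_cased("ШЕМА"))  # SHEMA
```

A table-driven mechanism that maps one character at a time, without
looking at its neighbours, cannot express this rule, which is why the
question of "SH" versus "Sh" has to be settled in the table itself.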

Regarding the test cases which I mentioned the other day, I discussed
this with Dmitry and he convinced me that requiring them sets
the bar too high, so I agree we don't need to require them yet.

Regards,

Rafal

* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
       [not found]                     ` <5ELixS9SQ0DW4mlvswp96ASpLobBabU9KQ6zOTH-Udrb34mABhcqiPERpBZfPWZ9F77s8XNmiLIAq9UWu0AjLFFdjOz_FZVU5_xF-SiQkrw=@kobylkin.com>
@ 2019-04-27  2:51                       ` Siddhesh Poyarekar
  2019-04-27  7:34                         ` Diego (Egor) Kobylkin
  0 siblings, 1 reply; 111+ messages in thread
From: Siddhesh Poyarekar @ 2019-04-27  2:51 UTC (permalink / raw)
  To: Diego (Egor) Kobylkin, Rafal Luzynski, Marko Myllynen, libc-alpha,
	libc-locales, Carlos O'Donell
  Cc: Mike Fabian, Dmitry V. Levin

On 27/04/19 4:19 AM, Diego (Egor) Kobylkin wrote:
> Dear all, 
> I think Rafal is making good points again. And  the best thing is that
> we actually seem to have full consensus from everyone involved about
> current limited ASCII patch V12 (GOST 7.79 System B in
> locale/C-translit.h.in).  
> So let’s just for the time being concentrate on getting this committed? 
> 
> We can get to further issues in the next release and having a base to
> start with will make them much clearer by the contrast of what’s already
> in. 
> 
> Please let me know if you see any entanglement between the V12 patch
> content and other issues listed below. I believe Carlos can test the
> patch in isolation and hopefully have it approved for the next release. 

Please put it as a release blocker:

https://sourceware.org/glibc/wiki/Release/2.30

Siddhesh


* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
  2019-04-27  2:51                       ` Siddhesh Poyarekar
@ 2019-04-27  7:34                         ` Diego (Egor) Kobylkin
  0 siblings, 0 replies; 111+ messages in thread
From: Diego (Egor) Kobylkin @ 2019-04-27  7:34 UTC (permalink / raw)
  To: Siddhesh Poyarekar
  Cc: Rafal Luzynski, Marko Myllynen, libc-alpha@sourceware.org,
	libc-locales@sourceware.org, Carlos O'Donell, Mike Fabian,
	Dmitry V. Levin

Thanks, Siddhesh, it's in.

Bests,
Egor Kobylkin

P.S. Just for the historians: I noticed that my message quoted below didn't reach the lists because it was in HTML format. But I believe everyone involved received it directly.


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, April 27, 2019 4:51 AM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:

> On 27/04/19 4:19 AM, Diego (Egor) Kobylkin wrote:

> > current limited ASCII patch V12 (GOST 7.79 System B in
> > locale/C-translit.h.in).  
> > So let’s just for the time being concentrate on getting this committed?

>
> Please put it as a release blocker:
>
> https://sourceware.org/glibc/wiki/Release/2.30
>
> Siddhesh



* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
  2019-04-16 19:06               ` Carlos O'Donell
@ 2019-05-10 12:19                 ` Marko Myllynen
  0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2019-05-10 12:19 UTC (permalink / raw)
  To: Carlos O'Donell, Egor Kobylkin, libc-alpha, libc-locales,
	Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
  Cc: Mike Fabian

Hi Carlos,

On 16/04/2019 22.06, Carlos O'Donell wrote:
> On 4/16/19 2:41 PM, Egor Kobylkin wrote:
>> It is exactly the reason we had 12 iterations on this patch - we
>> wanted to cover the most complete yet workable standard for the
>> table. What we reference in the bug memo is the actual accepted
>> standard. It is coalesced with the extended standard for further
>> outdated cyrillic letters.
> 
> I agree, and this is what makes review complicated and time
> consuming. I'm relying on you as the expert, and my goal is only
> to spot check for any inconsistencies.

I know you've been very busy with everything else, but did you happen to
have a chance to check this further? Shall we still wait for your
results, or how would you suggest we proceed?

Thanks,

-- 
Marko Myllynen

end of thread, other threads:[~2019-05-10 12:19 UTC | newest]

Thread overview: 111+ messages
     [not found] <41532e13-a63d-5df1-ab37-05eb4d6c8d0a@kobylkin.com>
     [not found] ` <20180412224352.GB2911@altlinux.org>
2018-07-17 19:34   ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
2018-07-17 19:40     ` Carlos O'Donell
2018-07-17 19:50       ` Egor Kobylkin
2018-07-17 19:59         ` Carlos O'Donell
2018-08-06 19:00   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29 Egor Kobylkin
2018-10-03  8:26     ` Egor Kobylkin
2018-10-03  9:19       ` Keld Simonsen
2018-10-03  9:32         ` Egor Kobylkin
2018-10-05  8:43           ` Marko Myllynen
2018-10-05  9:20           ` Rafal Luzynski
2018-10-05 10:36             ` Egor Kobylkin
2018-10-08 22:04               ` Rafal Luzynski
2018-10-08 22:52                 ` Egor Kobylkin
2018-10-09 21:43                   ` Rafal Luzynski
2018-10-08 23:20                 ` Zack Weinberg
2018-10-09 15:26                   ` Carlos O'Donell
2018-10-09 21:51                     ` Rafal Luzynski
2018-10-09 16:10                 ` Marko Myllynen
2018-10-09 16:22                   ` Egor Kobylkin
2018-10-09 16:49                     ` Marko Myllynen
2018-10-09 22:08                   ` Rafal Luzynski
2018-10-10 11:21                     ` Marko Myllynen
2018-10-11 10:10                   ` Marko Myllynen
     [not found]             ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
2018-10-05 11:54               ` Marko Myllynen
2018-10-05 12:00                 ` Egor Kobylkin
2018-10-05 12:21                   ` Marko Myllynen
2018-10-05 20:47                     ` Egor Kobylkin
2018-10-08 12:40                       ` Marko Myllynen
2018-10-08 22:23                         ` Rafal Luzynski
2018-10-08 23:35                           ` Egor Kobylkin
2018-10-09 13:18                             ` Egor Kobylkin
2018-10-09 18:34                               ` Egor Kobylkin
2018-10-09 22:17                                 ` Rafal Luzynski
2018-10-09 22:40                                   ` Egor Kobylkin
2018-10-09 22:42                                     ` Egor Kobylkin
2018-10-10 11:22                                       ` Marko Myllynen
2018-10-10 12:19                                         ` Egor Kobylkin
2018-10-10 12:34                                           ` Marko Myllynen
2018-10-10 22:29   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
2018-10-11  9:59     ` Marko Myllynen
2018-10-11 11:04     ` Rafal Luzynski
2018-10-11 13:10       ` Marko Myllynen
2018-10-11 13:50       ` Volodymyr Lisivka
2018-10-11 14:59       ` Egor Kobylkin
2018-10-11 21:30         ` Egor Kobylkin
2018-10-11 15:05       ` Egor Kobylkin
2018-10-11 15:44   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3 Egor Kobylkin
2018-10-11 21:33   ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4 Egor Kobylkin
2018-10-12 14:05   ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
2018-10-13  0:59     ` Rafal Luzynski
2018-10-13 16:58       ` Egor Kobylkin
2018-10-15 11:04         ` Marko Myllynen
2018-10-15 11:54           ` Egor Kobylkin
2018-10-23 23:08         ` Rafal Luzynski
2018-10-17 14:16   ` [PATCH v6] " Egor Kobylkin
2018-11-01 22:51   ` [PATCH v7] " Egor Kobylkin
2018-11-02  0:00   ` [PATCH v8] " Egor Kobylkin
2018-11-02 22:22     ` Rafal Luzynski
2018-11-02 23:27       ` Egor Kobylkin
2018-11-14 21:25   ` [PATCH v9] " Egor Kobylkin
2018-11-16 22:17     ` Rafal Luzynski
2018-11-17 18:34       ` Egor Kobylkin
2018-11-19  7:13         ` Marko Myllynen
2018-11-19  9:21           ` Egor Kobylkin
2018-11-19 19:35             ` Marko Myllynen
2018-12-01 22:07           ` Rafal Luzynski
2018-12-01 22:53             ` Egor Kobylkin
2018-12-03 22:19             ` Egor Kobylkin
2018-12-08  1:15               ` Rafal Luzynski
2018-12-10 21:20                 ` Marko Myllynen
2018-12-19 22:25                   ` Rafal Luzynski
2018-12-19 22:48                     ` Egor Kobylkin
2018-12-19 23:50                       ` Rafal Luzynski
2018-11-19 11:10   ` [PATCH v10] " Egor Kobylkin
2018-12-07 23:35     ` Rafal Luzynski
2018-12-08 21:51       ` Egor Kobylkin
2018-12-19 22:41         ` Rafal Luzynski
2018-12-19 23:02           ` Egor Kobylkin
2018-12-20  0:05             ` Rafal Luzynski
2018-12-08 22:28   ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
2018-12-19 23:16     ` Egor Kobylkin
2018-12-26 10:07       ` Siddhesh Poyarekar
2018-12-26 12:13         ` Egor Kobylkin
2018-12-27  1:30           ` Siddhesh Poyarekar
2018-12-27 11:28             ` Rafal Luzynski
2019-01-02 18:38   ` [PATCH v12] " Egor Kobylkin
2019-01-05 14:35     ` Rafal Luzynski
2019-01-05 21:12       ` Egor Kobylkin
2019-01-07 20:37         ` Marko Myllynen
2019-01-09  0:46           ` Egor Kobylkin
2019-01-09 20:03             ` Marko Myllynen
2019-02-04  7:14               ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30 Egor Kobylkin
2019-02-14 16:48                 ` Marko Myllynen
2019-03-04 22:11                   ` Egor Kobylkin
2019-03-11 13:59                     ` PING " Egor Kobylkin
2019-03-14 19:48                       ` Egor Kobylkin
2019-04-19 22:24                   ` Rafal Luzynski
     [not found]                     ` <5ELixS9SQ0DW4mlvswp96ASpLobBabU9KQ6zOTH-Udrb34mABhcqiPERpBZfPWZ9F77s8XNmiLIAq9UWu0AjLFFdjOz_FZVU5_xF-SiQkrw=@kobylkin.com>
2019-04-27  2:51                       ` Siddhesh Poyarekar
2019-04-27  7:34                         ` Diego (Egor) Kobylkin
2019-04-09  1:04     ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] Carlos O'Donell
2019-03-19 10:39   ` ping " Egor Kobylkin
2019-03-28 16:20     ` [PING^4][PATCH " Marko Myllynen
2019-04-04 19:44     ` [PING^5][PATCH " Egor Kobylkin
2019-04-06  1:36       ` Siddhesh Poyarekar
2019-04-16  7:15     ` [PING^6][PATCH " Marko Myllynen
2019-04-16 13:17       ` Carlos O'Donell
2019-04-16 17:06         ` Egor Kobylkin
2019-04-16 17:58           ` Carlos O'Donell
2019-04-16 18:41             ` Egor Kobylkin
2019-04-16 19:06               ` Carlos O'Donell
2019-05-10 12:19                 ` Marko Myllynen
