bug-gnulib@gnu.org mirror (unofficial)
* Update to Unicode 15.1.0
@ 2024-01-30 22:19 Bruno Haible
From: Bruno Haible @ 2024-01-30 22:19 UTC (permalink / raw)
  To: bug-gnulib

This patch updates Gnulib to Unicode 15.1.0.

It is too large for the mailing list; please see commit
a93e0da1d9fb13b6340ebd73eb36841bda0ebc0d instead.


2024-01-30  Bruno Haible  <bruno@clisp.org>

	Update to Unicode 15.1.0.

	* lib/gen-uni-tables.c (PROP_SENTENCE_TERMINAL): Renamed from
	PROP_STERM.
	(PROP_IDS_UNARY_OPERATOR, PROP_ID_COMPAT_MATH_CONTINUE,
	PROP_ID_COMPAT_MATH_START): New enum items.
	(UC_INDIC_CONJUNCT_BREAK_*): New enum items.
	(unicode_indic_conjunct_break): New variable.
	(fill_properties): Rename local variable propvalue to propcode. Handle
	the properties IDS_Unary_Operator, ID_Compat_Math_Continue,
	ID_Compat_Math_Start. Parse the InCB values from file
	DerivedCoreProperties.txt.
	(indic_conjunct_break_as_c_identifier,
	output_indic_conjunct_break_test): New functions.
	(indic_conjunct_break_table): New variable.
	(output_indic_conjunct_break): New function.
	(fill_width): Accept spaces at the end of field0 and at the start and
	end of field1.
	(LBP_QU1, LBP_QU2, LBP_QU3): New enum items, for Unicode TR #14 rules
	(LB15a) and (LB15b).
	(LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF): New enum items, for Brahmic
	scripts.
	(get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
	(debug_output_lbp): Print either LBP_QU1 or LBP_QU2 or LBP_QU3 as
	LBP_QU. Handle LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
	(fill_org_lbp): Accept spaces at the end of field0 and at the start and
	end of field1. Recognize LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
	(debug_output_org_lbp): Handle LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
	(lbp_value_to_string): Handle LBP_QU1, LBP_QU2, LBP_QU3 instead of
	LBP_QU. Handle LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
	(output_lbrk_rules_as_tables): Treat LBP_QU as macro that maps to three
	table rows/columns. Replace rule (LB15) with rules (LB15b) and (LB15a).
	(get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
	(main): Invoke output_indic_conjunct_break_test and
	output_indic_conjunct_break.
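
The whitespace tolerance noted above for fill_width and fill_org_lbp can be
illustrated with a small parser. This is only a sketch, not the code in
gen-uni-tables.c: the function name parse_range_line, its signature, and the
16-byte value buffer are assumptions made for illustration. It reads a line of
the form "START[..END] ; VALUE" and accepts spaces or tabs around both fields.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from gen-uni-tables.c.
   Parses "START[..END] ; VALUE [# comment]", where START and END are
   hexadecimal code points.  Spaces and tabs are accepted at the end of
   the first field and at the start and end of the second field.
   Returns true on success.  */
static bool
parse_range_line (const char *line, unsigned int *start, unsigned int *end,
                  char value[16])
{
  int consumed = 0;

  /* First try a range "START..END", then fall back to a single code point.  */
  if (sscanf (line, " %x..%x%n", start, end, &consumed) != 2)
    {
      if (sscanf (line, " %x%n", start, &consumed) != 1)
        return false;
      *end = *start;
    }

  const char *p = line + consumed;
  p += strspn (p, " \t");          /* spaces at the end of field 0 */
  if (*p != ';')
    return false;
  p++;
  p += strspn (p, " \t");          /* spaces at the start of field 1 */

  /* Field 1 ends at whitespace, a comment, or the end of the line.  */
  size_t len = strcspn (p, " \t#\r\n");
  if (len == 0 || len >= 16)
    return false;
  memcpy (value, p, len);
  value[len] = '\0';
  return true;
}
```

With this sketch, "0041..005A;AL" and "3000           ; W  # Zs" both parse,
the latter as a single code point with value "W".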

	* All generated files under lib/uni* and tests/uni*: Regenerate.
	* tests/uniname/NameAliases.txt: Update.
	* tests/uniname/UnicodeData.txt: Update.
	* tests/uninorm/NormalizationTest.txt: Update.
	* tests/unigbrk/GraphemeBreakTest.txt: Update.
	* tests/uniwbrk/WordBreakTest.txt: Update.

	* lib/unilbrk/lbrktables.h (LBP_QU1, LBP_QU2, LBP_QU3): New enum items,
	for Unicode TR #14 rules (LB15a) and (LB15b).
	(LBP_QU): Remove enum item.
	(LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF): New enum items, for Brahmic
	scripts.
	(unilbrk_table): Update array bounds.
	* lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop):
	Conditionally replace LBP_QU2 with LBP_QU1, for rule (LB15a).
	Conditionally replace LBP_QU3 with LBP_QU1, for rule (LB15b).
	* lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop):
	Likewise.
	* lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop):
	Likewise.

	* lib/unictype.in.h (UC_INDIC_CONJUNCT_BREAK_*): New enum values.
	(uc_indic_conjunct_break_name, uc_indic_conjunct_break_byname,
	uc_indic_conjunct_break): New declarations.
	* lib/unictype/incb_byname.c: New file.
	* lib/unictype/incb_byname.gperf: New file.
	* lib/unictype/incb_name.c: New file.
	* lib/unictype/incb_name.h: New file.
	* lib/unictype/incb_of.c: New file.
	* lib/unictype/incb_of.h: New generated file.
	* modules/unictype/incb-all: New file.
	* modules/unictype/incb-byname: New file.
	* modules/unictype/incb-name: New file.
	* modules/unictype/incb-of: New file.
	* tests/unictype/test-incb_byname.c: New file.
	* tests/unictype/test-incb_name.c: New file.
	* tests/unictype/test-incb_of.c: New file.
	* tests/unictype/test-incb_of.h: New generated file.
	* modules/unictype/incb-byname-tests: New file.
	* modules/unictype/incb-name-tests: New file.
	* modules/unictype/incb-of-tests: New file.

	* lib/unigbrk.in.h (uc_is_grapheme_break, u*_grapheme_next,
	u*_grapheme_prev): Add comments.
	* lib/unigbrk/u-grapheme-breaks.h (FUNC): Add local variables
	incb_consonant_extended, incb_consonant_extended_linker,
	incb_consonant_extended_linker_extended. Implement rule (GB9c).
	* modules/unigbrk/u8-grapheme-breaks (Depends-on): Add unictype/incb-of.
	* modules/unigbrk/u16-grapheme-breaks (Depends-on): Likewise.
	* modules/unigbrk/u32-grapheme-breaks (Depends-on): Likewise.
	* modules/unigbrk/uc-grapheme-breaks (Depends-on): Likewise.
	* tests/unigbrk/test-uc-is-grapheme-break.c (main): Add local variables
	incb_consonant_extended, incb_consonant_extended_linker,
	incb_consonant_extended_linker_extended. Skip test cases that match rule
	(GB9c).
	* modules/unigbrk/uc-is-grapheme-break-tests (Depends-on): Add
	unictype/incb-of.
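
For readers unfamiliar with rule (GB9c) of Unicode TR #29: it forbids a
grapheme cluster break before an InCB=Consonant character when the preceding
characters are a Consonant followed by a run of Extend and Linker characters
that contains at least one Linker. The following is a minimal sketch of how
three state flags can track that, assuming a hypothetical enum in place of the
UC_INDIC_CONJUNCT_BREAK_* values and a hypothetical function gb9c_no_break.
The flag names are borrowed from the entry above, but the update logic here is
an illustration, not a copy of lib/unigbrk/u-grapheme-breaks.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the UC_INDIC_CONJUNCT_BREAK_* values.  */
enum incb { INCB_NONE, INCB_CONSONANT, INCB_LINKER, INCB_EXTEND };

/* Sets no_break[i] to true when rule (GB9c) forbids a break immediately
   before props[i].  Other grapheme break rules are not considered.  */
static void
gb9c_no_break (const enum incb *props, size_t n, bool *no_break)
{
  /* Seen: Consonant (Extend|Linker)*  */
  bool incb_consonant_extended = false;
  /* Seen: Consonant (Extend|Linker)* Linker, ending at the last character  */
  bool incb_consonant_extended_linker = false;
  /* Seen: Consonant (Extend|Linker)* Linker (Extend|Linker)*  */
  bool incb_consonant_extended_linker_extended = false;

  for (size_t i = 0; i < n; i++)
    {
      enum incb p = props[i];
      bool extend_or_linker = (p == INCB_EXTEND || p == INCB_LINKER);

      no_break[i] =
        incb_consonant_extended_linker_extended && p == INCB_CONSONANT;

      /* Each flag is computed from the state before this character, so
         incb_consonant_extended must be updated last.  */
      incb_consonant_extended_linker =
        incb_consonant_extended && p == INCB_LINKER;
      incb_consonant_extended_linker_extended =
        incb_consonant_extended_linker
        || (incb_consonant_extended_linker_extended && extend_or_linker);
      incb_consonant_extended =
        p == INCB_CONSONANT
        || (incb_consonant_extended && extend_or_linker);
    }
}
```

For the sequence Consonant, Linker, Consonant, only the last element is marked
as "no break"; for Consonant, Extend, Consonant, nothing is marked, because no
Linker occurs between the two consonants.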

	* All the affected modules: Bump required libunistring version.





