Date: Mon, 12 Dec 2022 05:05:23 +0000 (UTC)
From: "hsbt (Hiroshi SHIBATA)"
To: ruby-core@ml.ruby-lang.org
Subject: [ruby-core:111259] [Ruby master Bug#19007] Unicode tables differences from Unicode.org 14.0 data

Issue #19007 has been updated by hsbt (Hiroshi SHIBATA).

Thanks. I created the [3.3 milestone](https://bugs.ruby-lang.org/versions/71).

----------------------------------------
Bug #19007: Unicode tables differences from Unicode.org 14.0 data
https://bugs.ruby-lang.org/issues/19007#change-100569

* Author: nobu (Nobuyoshi Nakada)
* Status: Open
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* ruby -v: 3.2.0 6898984f1cd
* Backport: 2.7: DONTNEED, 3.0: DONTNEED, 3.1: DONTNEED
----------------------------------------
I found that the header in the Unicode Emoji 14.0 data files had changed slightly (and again in 15.0), but `enc/unicode/case-folding.rb` had not followed the change.
After I fixed it and rebuilt the headers under `enc/unicode/14.0.0`, `name2ctype.h` had differences from master, as shown below.

`CR_Lower`, `CR_Cased` and `CR_Other_Lowercase` just seem to have been missed in the previous update, and are not a problem.

But U+11720..U+11721 in `CR_Grapheme_Cluster_Break_SpacingMark` is absent in the original Unicode.org data.
According to @naruse's investigation, it was removed in the commit [Update to Unicode 14.0.0], while U+11720 is still SpacingMark in the latest https://www.unicode.org/reports/tr29/.

[Update to Unicode 14.0.0]: https://github.com/latex3/unicode-data/commit/5570040ac8a30e2c2ca4912d415ecaa0498fa23a#diff-1e957b94de10ea96d32a338c005b1f05788af458cf335fc92683bc297e53ed94L582

```diff
diff --git a/enc/unicode/14.0.0/name2ctype.h b/enc/unicode/14.0.0/name2ctype.h
index 99a3eeca190..f49e5cd7273 100644
--- a/enc/unicode/14.0.0/name2ctype.h
+++ b/enc/unicode/14.0.0/name2ctype.h
@@ -1565,7 +1565,7 @@ static const OnigCodePoint CR_Graph[] = {
 
 /* 'Lower': [[:Lower:]] */
 static const OnigCodePoint CR_Lower[] = {
-  664,
+  668,
   0x0061, 0x007a,
   0x00aa, 0x00aa,
   0x00b5, 0x00b5,
@@ -2196,6 +2196,10 @@ static const OnigCodePoint CR_Lower[] = {
   0x105a3, 0x105b1,
   0x105b3, 0x105b9,
   0x105bb, 0x105bc,
+  0x10780, 0x10780,
+  0x10783, 0x10785,
+  0x10787, 0x107b0,
+  0x107b2, 0x107ba,
   0x10cc0, 0x10cf2,
   0x118c0, 0x118df,
   0x16e60, 0x16e7f,
@@ -12651,7 +12655,7 @@ static const OnigCodePoint CR_Math[] = {
 
 /* 'Cased': Derived Property */
 static const OnigCodePoint CR_Cased[] = {
-  151,
+  155,
   0x0041, 0x005a,
   0x0061, 0x007a,
   0x00aa, 0x00aa,
@@ -12763,6 +12767,10 @@ static const OnigCodePoint CR_Cased[] = {
   0x105a3, 0x105b1,
   0x105b3, 0x105b9,
   0x105bb, 0x105bc,
+  0x10780, 0x10780,
+  0x10783, 0x10785,
+  0x10787, 0x107b0,
+  0x107b2, 0x107ba,
   0x10c80, 0x10cb2,
   0x10cc0, 0x10cf2,
   0x118a0, 0x118df,
@@ -22615,7 +22623,7 @@ static const OnigCodePoint CR_Extender[] = {
 
 /* 'Other_Lowercase': Binary Property */
 static const OnigCodePoint CR_Other_Lowercase[] = {
-  20,
+  24,
   0x00aa, 0x00aa,
   0x00ba, 0x00ba,
   0x02b0, 0x02b8,
@@ -22636,6 +22644,10 @@ static const OnigCodePoint CR_Other_Lowercase[] = {
   0xa770, 0xa770,
   0xa7f8, 0xa7f9,
   0xab5c, 0xab5f,
+  0x10780, 0x10780,
+  0x10783, 0x10785,
+  0x10787, 0x107b0,
+  0x107b2, 0x107ba,
 }; /* CR_Other_Lowercase */
 
 /* 'Other_Uppercase': Binary Property */
@@ -37049,7 +37061,7 @@ static const OnigCodePoint CR_Grapheme_Cluster_Break_Extend[] = {
 
 /* 'Grapheme_Cluster_Break_SpacingMark': Grapheme_Cluster_Break=SpacingMark */
 static const OnigCodePoint CR_Grapheme_Cluster_Break_SpacingMark[] = {
-  161,
+  160,
   0x0903, 0x0903,
   0x093b, 0x093b,
   0x093e, 0x0940,
@@ -37183,7 +37195,6 @@ static const OnigCodePoint CR_Grapheme_Cluster_Break_SpacingMark[] = {
   0x116ac, 0x116ac,
   0x116ae, 0x116af,
   0x116b6, 0x116b6,
-  0x11720, 0x11721,
   0x11726, 0x11726,
   0x1182c, 0x1182e,
   0x11838, 0x11838,
```
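
A note on reading the diff above: each `CR_*` array starts with the number of ranges, followed by inclusive first/last code point pairs. That is why adding four ranges to `CR_Other_Lowercase` changes its leading count from 20 to 24, and removing one range from `CR_Grapheme_Cluster_Break_SpacingMark` changes 161 to 160. The following is only a minimal Ruby sketch of that layout, not the generator in `enc/unicode/case-folding.rb`. The helper names `flatten_table`, `count_consistent?` and `covers?` are hypothetical, and the table is a tiny excerpt built from ranges visible in the diff.

```ruby
# Hypothetical miniature of CR_Other_Lowercase: only the last three context
# ranges shown in the diff, not the real 20-range table.
OLD_TAIL = [
  [0xa770, 0xa770],
  [0xa7f8, 0xa7f9],
  [0xab5c, 0xab5f],
].freeze

# The four ranges that the rebuilt header adds, copied from the diff.
ADDED = [
  [0x10780, 0x10780],
  [0x10783, 0x10785],
  [0x10787, 0x107b0],
  [0x107b2, 0x107ba],
].freeze

# Build the flat layout used in name2ctype.h:
# [count, first, last, first, last, ...]
def flatten_table(ranges)
  [ranges.size, *ranges.flatten]
end

# The leading count must equal the number of pairs that follow it.
def count_consistent?(table)
  table[0] * 2 == table.size - 1
end

# Binary search over the sorted pairs for a range containing code point cp.
def covers?(table, cp)
  pairs = table.drop(1).each_slice(2).to_a
  hit = pairs.bsearch do |first, last|
    if cp < first
      -1 # the wanted range, if any, lies before this pair
    elsif cp > last
      1  # the wanted range, if any, lies after this pair
    else
      0  # cp falls inside this pair
    end
  end
  !hit.nil?
end

old_table = flatten_table(OLD_TAIL)
new_table = flatten_table(OLD_TAIL + ADDED)

puts format("count delta: %d", new_table[0] - old_table[0])
puts format("U+10784 covered before: %s, after: %s",
            covers?(old_table, 0x10784), covers?(new_table, 0x10784))
```

The same check applied to a real table would flag a header whose count line was edited without its ranges, or the reverse, which is the kind of mismatch the diff corrects.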