Date: Thu, 18 Apr 2024 11:41:15 +0000 (UTC)
To: ruby-core@ml.ruby-lang.org
Subject: [ruby-core:117596] [Ruby master Misc#20434] Deprecate encoding-related regular expression modifiers
From: "Eregon (Benoit Daloze) via ruby-core"
Cc: "Eregon (Benoit Daloze)"

Issue #20434 has been updated by Eregon (Benoit Daloze).

This seems like a good simplification to me; I think the semantics of these encoding modifiers are confusing to most Rubyists.

I wouldn't worry too much about the length of the replacement, because `/.../s` and `/.../e` are likely very rare (using the file encoding seems a good replacement for those).
`/.../u` seems redundant with the default source encoding, so the `u` can likely just be removed in most cases.
I'm not so sure about `/.../n`; that may be more frequent.

Methods to convert an existing Regexp from one encoding to another feel suboptimal, because that will create an extra Regexp instance, and creating a Regexp is not cheap due to many checks, allocations, and even compilation (AFAIK eager in CRuby at least).
So I think the existing `Regexp.new("a".dup.force_encoding(Encoding::WINDOWS_31J))` is good enough. And since this would address a deprecation, it seems very important that the code also works on older Ruby versions.
(I'm all for `String#{encoded,with_encoding}`, but it seems best to propose that as a separate ticket.)

I would be interested in having a good textual description in #20406 of how the encoding of a Regexp is currently computed. It seems quite complex, but having it in text would make it easier to reason about.
Maybe we could simplify it while remaining compatible (i.e. the specific value of `Regexp#encoding` matters less; what matters is that a Regexp can still be matched against Strings of various encodings as it could before).

----------------------------------------
Misc #20434: Deprecate encoding-related regular expression modifiers
https://bugs.ruby-lang.org/issues/20434#change-108001

* Author: kddnewton (Kevin Newton)
* Status: Open
----------------------------------------
This is a follow-up to @duerst's comment here: https://bugs.ruby-lang.org/issues/20406#note-6. As noted in the other issue, there are many encodings that factor into how a regular expression operates. These include:

* The encoding of the file
* The encoding of the string parts within the regular expression
* The regular expression encoding modifiers
* The encoding of the string being matched

At the time the modifiers were introduced, I believe the modifiers may have been the only (??) encoding that factored in here. At this point, however, they can lead to quite a bit of confusion, as noted in the other ticket.

I would like to propose deprecating the regular expression encoding modifiers. The warning could suggest creating the regular expression from an encoded string instead. For example, when we find:

```ruby
/\x81\x40/s
```

we would instead suggest:

```ruby
::Regexp.new(::String.new("\x81\x40", encoding: "Windows-31J"))
```

or equivalent. As a migration path, we could do the following:

1. Emit a warning to change to the suggested expression
2. Change the compiler to compile to the suggested expression when those flags are found
3. Remove support for the flags

Step 2 may be unnecessary depending on how long a timeline we would like to provide. To be clear, I'm not advocating for any particular timeline, and would be fine with this spanning multiple years/versions to give people plenty of time to migrate. But I do think this would be a good change to eliminate confusion about the interaction between the four different encodings at play.
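
For illustration only, here is a minimal sketch of the replacement discussed in this thread, comparing the `/s` modifier with a Regexp built from an explicitly encoded String. The helper name `regexp_in_encoding` is hypothetical (it is not part of Ruby and is not proposed in the ticket); it just wraps the `dup.force_encoding` form mentioned above, which should also run on older Ruby versions.

```ruby
# Hypothetical helper, not part of Ruby: build a Regexp whose encoding comes
# from an explicitly encoded source String rather than from a modifier.
def regexp_in_encoding(source, encoding)
  # dup first so that a frozen literal is not mutated by force_encoding
  Regexp.new(source.dup.force_encoding(encoding))
end

# Legacy form: the `s` modifier fixes the Regexp encoding to Windows-31J.
legacy = /\x81\x40/s

# Replacement form: the encoding is taken from the source String.
sjis_regexp = regexp_in_encoding("\x81\x40", Encoding::WINDOWS_31J)

# A Windows-31J String to match against (bytes 0x81 0x40).
sjis_target = "\x81\x40".dup.force_encoding(Encoding::WINDOWS_31J)

puts legacy.encoding
puts sjis_regexp.encoding
puts legacy.match?(sjis_target)
puts sjis_regexp.match?(sjis_target)
```

Note that the two Regexp objects are not `==` to each other, because their `source` strings differ (escaped text in the literal, raw bytes in the String form); the sketch only compares their encoding and matching behavior.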