From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 30F391F97F
	for ; Tue, 4 Jun 2019 11:27:49 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 04/24] mid: id_compress requires ASCII-clean words
Date: Tue, 4 Jun 2019 11:27:28 +0000
Message-Id: <20190604112748.23598-5-e@80x24.org>
In-Reply-To: <20190604112748.23598-1-e@80x24.org>
References: <20190604112748.23598-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

Its result is used for HTML anchors and such.
---
 lib/PublicInbox/MID.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/MID.pm b/lib/PublicInbox/MID.pm
index 7f1ab15..6904d61 100644
--- a/lib/PublicInbox/MID.pm
+++ b/lib/PublicInbox/MID.pm
@@ -26,11 +26,11 @@ sub mid_clean {
 	$mid;
 }
 
-# this is idempotent
+# this is idempotent, used for HTML anchor/ids and such
 sub id_compress {
 	my ($id, $force) = @_;
 
-	if ($force || $id =~ /[^\w\-]/ || length($id) > MID_MAX) {
+	if ($force || $id =~ /[^a-zA-Z0-9_\-]/ || length($id) > MID_MAX) {
 		utf8::encode($id);
 		return sha1_hex($id);
 	}
-- 
EW
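
A rough illustration of why the character class changed, not part of the patch: on a string holding code points above 255, Perl's `\w` follows Unicode rules, so non-ASCII word characters would pass the old check and end up in an anchor unhashed. The sketch below makes some assumptions: `id_compress_sketch` is a hypothetical stand-in name, `MID_MAX => 40` is a placeholder for the constant defined elsewhere in MID.pm, and the trailing `$id;` is guessed since the hunk stops before the end of the sub.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Assumption: placeholder for the MID_MAX constant defined elsewhere in MID.pm
use constant MID_MAX => 40;

# Hypothetical copy of the patched sub; the final "$id;" is assumed,
# the hunk ends before the end of the function.
sub id_compress_sketch {
	my ($id, $force) = @_;

	if ($force || $id =~ /[^a-zA-Z0-9_\-]/ || length($id) > MID_MAX) {
		utf8::encode($id); # sha1_hex wants bytes, not wide characters
		return sha1_hex($id);
	}
	$id;
}

my $wide = "\x{3b1}\x{3b2}"; # Greek alpha + beta: word characters, not ASCII

# The old class sees nothing to reject in the non-ASCII string ...
my $old_rejects = ($wide =~ /[^\w\-]/) ? 1 : 0;
# ... while the explicit ASCII class catches it.
my $new_rejects = ($wide =~ /[^a-zA-Z0-9_\-]/) ? 1 : 0;

my $plain  = id_compress_sketch('abc-123_x'); # ASCII-clean, passed through
my $hashed = id_compress_sketch($wide);       # non-ASCII, replaced by SHA-1 hex
my $again  = id_compress_sketch($hashed);     # hashing a hash changes nothing

print "old class rejects: $old_rejects, new class rejects: $new_rejects\n";
print "plain: $plain\n";
print "hashed is 40 hex chars: ", ($hashed =~ /\A[0-9a-f]{40}\z/ ? 1 : 0), "\n";
print "idempotent: ", ($again eq $hashed ? 1 : 0), "\n";
```

With the placeholder limit of 40, a SHA-1 hex digest is itself ASCII-clean and not over the limit, which is what keeps the sub idempotent as the comment in the patch says.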