From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id AFA7B1F6C1
	for ; Sun, 14 Aug 2016 10:34:00 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH] mid: no wide characters for sha1_hex
Date: Sun, 14 Aug 2016 10:34:00 +0000
Message-Id: <20160814103400.17779-1-e@80x24.org>
List-Id:

Apparently there are some really screwed up In-Reply-To fields
out there.
---
 lib/PublicInbox/MID.pm | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/MID.pm b/lib/PublicInbox/MID.pm
index 78952b9..bb40cc7 100644
--- a/lib/PublicInbox/MID.pm
+++ b/lib/PublicInbox/MID.pm
@@ -25,6 +25,7 @@ sub id_compress {
 	my ($id, $force) = @_;
 
 	if ($force || $id =~ /[^\w\-]/ || length($id) > MID_MAX) {
+		utf8::encode($id);
 		return sha1_hex($id);
 	}
 	$id;
@@ -36,7 +37,9 @@ sub mid2path {
 
 	unless (defined $x38) {
 		# compatibility with old links (or short Message-IDs :)
-		$mid = sha1_hex(mid_clean($mid));
+		$mid = mid_clean($mid);
+		utf8::encode($mid);
+		$mid = sha1_hex($mid);
 		($x2, $x38) = ($mid =~ /\A([a-f0-9]{2})([a-f0-9]{38})\z/);
 	}
 	"$x2/$x38";
-- 
EW
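
For illustration only, here is a minimal sketch of the failure mode the
patch guards against. It assumes sha1_hex comes from Digest::SHA (the
import is not visible in the hunks above). The Message-ID value and the
helper name safe_sha1_hex are hypothetical and not part of
PublicInbox::MID.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical helper mirroring the patched code path: convert the
# string to UTF-8 octets before hashing. $s is a copy of the argument,
# so the in-place utf8::encode does not modify the caller's string.
sub safe_sha1_hex {
	my ($s) = @_;
	utf8::encode($s);
	return sha1_hex($s);
}

# Hypothetical Message-ID containing a character above U+00FF, as
# might come out of a badly mangled In-Reply-To field.
my $mid = "caf\x{263a}\@example.com";

# Digest functions operate on octets, so the raw string is expected to
# make sha1_hex die with a "Wide character" error.
my $ok = eval { sha1_hex($mid); 1 };
print "without utf8::encode: ", ($ok ? "ok\n" : "died: $@");

# After encoding, the same input hashes to 40 hex digits.
print "with utf8::encode: ", safe_sha1_hex($mid), "\n";
```

For ASCII-only Message-IDs, utf8::encode leaves the octets unchanged,
so the extra call should not alter the digests such IDs produce.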