From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 7730B1F453
	for ; Fri, 1 Feb 2019 07:51:03 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH] linkify: support proto://hostname without trailing slash
Date: Fri, 1 Feb 2019 07:51:03 +0000
Message-Id: <20190201075103.23547-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

Sometimes users will write "http://example.com" without the
trailing slash, which every browser and tool I've tested seems to
understand.
---
 lib/PublicInbox/Linkify.pm | 5 +++--
 t/linkify.t                | 9 +++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Linkify.pm b/lib/PublicInbox/Linkify.pm
index 274f382..aa472cd 100644
--- a/lib/PublicInbox/Linkify.pm
+++ b/lib/PublicInbox/Linkify.pm
@@ -16,11 +16,12 @@ use Digest::SHA qw/sha1_hex/;
 
 my $SALT = rand;
 my $LINK_RE = qr{(\()?\b((?:ftps?|https?|nntps?|gopher)://
-		 [\@:\w\.-]+/
+		 [\@:\w\.-]+(?:/
 		 (?:[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%]*)
 		 (?:\?[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%]+)?
 		 (?:\#[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%\?]+)?
-		 )}xi;
+		 )?
+		)}xi;
 
 sub new { bless {}, $_[0] }
 
diff --git a/t/linkify.t b/t/linkify.t
index a55ed22..f0b3a6d 100644
--- a/t/linkify.t
+++ b/t/linkify.t
@@ -14,6 +14,15 @@ use PublicInbox::Linkify;
 	is($s, qq($u.), 'trailing period not in URL');
 }
 
+{
+	my $l = PublicInbox::Linkify->new;
+	my $u = 'http://i-forgot-trailing-slash.example.com';
+	my $s = $u;
+	$s = $l->linkify_1($s);
+	$s = $l->linkify_2($s);
+	is($s, qq($u), 'missing trailing slash OK');
+}
+
 # handle URLs in parenthesized statements
 {
 	my $l = PublicInbox::Linkify->new;
-- 
EW
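
The effect of the regex change can be checked outside of PublicInbox::Linkify
with a small standalone script. This is only a sketch: the two patterns are
copied from the removed and added lines of the diff, while the names $OLD_RE,
$NEW_RE and first_url() are hypothetical helpers introduced here for
illustration and are not part of public-inbox. It looks at the raw match only
and does not reproduce what linkify_1 or linkify_2 do with it afterwards.

```perl
use strict;
use warnings;

# Pattern before the patch: a "/" is mandatory after the host part.
my $OLD_RE = qr{(\()?\b((?:ftps?|https?|nntps?|gopher)://
		 [\@:\w\.-]+/
		 (?:[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%]*)
		 (?:\?[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%]+)?
		 (?:\#[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%\?]+)?
		 )}xi;

# Pattern after the patch: "/", path, query and fragment are wrapped
# in one optional (?:/ ... )? group, so a bare proto://hostname matches.
my $NEW_RE = qr{(\()?\b((?:ftps?|https?|nntps?|gopher)://
		 [\@:\w\.-]+(?:/
		 (?:[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%]*)
		 (?:\?[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%]+)?
		 (?:\#[a-z0-9\-\._~!\$\&\';\(\)\*\+,;=:@/%\?]+)?
		 )?
		)}xi;

# Hypothetical helper: return the URL captured in $2, or undef if the
# pattern does not match anywhere in $text.
sub first_url {
	my ($re, $text) = @_;
	return $text =~ $re ? $2 : undef;
}

for my $text ('http://example.com',
		'see http://example.com/path?q=1#frag now') {
	my $old = first_url($OLD_RE, $text) // '(no match)';
	my $new = first_url($NEW_RE, $text) // '(no match)';
	print "input: $text\n  old: $old\n  new: $new\n";
}
```

URLs that already carry a path match the same way under both patterns; only
the bare proto://hostname form changes from no match to a match.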