From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-2.9 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	URIBL_BLOCKED shortcircuit=no autolearn=unavailable version=3.3.2
X-Original-To: meta@public-inbox.org
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 9DDA9200EE;
	Sat, 5 Sep 2015 09:14:57 +0000 (UTC)
Date: Sat, 5 Sep 2015 09:14:57 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: Re: [PATCH 6/6] extmsg: fall back to partial Message-ID matching
Message-ID: <20150905091457.GA27857@dcvr.yhbt.net>
References: <1441443668-21092-1-git-send-email-e@80x24.org>
 <1441443668-21092-7-git-send-email-e@80x24.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1441443668-21092-7-git-send-email-e@80x24.org>
List-Id:

Eric Wong wrote:
> In case a URL gets truncated (as is common with long URLs),
> we can rely on Xapian for partial matches and bring the user
> to their destination.

Note: this is a bit half-assed and does not work when a Message-ID
is broken into multiple terms (which is common).

Perhaps a prefix trie is necessary, but a good on-disk one does
not seem to readily exist in Debian (or anywhere) for Perl?

Oh well, this is a rare feature.

diff --git a/lib/PublicInbox/ExtMsg.pm b/lib/PublicInbox/ExtMsg.pm
index 243b6ba..77537c2 100644
--- a/lib/PublicInbox/ExtMsg.pm
+++ b/lib/PublicInbox/ExtMsg.pm
@@ -95,6 +95,8 @@ sub ext_msg {
 	unshift @pfx, { srch => $ctx->{srch}, url => $url };
 	foreach my $pfx (@pfx) {
 		my $srch = delete $pfx->{srch} or next;
+
+		# FIXME we may need a proper prefix trie here...
 		if (my $res = $srch->mid_prefix($mid)) {
 			$n_partial += scalar(@$res);
 			$pfx->{res} = $res;