From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id C2A4C1F453
	for ; Wed, 30 Jan 2019 10:35:36 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH] solvergit: don't confuse Xapian with ".." in filenames
Date: Wed, 30 Jan 2019 10:35:36 +0000
Message-Id: <20190130103536.1648-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

Xapian will interpret ".." as ranges, even quoted phrases.
So break up words on ".." since punctuation (AFAIK) is not
searchable, anyways.
---
 lib/PublicInbox/SolverGit.pm | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/SolverGit.pm b/lib/PublicInbox/SolverGit.pm
index 59d2c93..c502526 100644
--- a/lib/PublicInbox/SolverGit.pm
+++ b/lib/PublicInbox/SolverGit.pm
@@ -170,6 +170,13 @@ sub extract_diff ($$$$$) {
 
 sub path_searchable ($) { defined($_[0]) && $_[0] =~ m!\A[\w/\. \-]+\z! }
 
+# ".." appears in path names, which confuses Xapian into treating
+# it as a range query.  So we split on ".." since Xapian breaks
+# on punctuation anyways:
+sub filename_query ($) {
+	join('', map { qq( dfn:"$_") } split(/\.\./, $_[0]));
+}
+
 sub find_extract_diff ($$$) {
 	my ($self, $ibx, $want) = @_;
 	my $srch = $ibx->search or return;
@@ -187,11 +194,11 @@ sub find_extract_diff ($$$) {
 
 	my $path_b = $want->{path_b};
 	if (path_searchable($path_b)) {
-		$q .= qq{ dfn:"$path_b"};
+		$q .= filename_query($path_b);
 
 		my $path_a = $want->{path_a};
 		if (path_searchable($path_a) && $path_a ne $path_b) {
-			$q .= qq{ dfn:"$path_a"};
+			$q .= filename_query($path_a);
 		}
 	}
 
-- 
EW
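
To illustrate what filename_query from the patch above produces, here is
a minimal standalone sketch. The sub body is copied from the patch. The
sample paths are hypothetical and chosen only for illustration. Nothing
here talks to Xapian; it only shows the query fragment that would be
appended to $q.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copied from the patch: emit one dfn:"..." phrase per piece of the
# path, splitting wherever ".." occurs so Xapian never sees "..".
sub filename_query ($) {
	join('', map { qq( dfn:"$_") } split(/\.\./, $_[0]));
}

# Hypothetical sample paths (assumptions, not taken from the patch).
for my $path ('lib/PublicInbox/SolverGit.pm', 't/a..b/c.txt') {
	printf "%-30s =>%s\n", $path, filename_query($path);
}
```

A path without ".." yields a single dfn: phrase, the same as before the
patch. A path such as t/a..b/c.txt yields two phrases, dfn:"t/a" and
dfn:"b/c.txt", which the query parser combines under its default
operator.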