From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH v2] search: use boolean prefixes for git blob queries
Date: Fri, 20 Jul 2018 06:16:12 +0000
Message-ID: <20180720061612.j4s3gugasle2r4iz@whir>
In-Reply-To: <20180716040734.30104-1-e@80x24.org>

I've hit some cases where probabilistic searches don't work with
the dfpre:/dfpost:/dfblob: search prefixes because stemming in
the query parser interferes.

In any case, our indexing code indexes longer/unabbreviated blob
names down to their 7-character abbreviations, so there should be
no need to do wildcard searches on git blob names.
---
 lib/PublicInbox/Search.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
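
Not for the commit message: a minimal standalone sketch of what the
qp() hunk below does with a space-separated prefix list.  StubQP is a
hypothetical stand-in that only records calls; it is not part of
Search::Xapian or public-inbox.  The point is that "dfblob" ends up
registered once per term prefix, which, as far as I understand the
Xapian documentation, makes the query parser OR the two prefixes
together for that field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the query parser (assumption, not a real
# Search::Xapian class): it records each (field, prefix) registration.
package StubQP;
sub new {
	my ($class) = @_;
	return bless({ bool => [] }, $class);
}
sub add_boolean_prefix {
	my ($self, $name, $prefix) = @_;
	push @{$self->{bool}}, "$name=$prefix";
}

package main;

# Same shape as %bool_pfx_external after this patch
my %bool_pfx_external = (
	mid => 'Q',
	dfpre => 'XDFPRE',
	dfpost => 'XDFPOST',
	dfblob => 'XDFPRE XDFPOST',
);

my $qp = StubQP->new;
# sorted here only to make the printed order stable
foreach my $name (sort keys %bool_pfx_external) {
	my $prefix = $bool_pfx_external{$name};
	# one registration per term prefix in the space-separated list
	$qp->add_boolean_prefix($name, $_) foreach split(/ /, $prefix);
}

print "$_\n" foreach @{$qp->{bool}};
```

Without the split, "dfblob" would be registered against the single
literal prefix "XDFPRE XDFPOST", which no indexed term carries.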

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 69eca9f..090d998 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -50,6 +50,9 @@ use constant {
 
 my %bool_pfx_external = (
 	mid => 'Q', # Message-ID (full/exact), this is mostly uniQue
+	dfpre => 'XDFPRE',
+	dfpost => 'XDFPOST',
+	dfblob => 'XDFPRE XDFPOST',
 );
 
 my $non_quoted_body = 'XNQ XDFN XDFA XDFB XDFHH XDFCTX XDFPRE XDFPOST';
@@ -74,9 +77,6 @@ my %prob_prefix = (
 	dfb => 'XDFB',
 	dfhh => 'XDFHH',
 	dfctx => 'XDFCTX',
-	dfpre => 'XDFPRE',
-	dfpost => 'XDFPOST',
-	dfblob => 'XDFPRE XDFPOST',
 
 	# default:
 	'' => 'XM S A XQUOT XFN ' . $non_quoted_body,
@@ -266,7 +266,7 @@ sub qp {
 		Search::Xapian::NumberValueRangeProcessor->new(DT, 'dt:'));
 
 	while (my ($name, $prefix) = each %bool_pfx_external) {
-		$qp->add_boolean_prefix($name, $prefix);
+		$qp->add_boolean_prefix($name, $_) foreach split(/ /, $prefix);
 	}
 
 	# we do not actually create AltId objects,
-- 
EW

Thread overview: 2+ messages
2018-07-16  4:07 [RFC] search: use boolean prefixes for git blob queries Eric Wong
2018-07-20  6:16 ` Eric Wong [this message]

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
