From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 04/26] search: reenable phrase search on non-chert Xapian
Date: Thu, 23 May 2019 09:36:42 +0000
Message-ID: <20190523093704.18367-5-e@80x24.org>
In-Reply-To: <20190523093704.18367-1-e@80x24.org>

This is assuming nobody uses flint or earlier, anymore;
as flint predates the existence of this project.
---
 lib/PublicInbox/Search.pm | 48 +++++++++++++++++++++++----------------
 t/search.t                |  1 +
 2 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index eae10d8..d861cf4 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -24,8 +24,8 @@ sub load_xapian () {
 
 	# n.b. FLAG_PURE_NOT is expensive not suitable for a public
 	# website as it could become a denial-of-service vector
-	# FLAG_PHRASE also seems to cause performance problems
-	# sometimes.
+	# FLAG_PHRASE also seems to cause performance problems chert
+	# (and probably earlier Xapian DBs). glass seems fine...
 	# TODO: make this an option, maybe?
 	# or make indexlevel=medium as default
 	FLAG_PHRASE()|FLAG_BOOLEAN()|FLAG_LOVEHATE()|FLAG_WILDCARD();
@@ -137,26 +137,35 @@ sub xdir ($;$) {
 	}
 }
 
+sub _xdb ($) {
+	my ($self) = @_;
+	my $dir = xdir($self, 1);
+	my ($xdb, $slow_phrase);
+	my $qpf = \($self->{qp_flags} ||= $QP_FLAGS);
+	if ($self->{version} >= 2) {
+		foreach my $part (<$dir/*>) {
+			-d $part && $part =~ m!/\d+\z! or next;
+			my $sub = Search::Xapian::Database->new($part);
+			if ($xdb) {
+				$xdb->add_database($sub);
+			} else {
+				$xdb = $sub;
+			}
+			$slow_phrase ||= -f "$part/iamchert";
+		}
+	} else {
+		$slow_phrase = -f "$dir/iamchert";
+		$xdb = Search::Xapian::Database->new($dir);
+	}
+	$$qpf |= FLAG_PHRASE() unless $slow_phrase;
+	$xdb;
+}
+
 sub xdb ($) {
 	my ($self) = @_;
 	$self->{xdb} ||= do {
 		load_xapian();
-		my $dir = xdir($self, 1);
-		if ($self->{version} >= 2) {
-			my $xdb;
-			foreach my $part (<$dir/*>) {
-				-d $part && $part =~ m!/\d+\z! or next;
-				my $sub = Search::Xapian::Database->new($part);
-				if ($xdb) {
-					$xdb->add_database($sub);
-				} else {
-					$xdb = $sub;
-				}
-			}
-			$xdb;
-		} else {
-			Search::Xapian::Database->new($dir);
-		}
+		_xdb($self);
 	};
 }
 
@@ -194,7 +203,8 @@ sub query {
 		$self->{over_ro}->recent($opts);
 	} else {
 		my $qp = qp($self);
-		my $query = $qp->parse_query($query_string, $QP_FLAGS);
+		my $qp_flags = $self->{qp_flags};
+		my $query = $qp->parse_query($query_string, $qp_flags);
 		$opts->{relevance} = 1 unless exists $opts->{relevance};
 		_do_enquire($self, $query, $opts);
 	}
diff --git a/t/search.t b/t/search.t
index c063620..538baef 100644
--- a/t/search.t
+++ b/t/search.t
@@ -30,6 +30,7 @@ my $ro = PublicInbox::Search->new($git_dir);
 my $rw_commit = sub {
 	$rw->commit_txn_lazy if $rw;
 	$rw = PublicInbox::SearchIdx->new($git_dir, 1);
+	$rw->{qp_flags} = 0; # quiet a warning
 	$rw->begin_txn_lazy;
 };
 
-- 
EW
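For readers following along without a Xapian install, the detection idea in _xdb above can be tried in isolation. This is only a rough sketch and not part of the patch: qp_flags_for is a hypothetical helper, the FLAG_* constants are stand-in integers rather than the ones exported by Search::Xapian, and the base flag set is assumed to start without FLAG_PHRASE. It checks each shard directory for an "iamchert" file and only turns the phrase bit on when none is found.

```perl
#!/usr/bin/perl
# Illustrative sketch only; not part of the patch.
use strict;
use warnings;
use File::Temp qw(tempdir);

# Stand-in bit values (assumption); real code would use the
# FLAG_* constants exported by Search::Xapian.
use constant {
	FLAG_BOOLEAN => 1,
	FLAG_PHRASE => 2,
	FLAG_LOVEHATE => 4,
	FLAG_WILDCARD => 16,
};

# Hypothetical helper: compute query parser flags for a list of
# shard directories.  FLAG_PHRASE is added only when no shard
# contains an "iamchert" marker file.
sub qp_flags_for {
	my (@shards) = @_;
	my $flags = FLAG_BOOLEAN | FLAG_LOVEHATE | FLAG_WILDCARD;
	my $slow_phrase = grep { -f "$_/iamchert" } @shards;
	$flags |= FLAG_PHRASE unless $slow_phrase;
	$flags;
}

# Two throwaway directories: one empty, one marked as chert.
my $plain = tempdir(CLEANUP => 1);
my $chert = tempdir(CLEANUP => 1);
open(my $fh, '>', "$chert/iamchert") or die "open: $!";
close($fh) or die "close: $!";

printf "no chert shard: phrase=%d\n",
	(qp_flags_for($plain) & FLAG_PHRASE) ? 1 : 0;
printf "one chert shard: phrase=%d\n",
	(qp_flags_for($plain, $chert) & FLAG_PHRASE) ? 1 : 0;
```

As in the patch, a single chert shard is enough to leave phrase search off for the whole inbox, since all shards are queried through one combined database handle.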
Thread overview: 28+ messages
2019-05-23  9:36 [PATCH 00/26] xcpdb: ease Xapian DB format migrations Eric Wong
2019-05-23  9:36 ` [PATCH 01/26] t/convert-compact: skip on missing xapian-compact(1) Eric Wong
2019-05-23  9:36 ` [PATCH 02/26] v1writable: retire in favor of InboxWritable Eric Wong
2019-05-23  9:36 ` [PATCH 03/26] doc: document the reason for --no-renumber Eric Wong
2019-05-23  9:36 ` Eric Wong [this message]
2019-05-23  9:36 ` [PATCH 05/26] xapcmd: new module for wrapping Xapian commands Eric Wong
2019-05-23  9:36 ` [PATCH 06/26] admin: hoist out resolve_inboxes for -compact and -index Eric Wong
2019-05-23  9:36 ` [PATCH 07/26] xapcmd: support spawn options Eric Wong
2019-05-23  9:36 ` [PATCH 08/26] xcpdb: new tool which wraps Xapian's copydatabase(1) Eric Wong
2019-05-23  9:36 ` [PATCH 09/26] xapcmd: do not cleanup on errors Eric Wong
2019-05-23  9:36 ` [PATCH 10/26] admin: move index_inbox over Eric Wong
2019-05-23  9:36 ` [PATCH 11/26] xcpdb: implement using Perl bindings Eric Wong
2019-05-23  9:36 ` [PATCH 12/26] xapcmd: xcpdb supports compaction Eric Wong
2019-05-23  9:36 ` [PATCH 13/26] v2writable: hoist out log_range sub for readability Eric Wong
2019-05-23  9:36 ` [PATCH 14/26] xcpdb: use fine-grained locking Eric Wong
2019-05-23  9:36 ` [PATCH 15/26] xcpdb: implement progress reporting Eric Wong
2019-05-23  9:36 ` [PATCH 16/26] xcpdb: cleanup error handling and diagnosis Eric Wong
2019-05-23  9:36 ` [PATCH 17/26] xapcmd: avoid EXDEV when finalizing changes Eric Wong
2019-05-23  9:36 ` [PATCH 18/26] doc: xcpdb: update to reflect the current state Eric Wong
2019-05-23  9:36 ` [PATCH 19/26] xapcmd: use "print STDERR" for progress reporting Eric Wong
2019-05-23  9:36 ` [PATCH 20/26] xcpdb: show re-indexing progress Eric Wong
2019-05-23  9:36 ` [PATCH 21/26] xcpdb: remove temporary directories on aborts Eric Wong
2019-05-23  9:37 ` [PATCH 22/26] compact: reuse infrastructure from xcpdb Eric Wong
2019-05-23  9:37 ` [PATCH 23/26] xcpdb|compact: support some xapian-compact switches Eric Wong
2019-05-23  9:37 ` [PATCH 24/26] xapcmd: cleanup on interrupted xcpdb "--compact" Eric Wong
2019-05-23  9:37 ` [PATCH 25/26] xcpdb|compact: support --jobs/-j flag like gmake(1) Eric Wong
2019-05-23  9:37 ` [PATCH 26/26] xapcmd: do not reset %SIG until last Xtmpdir is done Eric Wong
2019-05-23 10:37 ` [PATCH 27/26] doc: various updates to reflect current state Eric Wong