author: Eric Wong <e@80x24.org> 2020-11-27 09:52:54 +0000
committer: Eric Wong <e@80x24.org> 2020-11-28 04:53:23 +0000
commit: 811b8d3cbaa790f59b7b107140b86248da16499b
tree: c380c0baf53114b71e1b5c440e41d58ab7fa78fb /lib/PublicInbox/OverIdx.pm
parent: b8ff2f71f04c8a2b959d6142bc7e770672589e8a
nntp: xref: use ->ALL extindex if available
Getting Xref for cross-posted messages is an O(n) operation
where `n' is the number of newsgroups on the server.  This works
acceptably when there are dozens of groups, but would be
unacceptable when there are tens of thousands of newsgroups.

With ~140 newsgroups, a lore.kernel.org mirror already handles
"XHDR Xref $MESSAGE_ID" requests around 30% faster after
creating the xref3.idx_nntp index.

The SQL additions to ExtSearch.pm may be a bit strange and
seem more appropriate for Over.pm; however it currently makes
sense to me since those bits of over.sqlite3 access are
exclusive to ExtSearch and can't be used by traditional
v1/v2 inboxes...
Diffstat (limited to 'lib/PublicInbox/OverIdx.pm')
1 files changed, 5 insertions, 0 deletions
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 173e3220..8bec08da 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -542,6 +542,11 @@ CREATE TABLE IF NOT EXISTS xref3 (
         $dbh->do('CREATE INDEX IF NOT EXISTS idx_docid ON xref3 (docid)');
+        # performance critical, this is not UNIQUE since we may need to
+        # tolerate some old bugs from indexing mirrors
+        $dbh->do('CREATE INDEX IF NOT EXISTS idx_nntp ON '.
+                'xref3 (oidbin,xnum,ibx_id)');
         key VARCHAR(255) PRIMARY KEY,
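
The effect of the added index can be checked outside of public-inbox. The following is a minimal sketch, written with Python's sqlite3 module rather than the Perl DBI calls used in the patch. The column types, the simplified table definition, and the helper names build_xref3 and lookup_plan are assumptions made for illustration; only the table name, the column names, and the two CREATE INDEX statements come from the diff above.

```python
import sqlite3


def build_xref3(with_nntp_index: bool) -> sqlite3.Connection:
    """Create an in-memory table shaped roughly like xref3.

    Column types are assumed; the real definition in OverIdx.pm may
    carry additional constraints that are not shown in the hunk.
    """
    dbh = sqlite3.connect(":memory:")
    dbh.execute(
        "CREATE TABLE IF NOT EXISTS xref3 ("
        " docid INTEGER NOT NULL,"
        " ibx_id INTEGER NOT NULL,"
        " xnum INTEGER NOT NULL,"
        " oidbin BLOB NOT NULL)"
    )
    dbh.execute("CREATE INDEX IF NOT EXISTS idx_docid ON xref3 (docid)")
    if with_nntp_index:
        # Same statement as the patch: a plain (non-UNIQUE) index.
        dbh.execute(
            "CREATE INDEX IF NOT EXISTS idx_nntp ON "
            "xref3 (oidbin,xnum,ibx_id)"
        )
    return dbh


def lookup_plan(dbh: sqlite3.Connection) -> str:
    """Return SQLite's query plan text for a lookup keyed on oidbin.

    The SELECT is a hypothetical stand-in for the lookup done in
    ExtSearch.pm, which is not part of this diff.
    """
    rows = dbh.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT ibx_id, xnum FROM xref3 WHERE oidbin = ?",
        (b"\x00" * 20,),
    ).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " ".join(row[3] for row in rows)


plan_without = lookup_plan(build_xref3(with_nntp_index=False))
plan_with = lookup_plan(build_xref3(with_nntp_index=True))

# Because idx_nntp is not UNIQUE, duplicate (oidbin, xnum, ibx_id)
# rows are accepted instead of raising a constraint error.
dbh = build_xref3(with_nntp_index=True)
dup = (1, 7, 42, b"\x01" * 20)
dbh.execute("INSERT INTO xref3 (docid, ibx_id, xnum, oidbin) VALUES (?,?,?,?)", dup)
dbh.execute("INSERT INTO xref3 (docid, ibx_id, xnum, oidbin) VALUES (?,?,?,?)", dup)
dup_count = dbh.execute("SELECT COUNT(*) FROM xref3").fetchone()[0]

print(plan_without)
print(plan_with)
print(dup_count)
```

Comparing the two printed plans shows whether SQLite selects idx_nntp for the oidbin lookup. The duplicate insert illustrates why the code comment calls out the lack of UNIQUE: rows left behind by older indexing bugs would otherwise make index creation fail.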