author Eric Wong <e@80x24.org> 2020-11-27 09:52:54 +0000
committer Eric Wong <e@80x24.org> 2020-11-28 04:53:23 +0000
nntp: xref: use ->ALL extindex if available
Getting Xref for cross-posted messages is an O(n) operation
where `n' is the number of newsgroups on the server.  This works
acceptably when there are dozens of groups, but would be
unacceptable when there are tens of thousands of newsgroups.

With ~140 newsgroups, a lore.kernel.org mirror already handles
"XHDR Xref $MESSAGE_ID" requests around 30% faster after
creating the xref3.idx_nntp index.

The SQL additions to ExtSearch.pm may be a bit strange and
seem more appropriate for Over.pm; however, it currently makes
sense to me since those bits of over.sqlite3 access are
exclusive to ExtSearch and can't be used by traditional
v1/v2 inboxes...
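
To illustrate why the idx_nntp index helps, here is a minimal sketch
in Python using the standard sqlite3 module (not the project's Perl
code).  It assumes a simplified xref3 table holding only the four
columns named in the diff below, with guessed column types, and a
hypothetical lookup query; the real query in ExtSearch.pm may differ.
It compares SQLite's query plan with and without idx_nntp, and shows
that the non-UNIQUE index tolerates duplicate rows.

```python
import sqlite3

# Hypothetical lookup: resolve one (blob OID, article number, inbox) triple.
# The real query used by ExtSearch.pm may look different.
QUERY = ("SELECT docid FROM xref3 "
         "WHERE oidbin = ? AND xnum = ? AND ibx_id = ?")


def build_db(with_idx_nntp):
    """Create an in-memory DB with a simplified, assumed xref3 schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE IF NOT EXISTS xref3 ("
               "docid INTEGER NOT NULL, oidbin BLOB NOT NULL, "
               "ibx_id INTEGER NOT NULL, xnum INTEGER NOT NULL)")
    db.execute("CREATE INDEX IF NOT EXISTS idx_docid ON xref3 (docid)")
    if with_idx_nntp:
        # Same statement as the one added in the diff.
        db.execute("CREATE INDEX IF NOT EXISTS idx_nntp ON "
                   "xref3 (oidbin,xnum,ibx_id)")
    return db


def plan(db):
    """Return SQLite's query plan text for the hypothetical lookup."""
    rows = db.execute("EXPLAIN QUERY PLAN " + QUERY,
                      (b"\x00" * 20, 1, 1)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


def count_duplicates(db):
    """Insert the same row twice; a non-UNIQUE index must accept both."""
    row = (1, b"\x01" * 20, 1, 1)  # docid, oidbin, ibx_id, xnum
    for _ in range(2):
        db.execute("INSERT INTO xref3 (docid, oidbin, ibx_id, xnum) "
                   "VALUES (?,?,?,?)", row)
    return db.execute("SELECT COUNT(*) FROM xref3 "
                      "WHERE oidbin = ? AND xnum = ? AND ibx_id = ?",
                      (row[1], row[3], row[2])).fetchone()[0]


if __name__ == "__main__":
    print("without idx_nntp:", plan(build_db(False)))
    print("with idx_nntp:   ", plan(build_db(True)))
    print("duplicate rows kept:", count_duplicates(build_db(True)))
```

Without idx_nntp the planner can only scan the table, since idx_docid
does not cover any column in the WHERE clause; with it, the lookup
becomes an index search.  The exact wording of the plan text varies
between SQLite versions.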
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 173e3220..8bec08da 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -542,6 +542,11 @@ CREATE TABLE IF NOT EXISTS xref3 (
         $dbh->do('CREATE INDEX IF NOT EXISTS idx_docid ON xref3 (docid)');
+        # performance critical, this is not UNIQUE since we may need to
+        # tolerate some old bugs from indexing mirrors
+        $dbh->do('CREATE INDEX IF NOT EXISTS idx_nntp ON '.
+                'xref3 (oidbin,xnum,ibx_id)');
         key VARCHAR(255) PRIMARY KEY,