about summary refs log tree commit homepage
path: root/lib/PublicInbox/SearchIdx.pm
DateCommit message (Collapse)
2015-09-25searchidx: remove unused sub: next_doc_id
It seems like it was never used
2015-09-15searchidx: sync Msgmap database along with Xapian
We can avoid duplicating work of extracting messages from git if we tie this to Xapian. Of course, this ties the two features together, but it's probably reasonable to expect that anybody who wants to use public-inbox to serve messages to front-end users will have both.
2015-09-15searchidx: hoist out rlog code
We'll be reusing this for loading msgmap.
2015-09-06update copyright headers and email addresses
In the future, it should be possible to use this: git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \ UPDATE_COPYRIGHT_USE_INTERVALS=2 \ xargs /path/to/gnulib/build-aux/update-copyright
2015-09-03search: disable Message-ID compression in Xapian
We'll continue to compress long Message-IDs in URLs (which we know about), but we will store entire Message-IDs in the Xapian database to facilitate ease-of-lookups in external databases.
2015-09-01search: reduce redundant doc data
Redundant document data increases our database size, pull the smsg->mid off the unique term, the smsg->ts off the value, and only generate the formatted display date off smsg->ts.
2015-08-30search: do not index references and inreplyto terms
We no longer need them, as we can rely on index-time thread resolution and thread merging. This allows us to index less data and hopefully increase efficiency.
2015-08-29avoid length in boolean context
Perl does not currently optimize for this. ref (from p5p): http://mid.gmane.org/D5C27970-9176-4C7A-8B99-7D78360E67A2@pobox.com
2015-08-25mid: mid_compressed => mid_compress
Consistently name mid_* functions as verbs.
2015-08-23cleanup calls to header_obj
Dereference header_obj only once when performance may be critical, or simplify our code by calling "header" directly on the Email::{Simple,MIME} object if not.
2015-08-23hopefully fix broken permissions for search
We must preserve the umask for the entirety of the indexing operation, as Xapian transactions replace entire files atomically instead of writing them in place.
2015-08-23search: respect core.sharedRepository in for Xapian DB
Extend the purpose of core.sharedRepository to apply to the $GIT_DIR/public-inbox/xapian* directory.
2015-08-22search: split search indexing to a separate file
This makes organization easier and reduces the amount of code loaded for a PSGI, mod_perl or CGI instance.