path: root/lib/PublicInbox/SearchMsg.pm
Date        Commit message
2016-03-12  searchmsg: preserve hard tabs, but drop CR (\r)
Hard tabs *may* be searchable, so preserve them since they do not take up any more space than a normal space. However, CR (carriage return) is worthless and likely a sign of a buggy mail (or spam) client anyways.
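The normalization described above can be sketched in a few lines. This is an illustration in Python, not the project's Perl code; the function name `clean_text` is hypothetical.

```python
def clean_text(text):
    """Drop carriage returns; keep hard tabs and all other characters."""
    # Assumption: CR is removed everywhere, not only before LF.
    return text.replace("\r", "")


if __name__ == "__main__":
    print(repr(clean_text("Subject:\tpatch v2\r\n")))
```

Only `\r` is touched, so a tab inside a subject or body survives into whatever gets indexed.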
2016-03-03  use raw header for Message-ID
Message-IDs should not be MIME encoded, but in case they are, use the raw form for compatibility with ssoma and possibly other tools. This prevents a potential problem where a malicious client could confuse our storage layer into indexing incorrect contents.
2016-02-28  searchmsg: update + fix license header
Not sure how, but this should've always been AGPL-3.0+ like the rest of the code, not GPL-3.0+
2015-11-20  various internal documentation updates
Hopefully this gives new hackers a better overview of how the components relate to each other.
2015-09-30  nntp: implement OVER/XOVER summary in search document
The document data of a search message already contains a good chunk of the information needed to respond to OVER/XOVER commands quickly. Expand on that and use the document data to implement OVER/XOVER quickly. This adds a dependency on Xapian being available for nntpd usage, but is probably alright since nntpd is esoteric enough that anybody willing to run nntpd will also want search functionality offered by Xapian. This also speeds up XHDR/HDR with the To: and Cc: headers and :bytes/:lines article metadata used by some clients for header displays and marking messages as read/unread.
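As a rough sketch of the idea, an overview line can be assembled from fields already stored with the search document, without opening the message itself. The field order below follows the default overview format in RFC 3977 (number, Subject, From, Date, Message-ID, References, :bytes, :lines). The dictionary keys and the function name `over_line` are assumptions made for this example, and it is written in Python rather than the project's Perl.

```python
import re

# Assumed keys for the stored document data; the real layout may differ.
OVERVIEW_KEYS = ("subject", "from", "date", "message_id",
                 "references", "bytes", "lines")


def over_line(num, doc):
    """Build one tab-separated overview line from stored document data."""
    def clean(value):
        # Tabs, CR, LF and NUL cannot appear inside an overview field.
        return re.sub(r"[\t\r\n\0]", " ", str(value))

    fields = [clean(doc.get(key, "")) for key in OVERVIEW_KEYS]
    return "\t".join([str(num)] + fields)


if __name__ == "__main__":
    doc = {
        "subject": "hello\tworld",
        "from": "a@example.com",
        "date": "Sat, 12 Mar 2016 00:00:00 +0000",
        "message_id": "<1@example.com>",
        "references": "",
        "bytes": 1234,
        "lines": 10,
    }
    print(over_line(1, doc))
```

Because every field comes from the stored document, answering OVER for a range of articles needs only database reads.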
2015-09-19  nntp: implement XROVER, speed up XHDR for some cases
Using Xapian allows us to implement XROVER without forking new processes.
2015-09-06  update copyright headers and email addresses
In the future, it should be possible to use this:

    git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
        UPDATE_COPYRIGHT_USE_INTERVALS=2 \
        xargs /path/to/gnulib/build-aux/update-copyright
2015-09-04  SearchMsg: avoid encoding Message-IDs
Spaces may be added when using header_str with Email::MIME->create, so use the normal "header" parameter when setting Message-IDs and References.
2015-09-03  search: disable Message-ID compression in Xapian
We'll continue to compress long Message-IDs in URLs (which we know about), but we will store entire Message-IDs in the Xapian database to facilitate ease-of-lookups in external databases.
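A minimal sketch of the split described above: long Message-IDs are shortened only when building URLs, while the index keeps the full value. The 40-character threshold, the use of a SHA-1 hex digest, and the names `compress_mid_for_url` and `index` are assumptions for illustration (in Python, not the project's Perl).

```python
import hashlib

MID_MAX = 40  # assumed length threshold for "long" Message-IDs


def compress_mid_for_url(mid, max_len=MID_MAX):
    """Return a URL-friendly form: short IDs unchanged, long IDs hashed."""
    if len(mid) <= max_len:
        return mid
    return hashlib.sha1(mid.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    long_mid = "x" * 60 + "@example.com"
    index = {long_mid: "doc-1"}            # the index stores the full Message-ID
    print(compress_mid_for_url(long_mid))  # only the URL form is shortened
    print(index[long_mid])
```

Keeping the uncompressed Message-ID in the database means an external tool can look a message up with the exact header value and does not need to know the hashing scheme.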
2015-09-01  search: reduce redundant doc data
Redundant document data increases our database size. Pull the smsg->mid off the unique term and the smsg->ts off the value, and generate the formatted display date only from smsg->ts.
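Deriving the display date from the stored timestamp, rather than storing both, might look like the following. The output format and the name `display_date` are assumptions for this Python illustration.

```python
from datetime import datetime, timezone


def display_date(ts):
    """Format a Unix timestamp for display (UTC; the format is assumed)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")


if __name__ == "__main__":
    print(display_date(0))
```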
2015-08-28  search: do not iterate through entire termlist
A document may have many terms, so this hurts performance if we blindly iterate. Unfortunately, we can't rely on the order of the termlist just yet, either, so we must repeatedly restart the search for now until we're ready to bump schema versions.
2015-08-25  search: implement subject summarization
We ought to summarize subjects to avoid exploding line lengths in the web interface.
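One simple way to keep subjects from blowing up line lengths is to truncate at a word boundary and append a marker. The sketch below is only one possible approach; the 40-character limit, the " ..." suffix, and the name `summarize_subject` are assumptions, and the commit's actual summarization rules may differ.

```python
def summarize_subject(subject, max_len=40):
    """Shorten a subject to at most max_len characters for display."""
    subject = " ".join(subject.split())  # collapse runs of whitespace
    if len(subject) <= max_len:
        return subject
    suffix = " ..."
    # Cut at the last space that fits, leaving room for the suffix.
    cut = subject[:max_len - len(suffix)].rsplit(" ", 1)[0]
    return cut.rstrip() + suffix


if __name__ == "__main__":
    print(summarize_subject("Re: [PATCH v3 2/7] " + "very " * 20 + "long subject"))
```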
2015-08-25  mid: mid_compressed => mid_compress
Consistently name mid_* functions as verbs.
2015-08-20  search: preserve References: order in document data
We need proper ordering of References to thread messages correctly. We would lose this order if we loaded the terms from the database, so set it directly in the document data. Do not bother with a separate In-Reply-To, since Mail::Thread just merges the IRT into References. This bumps our schema version once again.
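To illustrate why an ordered list is stored: the sketch below extracts Message-IDs from a References header in their original order and folds In-Reply-To in at the end if it is not already present. The function name `ordered_references` and the regular expression are assumptions for this Python example, not the project's code.

```python
import re

MID_RE = re.compile(r"<([^<>\s]+)>")


def ordered_references(references, in_reply_to=""):
    """Return Message-IDs in header order, with In-Reply-To merged in last."""
    seen = set()
    ordered = []
    for mid in MID_RE.findall(references) + MID_RE.findall(in_reply_to):
        if mid not in seen:  # drop duplicates, keep first position
            seen.add(mid)
            ordered.append(mid)
    return ordered


if __name__ == "__main__":
    print(ordered_references("<root@example.com> <reply1@example.com>",
                             "<reply1@example.com>"))
```

A set of index terms has no such ordering, which is why the ordered form has to live in the document data.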
2015-08-16  SearchMsg: ensure metadata for ghost messages mid
Ghosts have no document data in them. Perhaps we should just rely on terms for Message-ID and avoid storing that in the document data...
2015-08-15  view: display replies in per-message view
This can be used to quickly scan for replies in a message without displaying an entire thread.
2015-08-15  search: make search results more OO
This will relieve callers of the need to decode the data we store internally in Xapian.
2015-08-13  initial search backend implementation
This shall allow us to search for replies/threads more easily.