path: root/lib/PublicInbox/LeiStore.pm
Date  Commit message
2021-02-08 lei import: support Maildirs
It seems to be working trivially, though I'm probably going to split out Maildir reading into a separate package rather than using LeiToMail.
2021-02-05 lei import: initial implementation
Only tested with .eml files so far, but Maildir + IMAP will be supported.
2021-01-12 lei query + pagination sorta working
Parallelism and interactivity with pager + SIGPIPE need work, but results are shown and phrase search works without shell users having to apply Xapian quoting rules on top of standard shell quoting.
2021-01-03 use Eml (or MIME) objects for all indexing paths
We don't need to be keeping the raw message around after it hits git. Shard work now relies on Storable (or Sereal) and all of the indexing code relies on the Email::MIME-like API of Eml to access interesting parts of the message. Similarly, smsg->{raw_bytes} is no longer carried around and we do the CRLF adjustment when setting smsg->{bytes}. There's also a small simplification to t/import.t while we're in the area to use xqx instead of spawn/popen_rd.
2021-01-03 searchidxshard: replace index_raw with index_eml
Since Storable and Sereal are designed for lossless serialization, we'll just pass $eml objects to whatever process is running SearchIdx.
2021-01-03 searchidxshard: IPC conversion, part 2
We can remove some now-pointless wrapper functions by using ->ipc_do in even more places.
2021-01-02 lei_store: alternative unconfigured "git var" workaround
While the changes to git->qx/git->popen from commit 171a9c24022ad7ef will be useful for the lei daemon, hiding git error messages from actual users is probably wrong and we'll just localize GIT_* vars for testing.
2021-01-02 treewide: reduce load_xapian* callsites
Hopefully this will make it easier to spot dependency bugs in the future.
2021-01-01 lei_store: quiet down "git var" failures
$git->qx and $git->popen now accept $env and $opt for redirects like lower-level popen_rd. This may be beneficial in other places.
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2021-01-01 lei_store: handle messages without Message-ID at all
For personal mail, unsent draft messages are a common source of messages without Message-IDs.
2021-01-01 lei_store: add ->set_eml, ->add_eml can return smsg
Add a ->set_eml method which can be a useful fire-and-forget way of either adding new files to store OR setting keywords on them. When seeing brand-new messages, add_eml can afford to return more information in the smsg instead of just the OID.
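The add-or-update behavior described above can be illustrated with a minimal Python sketch. It is not the LeiStore API: the ToyStore class, its in-memory dict, and the choice to replace (rather than merge) keywords on an existing message are all assumptions made for illustration. Only the blob OID computation follows git's documented object format.

```python
import hashlib


def git_blob_oid(raw: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with these bytes."""
    header = b"blob %d\x00" % len(raw)
    return hashlib.sha1(header + raw).hexdigest()


class ToyStore:
    """Hypothetical in-memory stand-in for a message store (assumption)."""

    def __init__(self):
        self.keywords = {}  # OID -> set of keyword strings

    def add_eml(self, raw: bytes, keywords=()):
        """Store a new message; return its record, or None if already stored."""
        oid = git_blob_oid(raw)
        if oid in self.keywords:
            return None
        self.keywords[oid] = set(keywords)
        # A brand-new message can report more than just the OID.
        return {"oid": oid, "bytes": len(raw), "keywords": sorted(keywords)}

    def set_eml(self, raw: bytes, keywords=()):
        """Fire-and-forget: add the message if new, else replace its keywords."""
        if self.add_eml(raw, keywords) is None:
            self.keywords[git_blob_oid(raw)] = set(keywords)
```

A caller that does not care whether the message already exists would use set_eml, while a caller that wants the metadata of a newly seen message would use add_eml and check for None.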
2021-01-01 ipc: generic IPC dispatch based on Storable
I intend to use this with LeiStore when importing from multiple slow sources at once (e.g. curl, IMAP, etc). This is because over.sqlite3 can only have a single writer, and we'll have several slow readers running in parallel. Watch and SearchIdxShard should also be able to use this code in the future, but this will be proven with LeiStore, first.
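The single-writer constraint described above can be sketched in Python. This is a loose illustration only: it uses threads and a queue in place of Storable-based process IPC, and the table layout, the function names, and the demo sources are all hypothetical.

```python
import os
import queue
import sqlite3
import tempfile
import threading


def writer(db_path, q):
    """The only code that touches the database; drains the queue until None."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS msg (source TEXT, body TEXT)")
    while True:
        item = q.get()
        if item is None:  # sentinel: all readers are done
            break
        con.execute("INSERT INTO msg VALUES (?, ?)", item)
    con.commit()
    con.close()


def reader(name, messages, q):
    """Stand-in for a slow source (e.g. IMAP or curl); never opens the DB."""
    for body in messages:
        q.put((name, body))


def import_all(db_path, sources):
    """Run one reader per source in parallel, funnelling into one writer."""
    q = queue.Queue()
    w = threading.Thread(target=writer, args=(db_path, q))
    w.start()
    readers = [threading.Thread(target=reader, args=(n, m, q))
               for n, m in sources.items()]
    for t in readers:
        t.start()
    for t in readers:
        t.join()
    q.put(None)
    w.join()
    con = sqlite3.connect(db_path)
    try:
        return con.execute("SELECT COUNT(*) FROM msg").fetchone()[0]
    finally:
        con.close()


def demo():
    """Import three messages from two hypothetical sources; return row count."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "over.sqlite3")
        return import_all(path, {"imap": ["a", "b"], "curl": ["c"]})
```

Funnelling every write through one consumer means the readers never contend for the database lock, however slowly or unevenly they produce messages.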
2021-01-01 revert "lei_store: use per-machine refname as git HEAD"
In retrospect, per-machine HEADs were a bad idea because users of removable storage would be thrown off when moving storage between different machines. This is only a partial revert: the Import::init_bare change to support alternate head names still exists because we may use it for other reasons.
2021-01-01 lei_store: use per-machine refname as git HEAD
It may be helpful to identify the source of messages and perhaps avoid conflicting history. On the other hand, this may be a terrible idea for users who move portable storage (e.g. USB sticks) across computers...
2020-12-19 lei_store: keyword extraction from mbox and Maildir
Dovecot, mutt, and likely much other software support mbox Status/X-Status headers. Ensure we have a way to extract these headers as JMAP-compatible keywords before removing them for git storage. ->add_eml now accepts setting keywords at import time, and will probably be called like this:

	$lst->add_eml($eml, $lst->mbox_keywords($eml));
	$lst->add_eml($eml, $lst->maildir_keywords($fn));
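The keyword extraction described above might look roughly like the following Python sketch. The function names mirror the calls quoted in the commit message, but the bodies are assumptions, not the project's Perl code. The flag letters follow the common mbox Status/X-Status and Maildir ":2," conventions, and the keyword spellings (without a leading "$") are assumed.

```python
from email import message_from_string

# Assumed letter-to-keyword tables; flags with no keyword equivalent
# here (e.g. "O" for old, "T" for trashed in Maildir) are ignored.
MBOX_KEYWORDS = {"R": "seen", "A": "answered", "F": "flagged", "T": "draft"}
MAILDIR_KEYWORDS = {"S": "seen", "R": "answered", "F": "flagged", "D": "draft"}


def mbox_keywords(msg):
    """Collect keywords from the Status and X-Status headers of a message."""
    letters = (msg.get("Status") or "") + (msg.get("X-Status") or "")
    return sorted({MBOX_KEYWORDS[c] for c in letters if c in MBOX_KEYWORDS})


def maildir_keywords(filename):
    """Collect keywords from the flags after ':2,' in a Maildir filename."""
    _, sep, flags = filename.rpartition(":2,")
    if not sep:  # no info section, e.g. a file still in new/
        return []
    return sorted({MAILDIR_KEYWORDS[c] for c in flags if c in MAILDIR_KEYWORDS})


def strip_status_headers(msg):
    """Drop the mbox-only headers so they are not stored with the message."""
    del msg["Status"]
    del msg["X-Status"]
    return msg
```

Extracting the keywords before calling strip_status_headers matters, since the headers are the only place the mbox flags live.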
2020-12-19 lei_store: relax GIT_COMMITTER_IDENT check
It's pretty meaningless, since probably nobody notices committer info; we extract author info from individual emails, anyways.
2020-12-19 lei_store: simplify git_epoch_max, slightly
This follows how we detect the max epoch for v2 and shard count in Xapian.
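As a rough illustration of max-epoch detection, the Python sketch below assumes epochs are stored as numbered "N.git" entries (as in the v2 layout) and picks the highest number among a list of directory entry names. The function name and the regular expression are assumptions, not the module's actual code.

```python
import re

# Assumed naming scheme: epoch directories are "0.git", "1.git", ...
EPOCH_RE = re.compile(r"\A([0-9]+)\.git\Z")


def epoch_max(names):
    """Return the highest epoch number among entry names, or None if none."""
    epochs = []
    for name in names:
        m = EPOCH_RE.match(name)
        if m:
            epochs.append(int(m.group(1)))
    return max(epochs, default=None)
```

In practice the names would come from something like os.listdir() on the directory holding the epochs; entries that do not match the pattern are skipped.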
2020-12-19 lei: refine help/option parsing, implement "init"
There's a bunch of work in here as the foundations are being fleshed out. One of the UI/UX goals is to make it easy to keep built-in help and shell completions consistent.
2020-12-19 lei_store: local storage for Local Email Interface
Still unstable, this builds off the equally unstable extindex :P This will be used for caching/memoization of traditional mail stores (IMAP, Maildir, etc) while providing indexing via Xapian, along with compression, and checksumming from git. Most notably, this adds the ability to add/remove per-message keywords (draft, seen, flagged, answered) as described in the JMAP specification (RFC 8621 section 4.1.1). We'll use `.' (a single period) as an $eidx_key since it's an invalid {inboxdir} or {newsgroup} name.