path: root/lib/PublicInbox/LeiToMail.pm
Date Commit message
2021-01-22 lei: fix inadvertent FD sharing
$wq->{-ipc_atfork_child_close} needed to be initialized properly. Also, start setting $0 in workers to improve visibility.
2021-01-22 lei: show {pct} and {oid} in From_ lines and filenames
From_ lines are shown when mbox* variants are output to stdout, making {oid} and {pct} information visible without risking being propagated to other importer processes if they were in lei-specific X-* headers. Maildirs already had OIDs in the filename, now they gain Xapian {pct} in case anybody cares.
2021-01-21 lei_to_mail: call PublicInbox::IPC::DESTROY
It doesn't seem to matter at the moment, but it should save us from some surprises down the line.
2021-01-21 lei: allow more mbox inode types
We may attempt to write an mbox to any terminal, block, or character device, not just regular files and FIFOs/pipes. The only thing that is known to not work is a directory. Sockets may be possible with some OSes (e.g. Plan 9) or filesystems. This fixes t/lei.t on FreeBSD 11.x
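As an illustration only (a Python sketch, not the project's Perl code), the rule described above amounts to rejecting directories and accepting every other inode type. The helper name `mbox_dest_ok` is hypothetical.

```python
import os
import stat

def mbox_dest_ok(path):
    """Return True if `path` may be used as an mbox destination.

    Hypothetical helper: anything that is not a directory is accepted,
    including paths that do not exist yet.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return True  # assumed to be created later as a regular file
    # Regular files, FIFOs, terminals, block and character devices pass.
    return not stat.S_ISDIR(st.st_mode)
```

Under this sketch, `mbox_dest_ok(os.devnull)` is accepted because `/dev/null` is a character device, while any existing directory is refused.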
2021-01-21 lei: test some likely errors due to misuse
Because user errors happen...
2021-01-21 lei q: fix augment of compressed mailboxes
We need to delay writing out the mailbox until the compressor process is up and running, so have startq wait a bit. This means we must create the pipe early and hand it off to the workers before augmenting, despite spawning the gzip/pigz/xz/bzip2 process after augment is complete.
2021-01-21 lei q: do not spawn MUA early
I'm not sure why, but mutt sometimes won't detect small quickly. We'll display a progress bar meter when writing results, instead.
2021-01-21 lei q: fix SIGPIPE handling from lei2mail workers
We need to properly propagate SIGPIPE to the top-level lei-daemon process and avoid relying on auto-close, since auto-close triggers Perl warnings where an explicit close() does not.
2021-01-21 lei q: start ->mset while query_prepare runs
We don't need the result of query_prepare (for augmenting or mass unlinking) until we're ready to deduplicate and write results to the filesystem. This ought to let us hide some of the cost of Xapian searches on multi-device/core systems for extremely expensive searches.
2021-01-18 lei_to_mail: optimize for MUAs
Instead of optimizing our own performance, this optimizes our data to reduce work done by the MUA consumer. Maildir and mbox destinations no longer support any notion of the IMAP \Recent flag. JMAP has no functioning \Recent equivalent, and neither do we. In practice, having MUAs (e.g. mutt) clear the \Recent flag when committing changes to the mbox is expensive: it creates a rename(2) storm with Maildir and overwrites the entire mbox. For mboxcl2 (and mboxcl), we'll further optimize mutt behavior by setting the Lines: header in addition to Content-Length. With these changes, mutt exits instantaneously on mboxcl2, mboxcl, and Maildirs generated by "lei q".
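To make the Content-Length and Lines: idea concrete, here is a minimal Python sketch (not the project's Perl code). The function name `mboxcl2_entry`, the placeholder From_ line, and the choice to count newline bytes for Lines: are all assumptions made for illustration.

```python
# Placeholder From_ line, assumed for this sketch only.
FROM_LINE = b"From mboxrd@z Thu Jan  1 00:00:00 1970\n"

def mboxcl2_entry(header_lines, body):
    """Build one mbox entry carrying Content-Length and Lines headers.

    header_lines: list of bytes, one header per item, no trailing newline.
    body: bytes of the message body.
    """
    # Drop stale values so the headers always describe this body.
    hdr = [h for h in header_lines
           if not h.lower().startswith((b"content-length:", b"lines:"))]
    hdr.append(b"Content-Length: %d" % len(body))
    # Assumption: Lines is the number of newline bytes in the body.
    hdr.append(b"Lines: %d" % body.count(b"\n"))
    # Blank line after headers, blank line after the body.
    return FROM_LINE + b"\n".join(hdr) + b"\n\n" + body + b"\n"
```

With both headers present, a reader can skip over a message body without scanning it for the next From_ line.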
2021-01-18 lei q: parallelize Maildir and mbox writing
With 4 dedicated workers, this seems to provide a 100-120% speedup on a 4 core machine when writing thousands of search results to a Maildir or mbox. This also sets us up for high-latency IMAP destinations in the future. This opens the door to more speedup opportunities such as optimizing dedupe locking and other ways to reduce contention. This change is fairly complex and convoluted, unfortunately. Further work may allow us to simplify it and even improve performance.
2021-01-18 lei q: add --mua-cmd switch
It can be convenient to invoke an MUA as search results are being written to it, as an eager person may want to start seeing results ASAP. This lets Maildir users see results in the MUA as we are writing them. Users of IMAP will eventually be able to take advantage of them, too. Since we don't support mbox locking (yet?), we'll only invoke the MUA after results are done for mbox formats.
2021-01-18 lei: q: results output to Maildir and mbox* working
All the augment and deduplication stuff seems to be working based on unit tests. OpPipe is a nice general addition that will probably make future state machines easier.
2021-01-18 lei_to_mail: prepare for worker offload
We'll be doing most of the work in forked off worker processes, so ensure some of it is fork and serialization-friendly.
2021-01-12 lei query + pagination sorta working
Parallelism and interactivity with pager + SIGPIPE needs work; but results are shown and phrase search works without shell users having to apply Xapian quoting rules on top of standard shell quoting.
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2021-01-01 lei_to_mail: open FIFOs O_WRONLY so we block
Opening a FIFO with O_RDWR always succeeds on Linux, which causes the cat(1) process invoked by t/lei_to_mail.t to get stuck. Furthermore, O_APPEND makes no sense on FIFOs and perhaps there's some kernel out there which will reject it.
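The flag selection described above can be sketched in Python as follows. This is an illustration, not the project's Perl code, and the helper name `open_mbox_for_write` is hypothetical.

```python
import os
import stat

def open_mbox_for_write(path):
    """Return a file descriptor suitable for writing an mbox to `path`.

    Hypothetical helper: FIFOs are opened O_WRONLY only, so open(2)
    blocks until a reader appears; other paths get O_APPEND as well.
    """
    try:
        is_fifo = stat.S_ISFIFO(os.stat(path).st_mode)
    except FileNotFoundError:
        is_fifo = False
    if is_fifo:
        # No O_RDWR (would not wait for a reader on Linux), no O_APPEND.
        return os.open(path, os.O_WRONLY)
    return os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o666)
```

Because the FIFO open blocks until the reading side is opened, the writer cannot race ahead of a consumer such as cat(1).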
2021-01-01 lei_to_mail: unlink mboxes if not augmenting
This matches mairix(1) behavior and may be safer if there are concurrent readers on the existing mbox, especially since we don't currently implement mbox locking (nor does mairix).
2021-01-01 lei_to_mail: support Maildir, fix+test --augment
Maildir should be plenty fine for short-lived output folders.
2021-01-01 lei_to_mail: support for non-seekable outputs
Users may wish to pipe output to "git am", "spamc", or similar, so we need to support those cases and not bail out on lseek(2) or ftruncate(2) failures.
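A rough Python sketch of tolerating lseek(2) and ftruncate(2) failures might look like this. It is an illustration rather than the project's Perl code, and the helper name `truncate_if_possible` is hypothetical.

```python
import errno
import os

def truncate_if_possible(fd):
    """Try to rewind and empty `fd`; return False if it is not seekable.

    Hypothetical helper: pipes and similar outputs report ESPIPE (or
    EINVAL), which is treated as "cannot truncate" rather than fatal.
    """
    try:
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        return True
    except OSError as e:
        if e.errno in (errno.ESPIPE, errno.EINVAL):
            return False
        raise  # anything else is a real error
```

A caller can then keep writing to a pipe feeding "git am" or "spamc" when the function returns False.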
2021-01-01 lei_to_mail: lazy-require LeiDedupe
LeiDedupe requires SQLite, so we may want to be able to test writing mail without DBI or SQLite down the line.
2021-01-01 lei: implement various deduplication strategies
For writing mboxes and Maildirs, users may wish to use stricter or looser deduplication strategies. This gives them more control.
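The commit message does not list the strategies, so the following Python sketch only illustrates what "stricter" and "looser" could mean. The strategy names `exact`, `mid`, and `none`, and the `Deduper` class itself, are assumptions for illustration and do not describe the project's actual implementation.

```python
import hashlib
from email import message_from_bytes

class Deduper:
    """Hypothetical in-memory deduplicator with selectable strictness."""

    def __init__(self, strategy):
        if strategy not in ("exact", "mid", "none"):
            raise ValueError("unknown strategy: %r" % (strategy,))
        self.strategy = strategy
        self.seen = set()

    def _key(self, raw):
        if self.strategy == "mid":
            # Looser: two messages sharing a Message-ID are duplicates.
            mid = message_from_bytes(raw)["Message-ID"]
            if mid:
                return ("mid", mid.strip())
        # Stricter (and fallback): the raw bytes must match exactly.
        return ("raw", hashlib.sha256(raw).hexdigest())

    def is_dup(self, raw):
        """Return True if `raw` was already seen under this strategy."""
        if self.strategy == "none":
            return False
        key = self._key(raw)
        if key in self.seen:
            return True
        self.seen.add(key)
        return False
```

In this sketch, two messages with the same Message-ID but different bodies are both kept under `exact`, while `mid` keeps only the first.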
2021-01-01 lei_to_mail: start --augment, dedupe, bz2 and xz
--augment will match the mairix(1) option of the same name to augment existing search results. We'll need to implement deduplication for a better user experience. mutt ships with compressed mbox support for bz2 and xz, at least, so we'll support those out-of-the-box.
2021-01-01 lei_to_mail: start atomic and compressed mbox writing
We'll allow using multiple workers to write to a single mbox (which could be compressed). This can be done safely with O_APPEND + syswrite for uncompressed files, and using a lock when piping to pigz/gzip/bzip2/xz.
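For the uncompressed case, the O_APPEND technique can be sketched in Python as below (an illustration, not the project's Perl code). The helper name `append_message` is hypothetical, and `os.write` stands in for Perl's syswrite. The locked pipe to a compressor is not shown.

```python
import os

def append_message(path, msg):
    """Append one complete mbox entry to `path` with a single write.

    Hypothetical helper: with O_APPEND the kernel positions each write
    at end-of-file, so writers that each emit a whole message in one
    write(2) call should not interleave partial messages.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o666)
    try:
        n = os.write(fd, msg)
        if n != len(msg):
            # A short write would leave a torn message in the mbox.
            raise OSError("short write: %d of %d bytes" % (n, len(msg)))
    finally:
        os.close(fd)
```

Each worker can call this independently on the same path. The important design point is that a message is never split across several write calls.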
2021-01-01 lei_to_mail: initial implementation for writing mbox formats
No Maildir support yet, but it'll come.