path: root/lib
Date Commit message
2021-06-20 lei sucks: don't warn or error out on missing dependencies
%INC can hold undef. This can be hit on a Linux machine missing Linux::Inotify2. Loading PublicInbox::KQNotify is attempted and PublicInbox/KQNotify.pm always exists, causing the `undef' entry in %INC when it fails to load IO::KQueue.
  $ perl -MData::Dumper -I lib \
    -E 'eval { require PublicInbox::KQNotify }; say Dumper(\%INC)'
2021-06-20 view: extra check for redundant messages in HTML view
There appear to be some cases of duplicates appearing due to -extindex. I haven't nailed down the cause of it, yet, but this should make things easier for readers using the PSGI HTML interface in the meantime. The raw mboxrd remains undeduplicated for now, and the correct fix/workaround would be some fsck-like mode for public-inbox-extindex.
2021-06-20 scripts: add syscall-list tool for development
We'll be supporting inotify directly as we do with epoll so Linux users won't have to deal with XS, extra DSOs or install Linux::Inotify2 (and common::sense) modules.
2021-06-18 lei/store: do not put NULL into over.num column
Simplify oid2docid and filter out undefined docids in ->add_eml, instead. This avoids SQLite "datatype mismatch" errors in OverIdx->add_over. Fixes: d1052f03ea85d4af ("lei/store: cull redundant docids based on blob OID")
2021-06-17 lei/store: cull redundant docids based on blob OID
I'm not sure how this happened (only once for me in March), but it should not happen... In any case, we'll operate on the lowest numbered docid and cull redundant index entries when lei/store is open for read-write. This also fixes the normal lei/store removal path to clean up the xref3 table (since it's not done automatically for public-facing -eidx due to the multi-list nature of it).
2021-06-17 lei_input: prefix bare Maildir paths w/ "maildir:"
This will simplify upcoming code for watches.
2021-06-17 lei inspect: learn "num:" and "docid:" prefixes
"num:" is useful for inspecting Inbox-ish directories, while "docid:" can be used for any Xapian DB (not just stuff managed by our code).
2021-06-14 lei index+import: reject keywords from R/O IMAP
Since users can't set IMAP flags in read-only IMAP folders, we won't clobber local flags when importing from IMAP. This also enables the local_blob fallback used for lei-index to be used for index deduplication.
2021-06-14 lei_input: allow keywords when importing 1 file from Maildir
This will eventually be useful for supporting inotify watches on Maildir. It will also allow users to script their own FS watchers more easily.
2021-06-13 net_reader: canonicalize URL args on add_url
This fixes cases when users specify an IMAP or NNTP URL with standard port numbers explicitly. In other words, this allows users to use "lei ls-mail-source nntps://public-inbox.org:563/" and "lei ls-mail-source imaps://public-inbox.org:993/" without hitting "BUG:" errors.
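A rough sketch of the idea, in Python rather than the project's Perl: drop an explicit port from a URL when it matches the scheme's standard port, so both spellings compare equal. The function name `canonicalize` and the `DEFAULT_PORTS` table are illustrative assumptions, not the actual net_reader code.

```python
from urllib.parse import urlsplit, urlunsplit

# Standard ports for the schemes mentioned above (illustrative table).
DEFAULT_PORTS = {"imap": 143, "imaps": 993, "nntp": 119, "nntps": 563}


def canonicalize(url):
    """Return url with an explicit port removed if it is the scheme default."""
    parts = urlsplit(url)
    netloc = parts.netloc
    default = DEFAULT_PORTS.get(parts.scheme)
    if default is not None and parts.port == default:
        # The port was given explicitly; cut the trailing ":<port>" only,
        # which leaves any "user@" prefix or IPv6 brackets untouched.
        netloc = netloc.rpartition(":")[0]
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(canonicalize("nntps://public-inbox.org:563/"))
print(canonicalize("imaps://public-inbox.org:993/"))
```

Non-standard ports are left alone, since they are part of the server's identity.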
2021-06-13 lei import: use url_folder_cache for completion
And fix "lei index" completion while we're at it.
2021-06-13 lei ls-mail-source: write through to URL folder cache
We'll be able to use this for shell completion for lei import, lcat, tag, etc. This also adds --url support for scripting purposes.
2021-06-13 lei: stop pager early on exit
This is necessary when using "ls-mail-source" on an unreachable IMAP server.
2021-06-12 lei ls-mail-source: list IMAP folders and NNTP groups
While other tools can provide the same functionality, having integration with git-credential is convenient, here. Caching and completion will be implemented separately.
2021-06-10 lei tag: less confusing warning about unimported messages
"unimported" is more meaningful than "missing", here. And instead of having every worker spew about unimported messages, we'll accumulate and only print one warning line. This necessitated altering ->DESTROY behavior and persisting the client socket within the $lei object itself, not just the PktOp consumer object.
2021-06-10 lei import: support --new-only for IMAP
Taking ~40s to synchronize a ~75K message IMAP folder is still a lot of time, so support an option to only touch new messages. This is similar to "offlineimap -q" (quick) or "mbsync --new" switches, but lei already accepts "-q" as a shortcut for --quiet. "--new" could work, but "--new-only" might be more descriptive (or "--only-new"?), since the default also fetches new messages. v2: warn for non-IMAP sources; I'm not sure it's worth it for Maildir or other sources, yet. It will also make sense for MH and JMAP once we support them.
2021-06-09 lei prune-mail-sync: new command to prune invalid sync data
This will be invoked automatically by "lei import" eventually, but it may make sense to expose as a separate command.
2021-06-09 lei_mail_sync: hoist out --all handling from export-kw
We'll be reusing it in other commands, too.
2021-06-09 lei tag: parallelize Maildir access
Since Maildir isn't guaranteed to have any sort of order, we can parallelize inputs, here. On a 4-core system, this reduced one of my tag invocations from 5.5 to 1.4s.
2021-06-09 mdir_reader: maildir_each_file: pass flags, skip Trash
This is a slight behavior change for "lei q": Trashed (but not-yet-expunged) messages no longer get unlinked when --output is used without --augment.
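For context, Maildir filenames carry per-message flags after a ":2," marker, and the "T" flag marks a trashed message. A minimal Python sketch of extracting the flags and skipping trashed files might look like the following; the helper names and sample filenames are made up for illustration and are not the MdirReader API.

```python
def maildir_flags(filename):
    """Return the flag letters after ":2,", or None if there is no info part."""
    _base, sep, info = filename.rpartition(":2,")
    if not sep:
        return None  # e.g. a file still sitting in new/
    return info


def is_trashed(filename):
    """True if the Maildir "T" (trashed) flag is set."""
    flags = maildir_flags(filename)
    return flags is not None and "T" in flags


# Hypothetical filenames for demonstration
names = [
    "1623200000.M1P2.host:2,S",   # seen
    "1623200001.M3P4.host:2,ST",  # seen + trashed
    "1623200002.M5P6.host",       # no flags at all
]
kept = [(n, maildir_flags(n)) for n in names if not is_trashed(n)]
print(kept)
```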
2021-06-09 inbox_writable: fix import_maildir
I'm not sure if anybody uses this, but it exists. It'll likely be dropped in the future. Fixes: fa3f0cbcd1af5008 ("use MdirReader in -watch and InboxWritable")
2021-06-09 lei/store: do eidx_init before creating R/W lms dbh
Sharing lms->{dbh} with eidx shards appears to be the cause of the "Issuing rollback() due to DESTROY without explicit disconnect() of DBD::SQLite::db handle" messages I've been seeing from "lei up".
2021-06-09 lei edit-search: fix and add a (weak) test
This broke recently and lacked an automated test, so rely on EDITOR=cat to ensure we have some coverage. Fixes: d2670108f71b1eff ("pkt_op: make pkt_do an OO method")
2021-06-09 lei pmdir: fix nproc for <= 4 CPUs
I forgot my FreeBSD VM has 8 cores, actually, and tweaked the nproc detection on that machine before finalizing commit 10b523eb017162240b1ac3647f8dcbbf2be348a7 ("lei import: speed up repeated Maildir imports"). Fixes: 10b523eb01716224 ("lei import: speed up repeated Maildir imports")
2021-06-08 lei import: speed up repeated Maildir imports
On a 4-core CPU, this speeds up "lei import" on a largish Maildir inbox with 75K messages from ~8 minutes down to ~40s. Parallelizing alone did not bring any improvement and may even hurt performance slightly, depending on CPU availability. However, creating the index on the "fid" and "name" columns in blob2name yields us the same speedup we got. Parallelizing IMAP makes more sense due to the fact most IMAP stores are non-local and subject to network latency. Followup-to: bdecd7ed8e0dcf0b45491b947cd737ba8cfe38a3 ("lei import: speed up kw updates for old IMAP messages")
2021-06-08 lei: generalize auxiliary WQ handling
op_wait_event is now more lei-specific since we no longer have to care about oneshot and use a synchronous loop. {ikw} (import-keywords) started a trend, but LeiPmdir (parallel Maildir) is an upcoming WQ class that will follow this idea. Eventually, {l2m} usage may be updated to follow this, too.
2021-06-08 lei: safety fix for multiple WQ classes
For commands utilizing multiple workers, this simple change generalizes the persistence mechanism and prevents lei->dclose from causing script/lei to exit if there are still in-flight workers. This ought to prevent read-after-write consistency problems that occasionally manifest in scripts (e.g. test cases) but usually go unnoticed in normal use.
2021-06-08 lei/store: checkpoint commits mail_sync.sqlite3
We mainly rely on ->done with lei/store, but moving to ->checkpoint probably makes sense. Note: over, msgmap, and mail_sync all have slightly different transaction behavior; perhaps they can be unified in the future.
2021-06-06 lei: don't drop WQ workers on normal exit
This is dangerous and causes race conditions on commands which utilize multiple workqueues.
2021-06-04 pkt_op: make pkt_do an OO method
This will make it easier to use for internal use such as managing Maildir and IMAP IDLE watches.
2021-06-03 pkt_op: remove blocking I/O support
Since lei-daemon is guaranteed to be running, there's no need to keep blocking I/O support around (and we can get it back via git if we need it). Followup-to: 1d6e1f9a6a66a42d ("lei: require Socket::MsgHdr or Inline::C, drop oneshot")
2021-06-03 lei import: speed up kw updates for old IMAP messages
On a 4-core CPU, this speeds up "lei import" on a largish IMAP inbox with 75K messages from ~21 minutes down to 40s. Parallelizing with the new LeiImportKw WQ worker class gives a near-linear speedup and brought the runtime down to ~5:40. The new idx_fid_uid index on the "fid" and "uid" columns of blob2num in mail_sync.sqlite3 brought us the final speedup. An additional index on over.sqlite3#xref3(oidbin) did not help, since idx_nntp already exists and speeds up the new ->oidbin_exists internal API. I initially experimented with a separate "lei import-kw" command but decided against it since it's useless outside of IMAP+JMAP and would require extra cognitive overhead for both users and hackers. So LeiImportKw is just a WQ worker used by "lei import" and not its own user-visible command. v2: fix ikw_done_wait arg handling (ugh, confusing API :x)
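The effect of an index on the "fid" and "uid" columns can be illustrated with a small, self-contained SQLite session. This is only a sketch in Python: the table layout below is a simplified, hypothetical stand-in for blob2num (the real schema in mail_sync.sqlite3 may differ), and only the index name idx_fid_uid is taken from the message above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Simplified, assumed schema for illustration only.
con.execute(
    "CREATE TABLE blob2num (oidbin BLOB NOT NULL, "
    "fid INTEGER NOT NULL, uid INTEGER NOT NULL)"
)
con.executemany(
    "INSERT INTO blob2num VALUES (?, ?, ?)",
    ((b"%020d" % i, 1, i) for i in range(1000)),
)


def plan(sql):
    """Return SQLite's query plan details for sql as one string."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql)
    return " ".join(row[3] for row in rows)


query = "SELECT oidbin FROM blob2num WHERE fid = 1 AND uid = 42"
before = plan(query)  # no usable index yet: full table scan
con.execute("CREATE INDEX idx_fid_uid ON blob2num(fid, uid)")
after = plan(query)   # the planner can now search via the index

print("before:", before)
print("after: ", after)
```

Without the index, every per-UID lookup has to scan the whole table, which adds up quickly over a 75K message folder.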
2021-06-02 lei export-kw: do not write directly to mail_sync.sqlite3
Only the lei/store process should be writing to files/DBs in lei/store.
2021-06-02 lei: remove "forget" (old name for "rm")
"rm" is probably the better name for it, since it matches "public-inbox-learn rm"
2021-06-01 lei_mail_sync: more debug info for uncommitted txn
I'm not actually sure if I hit an uncommitted transaction just now, it doesn't seem like it.
2021-06-01 lei import: reduce writes to lei/store on IMAP sync
We don't need to write VMD changes to lei/store if local keywords are unchanged.
2021-05-30 lei import: import IMAP flag changes from old messages
This makes "lei import" behavior with IMAP folders more consistent with that with Maildir. Opening IMAP folders read-write with "SELECT" (instead of read-only with "EXAMINE") was necessary, since it lets an IMAP server communicate to us as to whether or not it's worth refetching IMAP flags of previously imported messages. Fetching UID+FLAGS only is one of the fastest IMAP operations with dovecot, our -imapd and presumably other common IMAP servers. It is issued by common MUAs such as mutt after every SELECT. Users may now rely on "lei import" exclusively to merge mail and keywords into lei/store, and "lei export-kw" to propagate keyword changes back to IMAP servers. A sticks-and-stones workflow for personal mailboxes is currently:
  lei import imaps://$MY_PERSONAL_INBOX
  lei q --mua=$MUA -o /tmp/results SEARCH TERMS...
  # do stuff from within $MUA to /tmp/results
  lei import /tmp/results # read keyword changes from MUA
  lei export-kw imaps://$MY_PERSONAL_INBOX
  # repeat when new stuff shows up in personal inbox
The next goal is to automate repeated imports + export-kw commands with inotify and IMAP IDLE.
2021-05-30 lei: support implicit stdin by default
This adds implicit stdin support for p2q and lcat, while rm and rediff no longer need explicit support for it.
2021-05-30 lei lcat: support maildir: paths, too
This could be helpful in cases where a Maildir is on a slow or unmounted filesystem and lei/store is on fast storage.
2021-05-30 lei lcat: allow IMAP folder URLs w/o UIDVALIDITY
Requiring UIDVALIDITY on the command-line is of course unreasonable.
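IMAP URLs in the RFC 5092 style may carry an optional ";UIDVALIDITY=" parameter after the mailbox and an optional "/;UID=" component for a single message. A hedged Python sketch of a parser that treats both as optional is shown below; `parse_imap_url` is a hypothetical helper, not lei's URL handling, and it ignores other RFC 5092 features such as ";SECTION=".

```python
import re
from urllib.parse import unquote, urlsplit

# mailbox, then optional ;UIDVALIDITY=N, then optional /;UID=N
_PATH_RE = re.compile(
    r"(?P<mailbox>.*?)"
    r"(?:;UIDVALIDITY=(?P<uidvalidity>\d+))?"
    r"(?:/;UID=(?P<uid>\d+))?",
    re.IGNORECASE,
)


def parse_imap_url(url):
    """Split an IMAP URL into host, mailbox, uidvalidity and uid."""
    parts = urlsplit(url)
    m = _PATH_RE.fullmatch(parts.path.lstrip("/"))

    def to_int(value):
        return int(value) if value is not None else None

    return {
        "host": parts.hostname,
        "mailbox": unquote(m.group("mailbox")),
        "uidvalidity": to_int(m.group("uidvalidity")),  # None if omitted
        "uid": to_int(m.group("uid")),                  # None if omitted
    }


print(parse_imap_url("imaps://example.com/INBOX"))
print(parse_imap_url("imaps://example.com/INBOX;UIDVALIDITY=9/;UID=42"))
```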
2021-05-30 lei lcat+inspect: start wiring up completion
Colons and other delimiters still cause problems for our bash completion, but some completion is better than no completion.
2021-05-30 lei q: --sort and --save|v2 are incompatible
Saved searches rely on (reverse) docid ordering for efficient incremental results, and sorting any other way prevents that. Update comment description in LeiQuery while we're at it: "ls-query" and "rm-query" are "ls-search" and "forget-search", respectively, and "mv-query" is implicit with "edit-search"
2021-05-30 lei import|lcat: improve+fix single message IMAP support
lcat can now dump the memoized contents of entire IMAP folders, not just a single UID. It's now parallelized and pipelined for multiple lei2mail workers. Furthermore, various forms of JSON output work consistently with blob-only output, now. While working on this, I noticed NetReader was passing UID URLs to imap_each callbacks, which was causing mail_sync.sqlite3 to store UIDs in `folders'. That was clearly wrong, so it's now fixed.
2021-05-29 lei_to_mail: use abs_path for Maildir in mail_sync.sqlite3
lei->rel2abs doesn't resolve symlinks, which could cause synchronization problems with export-kw or other commands.
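The difference between a purely lexical absolute path and a symlink-resolved one can be shown with Python's standard library, where os.path.abspath plays roughly the role of rel2abs and os.path.realpath that of abs_path. This is an analogy for illustration, not the Perl code; the directory names are made up.

```python
import os
import tempfile


def demo():
    """Create a Maildir plus a symlink to it and compare both path styles."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)  # the temp dir itself may be a symlink
        real = os.path.join(tmp, "Maildir")
        link = os.path.join(tmp, "mail-link")
        os.mkdir(real)
        os.symlink(real, link)
        lexical = os.path.abspath(link)    # keeps the symlink name
        resolved = os.path.realpath(link)  # follows it to the real directory
        return lexical, resolved, real


lexical, resolved, real = demo()
print("lexical: ", lexical)
print("resolved:", resolved)
```

Storing the resolved form means the same Maildir maps to one key no matter which symlink the user typed.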
2021-05-28 lei q|up: support v2:/path/to/inboxdir destination
This allows "lei-managed pseudo mailing lists" as described by Konstantin. Alternates use is optional and can be enabled via --shared. This doesn't manage or edit ~/.public-inbox/config; presumably there'll need to be some tweaking of search parameters before finalizing and making the inbox publicly accessible via HTTP/NNTP. Link: https://public-inbox.org/meta/20210426164454.5zd5kgugfhfwfkpo@nitro.local/T/
2021-05-28 lei: retry_reopen on read-only Xapian access
Xapian DBs may be modified by a parallel process while we're reading them, and Xapian's MVCC model places the burden on readers to retry operations. We'll also have retry_reopen croak instead of die on errors, which ought to help us track down some "Document not found" errors I've occasionally seen when using "lei <q|up>".
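The reader-side retry pattern can be sketched generically: run the operation, and if the database reports it was modified underneath us, reopen and try again. The Python below uses a toy `FakeDB` and a stand-in `DatabaseModified` exception so it runs without Xapian installed; the retry limit is an arbitrary assumption and none of this is the actual retry_reopen code.

```python
class DatabaseModified(Exception):
    """Stand-in for the error a reader gets after a writer changed the DB."""


def retry_reopen(db, op, max_tries=10):
    """Call op(db); on DatabaseModified, reopen the DB and try again."""
    for _ in range(max_tries):
        try:
            return op(db)
        except DatabaseModified:
            db.reopen()  # pick up the latest revision, then retry
    raise RuntimeError(f"gave up after {max_tries} attempts")


class FakeDB:
    """Toy database that stays stale until reopen() was called twice."""

    def __init__(self):
        self.stale = 2
        self.reopens = 0

    def reopen(self):
        self.reopens += 1
        self.stale -= 1

    def get_doc(self, docid):
        if self.stale > 0:
            raise DatabaseModified()
        return f"doc {docid}"


db = FakeDB()
result = retry_reopen(db, lambda d: d.get_doc(7))
print(result, "after", db.reopens, "reopens")
```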
2021-05-28 lei: restore working directory in more places
Every tick of the event loop can change the working directory, so we need to restore it for every client if they operate in different directories. This would be easier if we had openat(2) and friends in Perl; but Inline::C is practically required for lei, now.
2021-05-28 lei: handle a single IMAP message in most places
"lei import" can now import a single IMAP message via <imaps://example.com/MAILBOX/;UID=$UID> Likewise, "lei inspect" can show the blob information for UID URLs and "lei lcat" can display the blob without network access if imported. "lei lcat" also gets rid of some unused code and supports "blob:$OIDHEX" syntax as described in the comments (and used by our "text" output format). v2: enforce UID in URL, fail without v3: fix error reporting (s/fail/child_error/)
2021-05-28 lei_mail_sync: debug code for uncommitted txn
I'm not 100% sure why, but "lei up" seems to cause uncommitted transaction errors. LeiToMail calls sto->set_sync_info, but LeiXSearch should call sto->done and lms_commit, so I'm not sure where the uncommitted transaction is coming from...
2021-05-28 lei: add TODO item for FUSE mount
It seems possible and natural to allow browsing lei/store as a Maildir (as well as read-write JMAP/IMAP store).