path: root/MANIFEST
Date       Commit message
2021-03-24 lei mark: command for (un)setting keywords and labels
Only tested for keywords and labels with file inputs, so far; but it seems to do what it needs to do. There's a bit more redundant code than I'd like, and more opportunities for code sharing in the future. "lei import" will be expanded to support +kw:$KEYWORD and +L:$LABEL in the future.
2021-03-23 lei: share input code between convert and import
These commands accept mail the same way, and this forces us to maintain consistent input format support between commands. We'll be using this for "lei mark", too.
2021-03-21 lei: All Local Externals: bare git dir for alternates
This will be used for keyword (and label) storage for externals. We'll be using this to ensure we don't redundantly auto-import messages into lei/store if they're already in a local external (they can still be imported explicitly via "lei import").
2021-03-17 extindex: add some validation and config knobs for WWW
We'll try to share a bit more configuration with extindex entries for WWW PSGI usage.
2021-03-15 t/html_index: remove now-worthless test
This was for quote-folding behavior we had long ago, but it ended up just being yet another import test.
2021-03-15 test_common: add create_inbox helper sub
This saves over 100ms in t/lei-q-remote-import.t so far when TMPDIR is on an SSD. If we can memoize inbox creation to save a few dozen milliseconds every test, this could add up to noticeable savings across our entire test suite.
2021-03-12 msg_part_text: discover text in application/octet-stream
Some poorly-configured MUAs will send application/octet-stream even for text-only attachments. We can't expect all MUAs to be configured with proper MIME types, and there is plenty of historical mail that falls into this unfortunate category. v2: simplify the check and ensure returned text is Perl "utf8"
2021-03-10 doc: start glossary for overlapping concepts
This is intended to keep track of concepts with different terms between NNTP, IMAP, config file, lei storage, and upcoming JMAP support.
2021-03-04 lei q: import flags when clobbering/augmenting Maildirs
This will eventually be supported for other mail stores, but Maildir is the easiest to test and support, here. This lets us avoid a situation where flag changes get lost between search results.
2021-03-01 lei p2q: patch-to-query generator for "lei q --stdin"
Instead of teaching the to-be-implemented "lei show" to search threads/messages based on commits, this orthogonal sub-command is designed to generate queries for use with "lei q --stdin". URI-escaped query parameters may be generated with --uri for HTTP(S) public-inbox instances, but otherwise the output is designed for "lei q --stdin".

To find threads for a given git commit from a git worktree:

    lei p2q $COMMIT_OID | lei q --stdin -t ...

It can also read via --stdin|-

    curl $INBOX_URL/$MSGID/raw | lei p2q - | lei q --stdin -t

Or from the filesystem:

    lei p2q $(git format-patch -1) | lei q --stdin -t

This defaults to only generating "dfpost:"-prefixed terms since I've found those most useful for finding messages relating to a commit. This is subject to change.

--want=s@ is a comma-separated or multi-value list of prefixes that defaults to "dfpost7". Not all are implemented, yet, but s, dfn, dfpre, and dfpost all seem to mostly work. Phrase handling may need to be tweaked to work with Xapian.

OR, NEAR, ADJ, AND, NOT may be used with --want (e.g. --want=dfpost,OR,dfn).

Prefixing the field prefix with '+' or '-' (e.g. --want=+dfpost) generates "+dfpost:$EXTRACTED_OID" for Xapian.

For non-boolean search prefixes, wildcard (*) may also be supplied: (--want=dfn*)

For boolean search prefixes, suffixing the field prefix with a digit (e.g. --want=dfpost7) provides a minimum length, allowing truncated variations to be searched. This is helpful for finding older messages as git chooses longer dfpost|dfpre abbreviations as repos get larger.

Automatic date range generation is not implemented, yet.
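The minimum-length behavior described for --want=dfpost7 can be sketched roughly as follows. This is an illustrative Python approximation, not the actual Perl implementation: the helper name `dfpost_terms`, the regular expression, and the choice to join terms with OR from shortest to longest are all assumptions made for the example.

```python
import re

# Matches the "index <pre>..<post>" header lines found in git diffs and patches.
INDEX_RE = re.compile(r"^index ([0-9a-f]{7,40})\.\.([0-9a-f]{7,40})", re.M)

def dfpost_terms(patch_text, min_len=7):
    """Build an OR-joined query of dfpost: terms (hypothetical helper)."""
    terms = []
    for _pre, post in INDEX_RE.findall(patch_text):
        if set(post) == {"0"}:
            continue  # all-zero OID: the file was deleted, so no post-image blob
        # Emit every abbreviation from min_len up to the length seen in the patch.
        for n in range(min_len, len(post) + 1):
            term = "dfpost:" + post[:n]
            if term not in terms:
                terms.append(term)
    return " OR ".join(terms)

PATCH = """\
diff --git a/MANIFEST b/MANIFEST
index 1234567890..abcdef0123 100644
--- a/MANIFEST
+++ b/MANIFEST
"""
query = dfpost_terms(PATCH, min_len=8)
print(query)
```

Searching the shorter variations matters because a message written when the repository was smaller would have carried a shorter abbreviation of the same blob OID.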
2021-02-26 lei q: support mbox locking by default
While this diverges from mairix(1) behavior, it's the safer option. We'll follow Debian policy by supporting fcntl and dotlocks by default (in that order). Users who do not want locking can use "--lock=none". This will be used in a read-only capacity for watching mailboxes for keyword updates via inotify or EVFILT_VNODE.
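The "fcntl first, then dotlock" ordering can be illustrated with a small Python sketch. It is deliberately simplified and is not how lei implements it: the `mbox_lock` helper is hypothetical, and a real dotlock implementation would also need retries, timeouts, and stale-lock handling.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def mbox_lock(path, methods=("fcntl", "dotlock")):
    """Hold the requested locks on an mbox file (hypothetical helper, no retries)."""
    fh = open(path, "ab")
    dotlock = None
    try:
        if "fcntl" in methods:
            fcntl.lockf(fh, fcntl.LOCK_EX)  # blocks until the lock is granted
        if "dotlock" in methods:
            name = path + ".lock"
            # O_EXCL makes creation fail if another process holds the dotlock.
            os.close(os.open(name, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o444))
            dotlock = name  # only remember it once we actually own it
        yield fh
    finally:
        if dotlock is not None:
            os.unlink(dotlock)
        fh.close()  # closing the file also drops the fcntl lock

with tempfile.TemporaryDirectory() as tmp:
    mbox = os.path.join(tmp, "results.mbox")
    with mbox_lock(mbox) as fh:
        held = os.path.exists(mbox + ".lock")
    released = not os.path.exists(mbox + ".lock")
```

Passing an empty `methods` tuple in this sketch corresponds loosely to opting out of locking altogether.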
2021-02-26 lei q: -tt marks direct hits as "flagged"
This can be used to quickly distinguish messages which were direct hits when doing thread expansion vs messages that were merely part of the same thread. This is NOT mairix-derived behavior, but I occasionally found it useful when looking at results in an MUA to know whether a message was a direct hit or not. This makes "-t" consistent with non-"-t" cases as far as keyword reading goes.
2021-02-25 lei q: auto-memoize remote messages into lei/store
This lets users avoid network traffic on subsequent searches at the expense of local disk space. --no-import-remote may be specified to reverse this trade-off for users with little storage.
2021-02-24 lei <import|convert>: support NNTP sources
We can read NNTP in -watch and Net::NNTP is shipped with Perl5, so lei import and convert have no excuse not to support NNTP as a client. Authentication is not tested, yet; but should be close to what IMAP is like...
2021-02-24 add PublicInbox::URInntps package
We prefer the IANA-registered form of URIs to avoid confusing users, but the URI package has yet to support it. cf. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983419
2021-02-19 net_writer: start implementing IMAP write support
Requiring TEST_IMAP_WRITE_URL to be set to a writable IMAP server URL isn't ideal, but it works for now until we have time to setup a mock dovecot/cyrus/etc... instance for testing.
2021-02-18 lei: check for IMAP auth errors
We need to ensure authentication failures and error codes get propagated to the parent process(es) properly. v2: update MANIFEST v3: LeiAuth.pm ->_lei_cfg bit moved to a previous commit
2021-02-18 lei import: add IMAP and (maildir|mbox*):$PATHNAME support
This makes "lei import" more similar to "lei convert" and allows importing from disparate sources simultaneously. We'll also fix some ->child_error usage errors and make the style of the code more similar to the "lei convert" code. v2: fix missing requires
2021-02-18 lei convert: mail format conversion sub-command
This will make testing IMAP support for other commands easier, as it doesn't write to lei/store at all. Like the pager and MUA, "git credential" is always spawned by script/lei (and not lei-daemon) so it has a controlling terminal for password prompts. v2: fix missing requires, correct test ordering v3: ensure config exists for IMAP auth
2021-02-18 tests: setup_public_inboxes: use IMAP-friendly newsgroups
-imapd won't support newsgroups ending with /\.[0-9]+\z/ since it reserves those for partitioning inboxes into 50K slices. So bump the home[0-9]+ version and switch to IMAP-friendly newsgroup names.
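The naming restriction above amounts to a single pattern check. A minimal Python sketch follows; the function name `is_imap_friendly` and the sample newsgroup names are made up for illustration.

```python
import re

# Python's \Z matches only at the very end of the string, like Perl's \z.
SLICE_SUFFIX = re.compile(r"\.[0-9]+\Z")

def is_imap_friendly(newsgroup):
    """True if the name cannot be mistaken for a numbered mailbox slice."""
    return SLICE_SUFFIX.search(newsgroup) is None

for name in ("inbox.comp.test.1", "inbox.comp.test.v1"):
    print(name, is_imap_friendly(name))
```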
2021-02-11 doc: add lei-import(1)
2021-02-10 net_reader: new package split from -watch
We'll be using some of this for IMAP and NNTP support in lei, too. More will need to be done to improve code sharing and reusability, soon, but this is a start.
2021-02-10 use MdirReader in -watch and InboxWritable
MdirReader now handles files in "$MAILDIR/new" properly and is stricter about what it accepts. eml_from_path is also made robust against FIFOs while eliminating TOCTOU races between stat(2) and open(2) calls.
2021-02-10 lei: split out MdirReader package, lazy-require earlier
We'll do more requires in the top-level lei-daemon process to save work in workers. We can also work towards aborting on user errors in lei-daemon rather than worker processes. "lei import -f mbox*" is finally tested inside t/lei_to_mail.t
2021-02-08 lei import: support Maildirs
It seems to be working trivially, though I'm probably going to split out Maildir reading into a separate package rather than using LeiToMail.
2021-02-07 lei help: split out into separate file
We'll reword and improve formatting with non-breaking spaces ("\xa0") which are only replaced with SP after wrapping. Some terminology is shortened (e.g. "URL_OR_PATHNAME" => "LOCATION") to improve formatting. This also enables completion for -h/--help and lets us prioritize favored switch names while attempting to satisfy users relying on muscle memory from other tools.
2021-02-07 lei: add-external --mirror support
This can be useful for users who want to clone and mirror an existing public-inbox. This doesn't have update support, yet, so users will need to run "git fetch && public-inbox-index" for now.
2021-02-07 tests: split out lei-daemon.t from lei.t
This makes it easier for hackers to find daemon-specific tests and forces us to always test both daemon and oneshot mode.
2021-02-07 t/tests: split out setup_public_inboxes sub
We'll probably use this in many more existing places and likely change non-lei tests to use it.
2021-02-07 t/lei-externals: split out into separate test
This is still overloaded with "lei q" stuff, but that's somewhat inevitable.
2021-02-07 tests: add test_lei wrapper, split out t/lei-import.t
This will make it easier to maintain and test lei going forward; we need to be testing against existing read-only daemons. We'll also save ourselves some boilerplate by exporting all the Test::More methods directly in TestCommon. We'll start using this by splitting out the latest "lei import" tests into their own file.
2021-02-05 lei import: initial implementation
Only tested with .eml files so far, but Maildir + IMAP will be supported.
2021-02-04 lei q: support reading queries from stdin
This will be useful on shared machines when a user doesn't want search queries visible to other users looking at the ps(1) output or similar.
2021-02-03 lei: switch to use SEQPACKET socketpair instead of pipe
This will allow us to use larger messages and do progress reporting to accumulate in the main daemon.
2021-02-01 sharedkv: use lock_for_scope_fast
This allows us to avoid repeated open() and close() syscalls and speeds up the new xt/stress-sharedkv.t maintainer test by roughly 7%.
2021-02-01 ipc: switch wq to use the event loop
This will let us maximize the capability of our asynchronous git API. This lets us avoid relying on EOF to notify lei2mail workers; thus giving us the option of running fewer lei_xsearch worker processes in parallel than local sources. I tried using a synchronous git API; and even with libgit2 in the same process to avoid the IPC cost, it failed to match the throughput afforded by this change. This is because libgit2 is built (at least on Debian) with the SHA-1 collision code enabled and ubc_check stuff was dominating my profiles.
2021-02-01 doc: add lei-overview(7)
[ew: s/mboxrd/mboxcl2/ since that's what mutt uses]
2021-02-01 doc: start manpages for lei commands
Add manpages for lei and the currently implemented subcommands. The included options and their descriptions follow to a large degree the --help output, dropping some options that are not currently wired up.
2021-01-26 doc: start working on public-inbox-extindex(1) manpage
It's barely started. I began writing this weeks ago, but I'm still unsure about some behavioral/usability things and hoping work on lei(1) can flush them out.
2021-01-25 doc: re-add missing 1.6 release notes
I missed these during the merge :x
2021-01-22 lei: forget-external support with canonicalization
We'll do a better job canonicalizing URLs and path names for proper matching. Of course, users may edit the file outside of lei, so ensure we try both the canonicalized and as-is form provided by the user. I also don't think we'll need to store externals info in MiscIdx; just the config file is fine.
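The "try both the canonicalized and as-is form" lookup might look roughly like the Python sketch below. The normalization rules shown (lower-casing the host, forcing one trailing slash, resolving ".." in paths) are assumptions chosen for illustration and may differ from what lei actually does; both function names are hypothetical.

```python
import os
from urllib.parse import urlsplit, urlunsplit

def canonicalize(location):
    """Return a normalized URL or path name (illustrative rules only)."""
    if "://" in location:
        u = urlsplit(location)
        path = u.path.rstrip("/") + "/"  # exactly one trailing slash
        return urlunsplit((u.scheme.lower(), u.netloc.lower(), path, u.query, ""))
    return os.path.normpath(os.path.abspath(location))

def forget_external(externals, location):
    """Remove location from the set, trying the canonical form, then the as-is form."""
    for candidate in (canonicalize(location), location):
        if candidate in externals:
            externals.remove(candidate)
            return candidate
    return None

# The second entry stands in for a value a user typed into the config by hand.
externals = {"https://example.com/meta/", "/srv/../srv/inbox"}
removed_url = forget_external(externals, "HTTPS://Example.com/meta")
removed_path = forget_external(externals, "/srv/../srv/inbox")
```

The as-is fallback is what lets the hand-edited entry still be removed even though it does not match its own canonical form.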
2021-01-18 lei: q: results output to Maildir and mbox* working
All the augment and deduplication stuff seems to be working based on unit tests. OpPipe is a nice general addition that will probably make future state machines easier.
2021-01-15 lei: q: lock stdout on overview output
Most writes to stdout aren't atomic and we need locking to prevent workers from interleaving and corrupting JSON output. The one case stdout won't require locking is if it's pointed to a regular file with O_APPEND; as POSIX O_APPEND semantics guarantee atomicity.
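The decision described above (lock unless the output is a regular file opened with O_APPEND) can be sketched in Python as follows. The helper names `needs_lock` and `emit`, and the use of a separate lock file, are assumptions for illustration rather than a description of the Perl code.

```python
import fcntl
import os
import stat
import tempfile

def needs_lock(fd):
    """Return False only for a regular file opened with O_APPEND."""
    is_regular = stat.S_ISREG(os.fstat(fd).st_mode)
    is_append = bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_APPEND)
    return not (is_regular and is_append)

def emit(fd, record, lock_fd):
    """Write one record, serializing through lock_fd when writes may interleave."""
    if needs_lock(fd):
        # POSIX record locks are owned per process, so forked workers that
        # inherited the same lock_fd still exclude one another.
        fcntl.lockf(lock_fd, fcntl.LOCK_EX)
        try:
            os.write(fd, record)
        finally:
            fcntl.lockf(lock_fd, fcntl.LOCK_UN)
    else:
        os.write(fd, record)  # append-mode write to a regular file, no lock taken

with tempfile.TemporaryDirectory() as tmp:
    lock_fd = os.open(os.path.join(tmp, "lock"), os.O_RDWR | os.O_CREAT, 0o600)
    log_fd = os.open(os.path.join(tmp, "out.jsonl"),
                     os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    rd, wr = os.pipe()
    file_needs_lock = needs_lock(log_fd)
    pipe_needs_lock = needs_lock(wr)
    emit(wr, b'{"n":1}\n', lock_fd)
    emit(log_fd, b'{"n":2}\n', lock_fd)
    piped = os.read(rd, 64)
    for fd in (lock_fd, log_fd, rd, wr):
        os.close(fd)
```

The sketch ignores short writes; a fuller version would loop until the whole record has been written while still holding the lock.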
2021-01-14 lei: test SIGPIPE, stop xsearch workers on client abort
The new test ensures consistency between oneshot and client/daemon users. Cancelling an in-progress result now also stops xsearch workers to avoid wasted CPU and I/O. Note the lei->atfork_child_wq usage changes; they work around a bug in Perl 5: http://nntp.perl.org/group/perl.perl5.porters/258784 <CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com> This switches the internal protocol to use SOCK_SEQPACKET AF_UNIX sockets to prevent merging messages from the daemon to client to run pager and kill/exit the client script.
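The reason SOCK_SEQPACKET prevents message merging is that each send is delivered as one discrete record, unlike a stream socket or pipe where back-to-back writes can be read as a single blob. A short Python demonstration, with made-up message contents:

```python
import socket

# AF_UNIX + SOCK_SEQPACKET is available on Linux and most BSDs, but not everywhere.
daemon, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
daemon.send(b"exec pager")  # hypothetical message contents
daemon.send(b"exit 0")
first = client.recv(4096)   # returns only the first message, never both
second = client.recv(4096)
daemon.close()
client.close()
```

Even though both messages are queued before the client reads anything, each recv call returns exactly one of them.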
2021-01-12 lei: query: restore JSON output overview
This internal API is better suited for fork-friendliness (but locking + dedupe still needs to be re-added). Normal "json" is the default, though stream-friendly "concatjson" and "jsonl" (AKA "ndjson" AKA "ldjson") all seem to be working (though tests aren't working, yet). For normal "json", the biggest downside is the necessity of a trailing "null" element at the end of the array because of parallel processes, since (AFAIK) regular JSON doesn't allow trailing commas, unlike JavaScript.
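The trailing "null" trick can be shown with a small Python sketch. The function names and record contents are hypothetical; the point is only that every worker can end each object with a comma without knowing whether it is the last one.

```python
import json

def worker_chunk(results):
    """Each worker ends every object with a comma, never a bare last element."""
    return "".join(json.dumps(r) + ",\n" for r in results)

def assemble(chunks):
    """Open the array, append chunks in arrival order, and close it with null."""
    return "[" + "".join(chunks) + "null]"

out = assemble([
    worker_chunk([{"m": "a@example"}]),
    worker_chunk([{"m": "b@example"}, {"m": "c@example"}]),
])
parsed = json.loads(out)
```

Consumers of the array then need to skip the final null element.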
2021-01-12 lei_xsearch: transfer 4 FDs internally, drop IO::FDPass
It's easier to make the code more generic by transferring all four FDs (std(in|out|err) + socket) instead of omitting stdin. We'll be reading from stdin on some imports, and possibly outputting to stdout, so omitting stdin now would needlessly complicate things. The differences with IO::FDPass "1" code paths and the "4" code paths used by Inline::C and Socket::MsgHdr are far too much to support and test at the moment.
2021-01-12 cmd_ipc: send FDs with buffer payload
For another step in syscall reduction, we'll support transferring 3 FDs and a buffer with a single sendmsg/recvmsg syscall using Socket::MsgHdr if available. Beyond script/lei itself, this will be used for internal IPC between search backends (perhaps with SOCK_SEQPACKET). There's a chance this could make it to the public-facing daemons, too. This adds an optional dependency on the Socket::MsgHdr package, available as libsocket-msghdr-perl on Debian-based distros (but not CentOS 7.x and FreeBSD 11.x, at least). Our Inline::C version in PublicInbox::Spawn remains the last choice for script/lei due to the high startup time, and IO::FDPass remains supported for non-Debian distros. Since the socket name prefix changes from 3 to 4, we'll also take this opportunity to make the argv+env buffer transfer less error-prone by relying on argc instead of designated delimiters.
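Sending file descriptors together with a buffer in one sendmsg call, and framing argv+env by argc, can be sketched in Python as follows. The wire layout used here (argc, then argv, then KEY=VALUE pairs, all NUL-separated) is an assumption invented for the example and is not necessarily the layout lei uses; `send_cmd` and `recv_cmd` are hypothetical names.

```python
import array
import os
import socket

def send_cmd(sock, fds, argv, env):
    """Send FDs plus an argc-framed argv+env buffer in one sendmsg(2) call."""
    fields = [str(len(argv))] + list(argv) + [f"{k}={v}" for k, v in env.items()]
    payload = "\0".join(fields).encode()
    rights = array.array("i", fds).tobytes()
    sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])

def recv_cmd(sock, max_fds=4, bufsize=65536):
    """Receive the buffer and FDs; argc says where argv ends and env begins."""
    fd_arr = array.array("i")
    data, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_LEN(max_fds * fd_arr.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            usable = len(cdata) - (len(cdata) % fd_arr.itemsize)
            fd_arr.frombytes(cdata[:usable])
    fields = data.decode().split("\0")
    argc = int(fields[0])
    argv = fields[1:1 + argc]
    env = dict(f.split("=", 1) for f in fields[1 + argc:])
    return list(fd_arr), argv, env

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
rd, wr = os.pipe()
send_cmd(a, [rd, wr], ["q", "-t", "s:hello world"], {"PWD": "/tmp"})
fds, argv, env = recv_cmd(b)
os.write(fds[1], b"ok")   # the received FD refers to the same pipe
echoed = os.read(rd, 2)
```

Because the receiver counts argc fields instead of looking for a marker between argv and env, an argument that happens to look like a delimiter cannot be misparsed.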
2021-01-12 lei query + pagination sorta working
Parallelism and interactivity with pager + SIGPIPE need work; but results are shown and phrase search works without shell users having to apply Xapian quoting rules on top of standard shell quoting.
2021-01-01 lei: rename "extinbox" => "external"
The words "extinbox" and "extindex" are too close and easy to confuse with each other. Rename "extinbox" to "external", since these could be IMAP, JMAP or other non-public-inbox search APIs. Link: https://public-inbox.org/meta/20201226112649.GB6226@dcvr/
2021-01-01 ipc: generic IPC dispatch based on Storable
I intend to use this with LeiStore when importing from multiple slow sources at once (e.g. curl, IMAP, etc). This is because over.sqlite3 can only have a single writer, and we'll have several slow readers running in parallel. Watch and SearchIdxShard should also be able to use this code in the future, but this will be proven with LeiStore, first.
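The general shape of serialization-based IPC dispatch to a single writer process can be sketched in Python, with pickle standing in for Perl's Storable. Everything here (the `Store` class, the length-prefixed framing, the function names) is a hypothetical illustration and not the PublicInbox::IPC API.

```python
import os
import pickle
import socket
import struct

def _send(sock, obj):
    blob = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(blob)) + blob)  # 4-byte length prefix

def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None  # peer closed the socket
        buf += chunk
    return buf

def _recv(sock):
    hdr = _recv_exact(sock, 4)
    if hdr is None:
        return None
    (n,) = struct.unpack("!I", hdr)
    return pickle.loads(_recv_exact(sock, n))

class Store:
    """Stand-in for the single writer (hypothetical)."""
    def __init__(self):
        self.count = 0

    def add(self, msgid):
        self.count += 1
        return (self.count, msgid)

def spawn_worker(obj):
    """Fork a worker that serves method calls on obj until the parent hangs up."""
    parent, child = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        try:
            parent.close()
            while (req := _recv(child)) is not None:
                method, args = req
                _send(child, getattr(obj, method)(*args))
        finally:
            os._exit(0)
    child.close()
    return pid, parent

def call(sock, method, *args):
    """Dispatch one method call to the worker and wait for its result."""
    _send(sock, (method, args))
    return _recv(sock)

pid, sock = spawn_worker(Store())
r1 = call(sock, "add", "a@example")
r2 = call(sock, "add", "b@example")
sock.close()
os.waitpid(pid, 0)
```

All state changes happen inside the one worker process, which is what makes this pattern a fit for a store that tolerates only a single writer.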