about summary refs log tree commit homepage
path: root/t/imapd.t
DateCommit message (Collapse)
2021-10-01ds: simplify signalfd use
Since signalfd is often combined with our event loop, give it a convenient API and reduce the code duplication required to use it. EventLoop is replaced with ::event_loop to allow consistent parameter passing and avoid needlessly passing the package name on stack. We also avoid exporting SFD_NONBLOCK since it's the only flag we support. There's no sense in having the memory overhead of a constant function when it's in cold code.
2021-09-13tests: add require_cmd, require curl when needed
t/v2mirror.t and t/lei-mirror.t are now skipped when curl is missing (instead of failing in appropriate places). A bunch of which() checks are updated to use require_cmd to avoid explicitly loading Spawn.
2021-08-08tests: fix test failures when Xapian is missing
We still support usage without Xapian, so ensure our tests work when Xapian bindings are missing
2021-03-28test_common: require_mods bundles
This makes it easier to manage test dependencies on systems where optional stuff isn't installed. This fixes some lei tests which didn't check for Plack before starting -httpd, and ensures Parse::RecDescent is available for -imapd in case Mail::IMAPClient stops using it.
2021-03-15t/imapd: create_inbox (minor)
This saves over 100ms.
2021-02-24watch: switch IMAP and NNTP fetch loops to NetReader
NetReader::<imap|nntp>_each were based on the -watch code they now replace. v2: do not warn on EINTR if user quit to fix occasional test failure in t/imapd.t
2021-02-08tests: favor IPv6
IPv4 gets plenty of real-world coverage, and apparently there's Debian buildd hosts which lack IPv4(*). So ensure everything can work on IPv6 and not cause problems for odd setups. (*) https://bugs.debian.org/979432
2021-01-05imap: fix uninitialized var on MSN search miss
It seems only triggered by bots trying to steal information.
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-12-26inboxidle: avoid needless syscalls on refresh
We don't have to replace a bunch of existing watches with identical new ones. On Linux with Linux::Inotify2 installed, this avoids a storm of inotify_add_watch(2) and inotify_rm_watch(2) syscalls on SIGHUP with -imapd and "-extindex --watch"
2020-12-09rename {pi_config} fields to {pi_cfg}
{pi_config} may be confused with the documented `PI_CONFIG' environment variable, and we'll favor vowel-removal to be consistent with our usage of object references. The `pi_' prefix may stay in some places, for now; since a separate namespace may come into this codebase for local/private client-tooling. For InboxIdle, we'll also remove an invalid comment about holding a reference to the PublicInbox::Config object, too.
2020-09-15imap: quiet uninitialized variable warning on FETCH
This was triggered by blindly trying to FETCH an MSN (not "UID FETCH") on an empty dummy inbox. It's harmless, and probably triggered by a wayward client or misbehaving bot.
2020-09-15t/imapd.t: skip dependent test on failure
We don't want to cascade failures/warnings when something else breaks. There's likely more of these to be fixed as we encounter them.
2020-09-01rename WatchMaildir => Watch
This is no longer limited to Maildirs now that IMAP and NNTP support exist; so give it a shorter name.
2020-08-30imapd: filter out unusable flags from search
Quiet down logs from -imapd when clients are blindly sending some unsupported flag conditions (e.g. "DRAFT", "DELETED") specified in RFC 3501.
2020-08-20init: support --newsgroup option
We can reduce the need to edit the config file for NNTP group names this way.
2020-07-10imap: avoid warnings on non-slice mailboxes
Non-slice mailboxes never have messages themselves, so we must not assume a message exists when sending untagged EXISTS messages.
2020-07-05watch: don't burn CPU on IDLE failures
Network connections fail and need to be detected sooner rather than later during IDLE to avoid backtrace floods. In case the IDLE process dies completely, don't respawn right away, either, to avoid entering a respawn loop. There's also a typo fix :P
2020-06-28watch: just use ->urlmatch
We may just modify PublicInbox::Config->urlmatch in the future to support git <1.8.5, but I wonder if there's enough users on git <1.8.5 to justify it.
2020-06-28config: support ->urlmatch method for -watch
Since we have IMAP client support in -watch; make sure per-URL settings are familiar to git users by taking advantage of git's URL matching abilities. This requires git 1.8.5+, which most users ought to have (though base CentOS 7 is on 1.8.3).
2020-06-28watch: support IMAP polling
Not all IMAP servers support IDLE, and IDLE may be prohibitively expensive for some IMAP servers with many inboxes. So allow configuring a imap.$IMAP_URL.pollInterval=SECONDS to poll mailboxes. We'll also need to poll for NNTP servers in the future.
2020-06-28watch: remove Filesys::Notify::Simple dependency
Since we already use inotify and EVFILT_VNODE (kqueue) in -imapd, we might as well use them directly in -watch, too. This will allow public-inbox-watch to use PublicInbox::DS for timers to watch newsgroups/mailboxes and have saner signal handling in future commits.
2020-06-28watch: preliminary IMAP support
Only servers with IDLE are supported, for now. Polling will be needed since users may need to watch many inboxes with a few active connections due to IMAP server limitations.
2020-06-21tests: require git 2.6+ in more places
We also need to check for git 2.6 earlier in each test case, before any other TAP output is emitted to avoid confusing the TAP consumers.
2020-06-16imap: *SEARCH: reinstate "TEXT" search-key
I accidentally dropped "TEXT" handling while porting the IMAP search query parser to Parse::RecDescent. This reinstates it and adds a test to prevent future regression, and the additional test fixes a counting error for non-Xapian-enabled systems.
2020-06-16imap: *SEARCH: use Parse::RecDescent
For properly parsing IMAP search requests, it's easier to use a recursive descent parser generator to deal with subqueries and the "OR" statement. Parse::RecDescent was chosen since it's mature, well-known, widely available and already used by our optional dependencies: Inline::C and Mail::IMAPClient. While it's possible to build Xapian queries without using the Xapian string query parser; this iteration of the IMAP parser still builds a string which is passed to Xapian's query parser for ease-of-diagnostics. Since this is a recursive descent parser dealing with untrusted inputs, subqueries have a nesting limit of 10. I expect that is more than adequate for real-world use.
2020-06-16imap: reinstate non-UID SEARCH
Since we support MSNs properly, now, it seems acceptable to support regular SEARCH requests in case there are any clients which still use non-UID SEARCH.
2020-06-15t/imapd: quiet overload warning from Mail::IMAPClient
Mail::IMAPClient understandably stumbles into a warning by our bogus test request. Just silence it on our end since it's not normal operation for Mail::IMAPClient.
2020-06-15t/imapd*.t: support older Mail::IMAPClient
->has_capability on Mail::IMAPClient 3.37 (tested on CentOS 7.x) only returned boolean values, and not the value of the capability.
2020-06-15testcommon: allow OR-ing module dependencies
IMAP requires either the Email::Address::XS or Mail::Address package (part of perl-MailTools RPM or libmailtools-perl deb); and Email::Address::XS is not officially packaged for some older distros, most notably CentOS 7.x.
2020-06-13imap: remove non-UID SEARCH for now
Supporting MSNs in long-lived connections beyond the lifetime of a single request/response cycle is not scalable to a C10K scenario. It's probably not needed, since most clients seem to use UIDs. A somewhat efficient implementation I can come up uses pack("S*" ...) (AKA "uint16_t mapping[50000]") has an overhead of 100K per-client socket on a mailbox with 50K messages. The 100K is a contiguous scalar, so it could be swapped out for idle clients on most architectures if THP is disabled. An alternative could be to use a tempfile as an allocator partitioned into 100K chunks (or SQLite); but I'll only do that if somebody presents a compelling case to support MSN SEARCH.
2020-06-13imapd: don't bother sorting LIST output
The sort was unstable on my test instance anyways, and clients don't seem to mind. So stop wasting CPU cycles.
2020-06-13imap: wire up Xapian, MSN SEARCH and multi sequence-sets
Simple queries work, more complex queries involving parentheses, "OR", "NOT" don't work, yet. Tested with "=b", "=B", and "=H" search and limits in mutt on both v1 and v2 with multiple Xapian shards.
2020-06-13imap: FETCH: try to make fake MSNs sequentially
This appears to significantly improve header caching behavior with mutt. With the current public-inbox.org/git mirror(*), mutt will only re-FETCH the last ~300 or so messages in the final "inbox.comp.version-control.git.7" mailbox, instead of ~49,000 messages every time. It's not perfect, but a 500ms query is better than a >10s query and mutt itself spends as much time loading its header cache. (*) there are many gaps in NNTP article numbers (UIDs) due to spam removal from public-inbox-learn.
2020-06-13imap: LIST shows "INBOX" in all caps
While selecting a mailbox is done case-insensitively, "INBOX" is special for the LIST command, according to RFC 3501 6.3.8: > The special name INBOX is included in the output from LIST, if > INBOX is supported by this server for this user and if the > uppercase string "INBOX" matches the interpreted reference and > mailbox name arguments with wildcards as described above. The > criteria for omitting INBOX is whether SELECT INBOX will > return failure; it is not relevant whether the user's real > INBOX resides on this or some other server. Thus, the existing news.public-inbox.org convention of naming newsgroups starting with "inbox." needs to be special-cased to not confuse clients. While we're at it, do not create ".0" for dummy newsgroups if they're selected, either.
2020-06-13imap: remove dummies from sequence number FETCH
Dummy messages make for bad user experience with MUAs which still use sequence numbers. Not being able to fetch a message doesn't seem fatal in mutt, so just ignore (sometimes large) gaps.
2020-06-13imap: allow UID range search on timestamps
Since it seems somewhat common for IMAP clients to limit searches by sent Date: or INTERNALDATE, we can rely on the NNTP/WWW-optimized overview DB. For other queries, we'll have to depend on the Xapian DB.
2020-06-13imap: avoid uninitialized warnings on incomplete commands
No point in spewing "uninitialized" warnings into logs when the cat jumps on the Enter key.
2020-06-13imap: omit $UID_END from mailbox name, use index
Having two large numbers separated by a dash can make visual comparisons difficult when numbers are in the 3,000,000 range for LKML. So avoid the $UID_END value, since it can be calculated from $UID_MIN. And we can avoid large values of $UID_MIN, too, by instead storing the block index and just multiplying it by 50000 (and adding 1) on the server side. Of course, LKML still goes up to 72, at the moment.
2020-06-13imapd: ensure LIST is sorted alphabetically, for now
I'm not sure this matters, and it could be a waste of CPU cycles if no real clients care. However, it does make debugging over telnet or s_client a bit easier.
2020-06-13imap: require ".$UID_MIN-$UID_END" suffix
Finish up the IMAP-only portion of iterative config reloading, which allows us to create all sub-ranges of an inbox up front. The InboxIdler still uses ->each_inbox which will struggle with 100K inboxes. Having messages in the top-level newsgroup name of an inbox will still waste bandwidth for clients which want to do full syncs once there's a rollover to a new 50K range. So instead, make every inbox accessible exclusively via 50K slices in the form of "$NEWSGROUP.$UID_MIN-$UID_END". This introduces the DummyInbox, which makes $NEWSGROUP and every parent component a selectable, empty inbox. This aids navigation with mutt and possibly other MUAs. Finally, the xt/perf-imap-list maintainer test is broken, now, so remove it. The grep perlfunc is already proven effective, and we'll have separate tests for mocking out ~100k inboxes.
2020-06-13imap: case-insensitive mailbox name comparisons
IMAP RFC 3501 stipulates case-insensitive comparisons, and so does RFC 977 (NNTP). However, INN (nnrpd) uses case-sensitive comparisons, so we've always used case-sensitive comparisons for NNTP to match nnrpd behavior. Unfortunately, some IMAP clients insist on sending "INBOX" with caps, which causes problems for us. Since NNTP group names are typically all lowercase anyways, just force all comparisons to lowercase for IMAP and warn admins if uppercase-containing newsgroups won't be accessible over IMAP. This ensures our existing -nntpd behavior remains unchanged while being compatible with the expectations of real-world IMAP clients.
2020-06-13imap: support out-of-bounds ranges
"$UID_START:*" needs to return at least one message according to RFC 3501 section 6.4.8. While we're in the area, coerce ranges to (unsigned) integers by adding zero ("+ 0") to reduce memory overhead.
2020-06-13imapclient: wrapper for Mail::IMAPClient
We'll be using this wrapper class to workaround some upstream bugs in Mail::IMAPClient. There may also be experiments with new APIs for more performance.
2020-06-13imap: FETCH: support comma-delimited ranges
The RFC 3501 `sequence-set' definition allows comma-delimited ranges, so we'll support it in case clients send them. Coalescing overlapping ranges isn't required, so we won't support it as such an attempt to save bandwidth would waste memory on the server, instead.
2020-06-13imap: support the CLOSE command
It seems worthless to support CLOSE for read-only inboxes, but mutt sends it, so don't return a BAD error with proper use.
2020-06-13imap: do not include ".PEEK" in responses
They're not specified in RFC 3501 for responses, and at least mutt fails to handle it.
2020-06-13imap: support sequence number FETCH
We'll return dummy messages for now when sequence numbers go missing, in case clients can't handle missing messages.
2020-06-13imap: fix multi-message partial header fetches
We must keep the contents of {-partial} around when handling a request to fetch multiple messages.
2020-06-13imap: split out unit tests and benchmarks
This makes the test code easier-to-manage and allows us to run faster unit tests which don't involve loading Mail::IMAPClient.