path: root/lib/PublicInbox/NetReader.pm
Date Commit message
2021-04-30 net_reader: support (imap|nntp).proxy in config file
This allows us to use URL-matching config in git and specify proxies on a per-host basis. git 2.26+ users may use wildcards to enable Tor (on 127.0.0.1:9050) for all NNTP and IMAP .onion domains. My ~/.config/lei/config file has the following:

	[imap "imap://*.onion"]
		proxy = socks5h://127.0.0.1:9050
	[nntp "nntp://*.onion"]
		proxy = socks5h://127.0.0.1:9050
2021-04-30 net_reader: Net::NNTP --proxy=socks5h:// support
Since Net::NNTP doesn't support Socket or RawSocket options/accessors like Mail::IMAPClient does, we must perform localized @ISA manipulation and massage Net::NNTP into using IO::Socket::Socks rather than IO::Socket::IP. This is a bit fragile, but Net::Cmd and Net::NNTP rarely change, and I keep an eye on them anyway.
2021-04-30 lei: IMAP .onion support via --proxy=s switch
Mail::IMAPClient provides the ability to pass a pre-connected Socket to it. We can rely on this functionality to use IO::Socket::Socks in place of whatever socket class Mail::IMAPClient chooses to use. The --proxy=s switch is shared with curl(1), though we only support socks5h:// at the moment. Is there any need for SOCKS4 or SOCKS5 without name resolution? Tor .onions require socks5h:// for name resolution and to prevent data leakage.
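As a rough illustration of the "socks5h:// only" rule above, the following Python sketch validates a --proxy value before any connection is attempted. It is not the Perl code in NetReader; the function name `parse_proxy` and the fallback to port 1080 are assumptions made for this example.

```python
from urllib.parse import urlsplit


def parse_proxy(proxy):
    """Return (host, port) for a socks5h:// proxy URL, or raise ValueError.

    Hypothetical helper: only the socks5h scheme is accepted, mirroring
    the restriction described in the commit message.
    """
    u = urlsplit(proxy)
    if u.scheme != "socks5h":
        raise ValueError(f"unsupported proxy scheme {u.scheme!r}, only socks5h://")
    if not u.hostname:
        raise ValueError(f"no host in proxy URL {proxy!r}")
    # Assumption: fall back to 1080, the conventional SOCKS port
    return u.hostname, u.port or 1080
```

With the Tor address used elsewhere in this log, `parse_proxy("socks5h://127.0.0.1:9050")` yields the host and port to hand to a SOCKS-capable socket class.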
2021-04-30 net_reader: {nn,mic}_for: use prototypes for internal subs
We don't use these subs elsewhere, so stick prototypes on them to give them a little extra checking.
2021-04-30 lei import: support UIDVALIDITY in IMAP URL
Specifying a UIDVALIDITY value allows the user to enforce a strict match and force failure. This necessitated changes to NetReader to allow die() and make error reporting more suitable for CLI usage rather than daemonized usage of -watch.
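A minimal Python sketch of the strict-match idea, assuming the URL carries the value in the RFC 5092 form `;UIDVALIDITY=<n>` after the mailbox name. The function names are hypothetical and this is not the NetReader implementation, which is Perl.

```python
import re

# RFC 5092 style parameter, e.g. imaps://example.com/INBOX;UIDVALIDITY=123
_UIDVALIDITY_RE = re.compile(r";UIDVALIDITY=(\d+)", re.IGNORECASE)


def url_uidvalidity(url):
    """Return the UIDVALIDITY given in the URL, or None if absent."""
    m = _UIDVALIDITY_RE.search(url)
    return int(m.group(1)) if m else None


def check_uidvalidity(url, server_uidvalidity):
    """Fail loudly when the user pinned a UIDVALIDITY and the server disagrees."""
    want = url_uidvalidity(url)
    if want is not None and want != server_uidvalidity:
        raise RuntimeError(
            f"UIDVALIDITY mismatch: URL has {want}, server has {server_uidvalidity}")
    return server_uidvalidity
```

Raising instead of warning matches the CLI-oriented error handling the commit describes: a mismatch stops the import rather than being logged and skipped.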
2021-04-30 lei import: avoid IMAPTracker, use LeiMailSync more
IMAPTracker has a UNIQUE constraint on the `url' column, which may cause compatibility and/or rollback problems in attempting to deal with UIDVALIDITY changes. Having multiple sources of truth leads to confusion and bugs, so relying on LeiMailSync exclusively ought to simplify things. Furthermore, since LeiMailSync is only written to by LeiStore, it is safer in that it won't mark a UID or article as imported until git-fast-import has seen it, and the SQLite commit always happens after "done\n" is sent to fast-import.

This mostly reverts recent commits to IMAPTracker to support lei. Those are:

1) commit 7632d8f7590daf70c65d4270e750c36552fa9389 ("net_reader: restart on first UID when UIDVALIDITY changes")
2) commit 311a5d37ad275cd75b1e64d87827c4d13fe4bfab ("imap_tracker: prepare for use with lei")

This means public-inbox-watch will not change between 1.6 and 1.7: -watch stops synching a folder when UIDVALIDITY changes.
2021-04-24 net_reader: imap_each: add UIDVALIDITY to URL arg
This will allow the callback to reliably maintain OID <=> UID mappings between lei/store and the IMAP folder.
2021-04-23 net_reader: restart on first UID when UIDVALIDITY changes
In other words, treat the same IMAP folder with a different UIDVALIDITY as a completely different folder. If the UIDVALIDITY changes, we can start from UID=1 without falling behind or losing data. If the UIDVALIDITY gets reset to a previously known-good message, we can still resume where we left off before the first UIDVALIDITY change. This affects public-inbox-watch and "lei import".

One potential downside of this is for rare altid users, but that's mainly intended for NNTP article numbers which are/were often publicized, not IMAP UIDs which are rarely publicized.

The other potential downside is bandwidth waste in the rare case UIDVALIDITY changes while IMAP folder contents remain unchanged. There's no extra storage used due to existing (v1|v2|lei/store) deduplication mechanisms.

Before this change, we were matching offlineimap behavior and stopped synching an IMAP folder when its UIDVALIDITY changed. offlineimap behavior made sense for IMAP <=> Maildir synchronization since Maildirs had no sense of UIDVALIDITY and could only rely on name mapping.
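The "different UIDVALIDITY means a different folder" behavior can be sketched by keying the resume point on the (folder URL, UIDVALIDITY) pair. This Python class is hypothetical and kept in memory for illustration; it is not the IMAPTracker schema, which lives in SQLite and is written in Perl.

```python
class UidTracker:
    """Remember the last UID seen per (folder URL, UIDVALIDITY) pair."""

    def __init__(self):
        self._last = {}  # (folder_url, uidvalidity) -> last UID imported

    def next_uid(self, folder_url, uidvalidity):
        # An unseen UIDVALIDITY has no entry, so fetching restarts at UID 1
        return self._last.get((folder_url, uidvalidity), 0) + 1

    def update(self, folder_url, uidvalidity, uid):
        self._last[(folder_url, uidvalidity)] = uid
```

Because old pairs are never discarded, a folder whose UIDVALIDITY returns to an earlier value resumes from the UID recorded for that value, as the message above describes.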
2021-04-22 lei import: --incremental default for NNTP and IMAP
No point in burning through bandwidth to import stuff we already saw. All this logic is shared with -watch but uses a different pathname for lei since it's tied to lei/store (and not a public-inbox).
2021-04-22 lei: flesh out `forwarded' kw support for Maildir and IMAP
Maildir and IMAP can both handle `forwarded'. Ensure we don't lose `forwarded' when reading from stores which do not support it, but ensure we can set it when reading from IMAP and Maildir stores.
2021-04-03 net_reader: fix read-only "lei convert" auth failures
"convert" is actually a bit more complicated than "lei import" since it may need auth for either input or output.
2021-04-03 lei q: ensure wq workers shutdown on IMAP auth failures
Leaving workers running after auth failures is bad and messy, so clean up our process management to have consistent worker teardowns. Improve error reporting, too, instead of letting Mail::IMAPClient->exists fail due to undef.
2021-03-24 net_reader: nntp_each: pass keywords as `undef'
We'll use `undef' to denote keywords are unknown/unsupported, instead of an empty arrayref. This will let callers use the same callback and args for imap_each. Passing an empty arrayref to set_eml in LeiStore causes keywords to be cleared completely, which is not desired behavior when "lei import" is importing already-seen messages from NNTP.
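The distinction between "keywords unknown" and "no keywords" can be sketched in Python with `None` standing in for Perl's `undef`. The function `set_keywords` and the dict-based store are hypothetical stand-ins, not the set_eml interface of LeiStore.

```python
def set_keywords(stored, msg_id, kw):
    """Update keywords for msg_id and return the resulting set.

    kw is None  -> source cannot report keywords (e.g. NNTP): keep what we have
    kw is a list -> source is authoritative: replace, even if the list is empty
    """
    if kw is None:
        return stored.get(msg_id, set())
    stored[msg_id] = set(kw)
    return stored[msg_id]
```

Under this convention, re-importing an already-seen message from NNTP leaves a previously set `seen` keyword intact, while an IMAP source reporting an empty list clears it.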
2021-03-23 net_reader: escape nasty chars from Net::NNTP->message
Net::Cmd::message (used by Net::NNTP) does no escaping at all, so "\r" was causing confusing/nonsensical error messages when I tried to import from the wrong group.
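To show why escaping matters here, this Python sketch makes control characters in a server response visible before the text reaches an error message. The helper name `escape_ctl` is an assumption, and the actual fix in NetReader is written in Perl and may escape a different set of characters.

```python
def escape_ctl(s):
    """Replace non-printable characters (such as "\\r") with visible escapes."""
    return "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in s
    )
```

A raw carriage return moves the terminal cursor back to the start of the line, so later text overwrites the beginning of the error. After escaping, it appears as a literal backslash followed by `r`.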
2021-03-10 watch: IMAP: ignore \Deleted and \Draft messages
This matches existing Maildir behavior, as trash and draft messages have little reason to be exposed publicly.
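A small Python sketch of the filtering rule, assuming the message flags arrive as a list of strings such as `\Seen`. The names `SKIP_FLAGS` and `should_import` are hypothetical; the real check lives in the Perl -watch code.

```python
# IMAP system flags are case-insensitive, so compare in lowercase
SKIP_FLAGS = {"\\deleted", "\\draft"}


def should_import(flags):
    """Return False for messages flagged as trash or drafts."""
    return not any(f.lower() in SKIP_FLAGS for f in flags)
```

A message flagged only `\Seen` passes, while one carrying `\Deleted` or `\Draft` is skipped.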
2021-03-04 lei q: support --import-augment for IMAP
IMAP is similar to Maildir and we can now preserve keyword updates done on IMAP folders.
2021-02-24 net_reader: trim exports and remove unused uri_new
More network things for -watch are isolated in NetReader, now, so fewer exports are necessary.
2021-02-24 watch: switch IMAP and NNTP fetch loops to NetReader
NetReader::<imap|nntp>_each were based on the -watch code they now replace.

v2: do not warn on EINTR if user quit to fix occasional test failure in t/imapd.t
2021-02-24 lei <import|convert>: support NNTP sources
We can read NNTP in -watch and Net::NNTP is shipped with Perl5, so lei import and convert have no excuse not to support NNTP as a client. Authentication is not tested, yet; but should be close to what IMAP is like...
2021-02-22 net_reader: mic_get: reuse connections if cache enabled
We only enable {mic_cached} in WQ workers, and those aren't expected to fork again going forward. So caching here avoids a penalty for the non-augmenting (imap_delete_all) call with "lei q".
2021-02-22 lei convert: auth directly from worker process
Since this only has one worker, we can auth directly in the worker since the convert worker now has access to the script/lei {sock} for running "git credential".
2021-02-21 lei2mail: parallel augment for lock-free stores
This lets us make use of multiple cores on IMAP and Maildir backed by SSD (or better) storage. This benefits IMAP stores with high network latency, but may still penalize IMAP servers with rotational storage.
2021-02-21 net_reader: use and accept URIimap objects in more places
This flexibility should save us some code down-the-line.
2021-02-21 lei q: support IMAP/IMAPS --output destinations
Augment (and dedupe) aren't parallel yet, so it's more sensitive to high-latency networks.
2021-02-19 net_writer: start implementing IMAP write support
Requiring TEST_IMAP_WRITE_URL to be set to a writable IMAP server URL isn't ideal, but it works for now until we have time to set up a mock dovecot/cyrus/etc... instance for testing.
2021-02-19 net_reader: handle single-message IMAP mailboxes
Due to an off-by-one error, we were unable to read mailboxes with only a single message of UID:1. Without this fix, the message with UID:1 could only be read after UID:2 was created; so there's no permanent data loss as long as a new message showed up. This affects all releases of public-inbox-watch with IMAP support, though it probably went unnoticed because single message inboxes are rare.
2021-02-18 lei: check for IMAP auth errors
We need to ensure authentication failures and error codes get propagated to the parent process(es) properly.

v2: update MANIFEST
v3: LeiAuth.pm ->_lei_cfg bit moved to a previous commit
2021-02-18 lei convert: mail format conversion sub-command
This will make testing IMAP support for other commands easier, as it doesn't write to lei/store at all. Like the pager and MUA, "git credential" is always spawned by script/lei (and not lei-daemon) so it has a controlling terminal for password prompts.

v2: fix missing requires, correct test ordering
v3: ensure config exists for IMAP auth
2021-02-18 lei import: start rearranging code for IMAP support
More to come in a later commit; some error handling and failure modes will be trickier with IMAP due to authentication.
2021-02-18 watch: connect to NNTP and IMAP in config order
This is hopefully less surprising to users when they're prompted for credentials.
2021-02-18 watch: move imap_common_init to NetReader
We'll use this in LeiImport and likely other places.
2021-02-10 net_reader: new package split from -watch
We'll be using some of this for IMAP and NNTP support in lei, too. More will need to be done to improve code sharing and reusability, soon, but this is a start.