path: root/lib/PublicInbox/NetReader.pm
Date Commit message
2024-01-11 lei+net_reader: show NNTP message in more failures
Showing absolutely nothing when hitting a server requiring authentication is a very bad user experience. While we're at it, use Net::Cmd->message in more places where we experience failure, too.
2024-01-11 net_reader: fix NNTP credential use
Clearly this was never tested until now, as passwords being retrieved by git-credential got completely ignored and unused. This enables users to connect to NNTP(S) servers requiring a password.
2023-11-09 net: retry on EINTR and check for {quit} flag
This should allow us to detect shutdown signals in -watch more quickly and not unnecessarily fail on inconsequential signals such as SIGWINCH.
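The retry-and-check pattern described above can be sketched in Python as follows. This is only an illustration, not the module's Perl code: `retry_on_eintr`, `should_quit`, and `max_tries` are hypothetical names, and Python's standard library already retries most system calls on EINTR by itself, so the interruption is simulated here.

```python
import errno


def retry_on_eintr(op, should_quit, max_tries=100):
    """Call op() until it succeeds, retrying only when it is interrupted.

    should_quit() is checked before every attempt so that a shutdown
    request is noticed promptly instead of after a long blocking call.
    Returns None when a shutdown was requested.
    """
    for _ in range(max_tries):
        if should_quit():
            return None
        try:
            return op()
        except InterruptedError:  # OSError subclass for errno EINTR
            continue  # inconsequential signal (e.g. SIGWINCH): try again
    raise OSError(errno.EINTR, "still interrupted after retries")


# Simulated operation that is interrupted twice before succeeding.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise InterruptedError(errno.EINTR, "Interrupted system call")
    return "ok"


result = retry_on_eintr(flaky, lambda: False)
stopped = retry_on_eintr(flaky, lambda: True)
```

Checking the flag before each attempt, rather than only after a failure, is what lets a shutdown signal win over a retry.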
2023-10-03 net_reader: note glob support in .onion hint
It's only available for git 2.26+ users, but I figure most distros have it at this point.
2023-10-03 net_reader: process title reflects NNTP article
This matches the IMAP behavior with UIDs. While we're in the area, cut down on LoC a bit and reduce the scope of the $art variable.
2023-10-03 net_reader: support imap.sslVerify + nntp.sslVerify
These options are useful for testing, as well as for users who are stuck on out-of-date systems, dealing with forgetful sysadmins or broken cronjobs, and/or willing to risk MITM attacks.
2023-10-03 config: fix key-only truthy values with urlmatch
When using --get-urlmatch, we need a way to distinguish between a key-only entry and a `key=val' pair, even if the `val' is empty. In other words, git interprets `-c imap.debug' as true and `-c imap.debug=' as false, but an untyped --get-urlmatch invocation has no way to distinguish between them. So we must specify we want `--bool' (we're avoiding `--type=bool' since that only appears in git 2.18+).
Fixes: f170d220f876 (lei: fix `-c NAME=VALUE' config support)
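To make the key-only versus empty-value distinction concrete, here is a simplified Python model of how a boolean config value is interpreted. It is a sketch only: `git_bool` is a hypothetical helper, `None` stands in for a key given without `=`, and git's real parser accepts more spellings (for example integers with unit suffixes) than are handled here.

```python
def git_bool(value):
    """Interpret a config value roughly the way `git config --bool' does.

    value is None for a key-only entry (`-c imap.debug') and a string
    (possibly empty) for a `key=val' pair (`-c imap.debug=').
    """
    if value is None:
        return True  # key-only entries are true
    v = value.strip().lower()
    if v in ("true", "yes", "on"):
        return True
    if v in ("", "false", "no", "off"):
        return False  # an explicitly empty value is false
    return int(v) != 0  # plain integers: non-zero is true


key_only = git_bool(None)
empty_value = git_bool("")
```

An untyped lookup would report an empty string in both cases, which is why the caller has to ask for a boolean explicitly.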
2023-10-03 net_reader: avoid IO::Socket::SSL 2.079..2.081 warning
IO::Socket::SSL had an uninitialized variable warning from a bad regexp for a few releases. This will also prepare us to support imap.sslverify as git does and possibly other TLS-related options.
2023-10-03 net_reader: bail out on NNTP SOCKS connection failure
It takes some effort to get Net::NNTP and IO::Socket::Socks to play nice together, but we don't want the setsockopt call to fail on an undefined value. So die with an error that tries to show various possible error sources. $SOCKS_ERROR is a special variable, so even using the `//' (defined-or) operator isn't enough to squelch warnings about using it in its uninitialized state.
2023-03-14 use v5.12 for various network client-side packages
None of these are affected by the Perl unicode_strings feature, so they can `use v5.12' safely.
2022-09-10 lei: bail out earlier on IMAP writer failures
Excessive IMAP connections can overload IMAP servers and cause clients to be disconnected without diagnostic messages. Use $lei->fail on these exceptions to propagate errors to the CLI ASAP to avoid further errors down the line. This ought to make problems more apparent for users using IMAP destinations.
Reported-by: Ricardo Ribalda <ribalda@chromium.org>
Link: https://public-inbox.org/meta/CANiDSCsDfutAUMBLPZbxdyka+_jnhv+4YNYdL9QPRoC=wNUGCQ@mail.gmail.com/
2022-04-05 lei: always open mail_sync.sqlite3 R/W
This will make transparently upgrading from 1.7.0 -> 1.8.x easier. Only a single user has access to mail_sync.sqlite3, and R/W at the kernel-level is required for WAL, anyways.
2021-10-22 lei export-kw: don't recreate deleted IMAP folders
In case an IMAP folder is deleted, just set an error and ignore it rather than creating an empty folder which we attempt to export keywords to for non-existent messages.
2021-10-09 net_reader: hoist out _imap_fetch_bodies
We'll be supporting pipelining in a future commit, since Tor is too slow and increasing batch size can use too much memory.
2021-09-26 net_reader: drop support for IgnoreSizeErrors option
Only the ->message_string method of Mail::IMAPClient uses it, and we have no intention of using ->message_string outside of tests.
2021-09-19 net_reader: NNTP: remove article numbers from mail_sync folders
NNTP article numbers are stored separately from folder names in mail_sync.sqlite3. Recovering from this is optional; the worst case is wasting bandwidth refetching some messages. To (optionally) recover from this, use:

    lei forget-mail-sync $URL_WITH_ARTNUMS

Some articles will be refetched on the next import, but duplicate data won't be indexed in Xapian.
2021-09-19 net_reader: disallow imap.fetchBatchSize=0
A batch size of zero is nonsensical and causes infinite loops.
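Why a zero batch size loops forever is easy to see in a small Python sketch. The function name `split_batches` and the error message are illustrative assumptions and do not reflect the module's actual code.

```python
def split_batches(uids, batch_size):
    """Split uids into consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        # Reject the value up front instead of looping forever below.
        raise ValueError("batch size must be a positive integer")
    out = []
    start = 0
    while start < len(uids):
        out.append(uids[start:start + batch_size])
        start += batch_size  # with batch_size == 0 this would never advance
    return out


chunks = split_batches([1, 2, 3, 4, 5], 2)
```

Without the guard, `start` stays at zero and the `while` condition never becomes false.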
2021-09-19 net_reader: no STARTTLS for IMAP localhost or onions
At least not by default, to match existing NNTP behavior. Tor .onions are already encrypted, and there's no point in encrypting traffic on localhost outside of testing.
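A default along these lines could be expressed as the following Python sketch. The helper name `starttls_by_default` and the exact set of hosts treated as local are assumptions made for illustration; the module may recognize local hosts differently.

```python
def starttls_by_default(host):
    """Return False for hosts where STARTTLS adds nothing by default."""
    host = host.lower().rstrip(".")
    if host.endswith(".onion"):
        return False  # Tor already encrypts traffic to .onion services
    if host in ("localhost", "127.0.0.1", "::1"):
        return False  # loopback traffic never leaves the machine
    return True


onion_default = starttls_by_default("example.onion")
local_default = starttls_by_default("localhost")
remote_default = starttls_by_default("mail.example.com")
```

This only describes the default; an explicit user setting would still take precedence.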
2021-09-19 net_reader: fix single NNTP article fetch, test ranges
While NNTP ranges were already working, fetching a single message was broken. We'll also simplify the code a bit and ensure incremental synchronization is ignored when ranges are specified.
2021-09-19 net_reader: quote URL properly for Tor .onion hint
The semicolon in ';AUTH=ANONYMOUS' requires quoting in Bourne shell.
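The quoting problem can be demonstrated with Python's `shlex` module. The URL below is a made-up example, and the actual hint text printed by the module is not reproduced here.

```python
import shlex

# Hypothetical URL; an unquoted ';' would end the command in Bourne shell.
url = "imap://example.onion/INBOX;AUTH=ANONYMOUS"

# shlex.quote wraps the string in single quotes when it contains
# characters that are special to the shell, such as ';'.
quoted = shlex.quote(url)

# Parsing the quoted form back yields the original URL as one word.
round_trip = shlex.split("echo " + quoted)
```

Printing `quoted` instead of `url` in a hint means users can paste the command into a shell without it being cut at the semicolon.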
2021-09-18 net_reader: set SO_KEEPALIVE on all Net::NNTP sockets
SO_KEEPALIVE can prevent stuck processes and is safe to enable unconditionally on all TCP sockets (like git, and the rest of public-inbox does). Verified via strace on both NNTP and NNTPS with and without nntp.proxy=socks5h://...
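For readers unfamiliar with the socket option, this is what enabling it looks like with Python's standard `socket` module. It is a sketch of the general technique on an unconnected TCP socket, not of how the module reaches into Net::NNTP; `enable_keepalive` is a hypothetical helper.

```python
import socket


def enable_keepalive(sock):
    """Turn on TCP keepalive probes for sock and return it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
# Read the option back; the kernel reports a non-zero value when enabled.
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```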
2021-09-18 net_reader: support imaps:// w/ socks5h:// proxy
While non-TLS IMAP worked perfectly with IO::Socket::Socks and Mail::IMAPClient, we need to wrap the IO::Socket::Socks object with IO::Socket::SSL before handing it to Mail::IMAPClient.
2021-09-18 net_reader: detect IMAP failures earlier
A Mail::IMAPClient object may be returned even on connection failure, so use IsConnected to check for it. This ensures git-credential will no longer prompt for passwords when there's no connection.
2021-09-18 net_reader: tie SocksDebug to {imap,nntp}.Debug
I think tying IO::Socket::Socks debugging to existing debug switches is enough, and there's no need to introduce a separate socks.Debug parameter.
2021-09-16 net_reader: load IO::Socket::Socks in all workers
This was previously undetected since SOCKS is mainly used for read-only (single worker) tasks, and worker[0] always loaded the module. However, "lei refresh-mail-sync" can bounce reads to any worker, so we need to ensure worker[1..Inf] load it, too.
2021-09-16 net_reader: emit .onion help for potential Tor users
We can't easily use torsocks, here, so try to be helpful when it comes to proxy support.
2021-09-10 lei: do not read ~/.netrc by default
Since ~/.netrc isn't widely used by most (if any) NNTP and IMAP clients, we won't read it by default for lei. AFAIK, ~/.netrc is mainly used by FTP clients (e.g. ftp(1) and lftp(1)). wget uses it by default for HTTP(S) (and FTP), but curl does not. To avoid breaking stable release use cases, public-inbox-watch continues to read ~/.netrc by default. The --netrc switch is supported by all existing lei commands which may use curl.
2021-09-09 net_reader: support Mail::IMAPClient Ignoresizeerrors
Some proprietary servers may do wacky things and give the wrong size, so Mail::IMAPClient has a knob for this which we can expose to users to work around this.
2021-09-09 net_reader: improve naming of common args
IMHO this makes things easier-to-follow than before.
2021-09-09 net_reader: combine Net::NNTP and IMAPClient args
Since these are keyed by IMAP and NNTP URIs which can never conflict, it simplifies our internals to keep them in one big hash since we'll add POP3 and JMAP client support.
2021-09-09 net_reader: imap_opt => cfg_opt
Since our internal IMAP options are keyed by URI section, there's no need to have separate hashes for NNTP and IMAP options since the URI already distinguishes them. This will make future changes to support POP3 and JMAP and arg caching with lei/store easier.
2021-09-09 net_reader: nntp_opt => cfg_opt
Since our internal NNTP options are keyed by URI section, there's no need to have separate hashes for NNTP and IMAP options since the URI already distinguishes them. This will make future changes to support POP3 and JMAP and arg caching with lei/store easier.
2021-09-09 net_reader: preserve memoized IMAPClient arg for SOCKS
Multiple invocations of mic_new may happen in long-lived processes, so do not let mic_new make irreversible changes to the cached args when using a SOCKS proxy.
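The general fix for this class of bug is to copy cached arguments before applying per-call changes. The Python sketch below shows the idea only; `connect_args` and the `server`, `port`, and `proxy` keys are hypothetical and do not mirror the real mic_new arguments.

```python
def connect_args(cached, proxy=None):
    """Build per-connection arguments without touching the cached dict."""
    args = dict(cached)  # shallow copy, so the memoized dict stays intact
    if proxy is not None:
        # Per-call changes happen on the copy only.
        args["proxy"] = proxy
        args.pop("server", None)
    return args


cached = {"server": "example.onion", "port": 143}
first = connect_args(cached, proxy="socks5h://127.0.0.1:9050")
second = connect_args(cached, proxy="socks5h://127.0.0.1:9050")
```

Because the cache is never modified, the second call in a long-lived process sees exactly the same inputs as the first.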
2021-09-09 net_reader: set IMAPClient Keepalive flag late
Since we always enable SO_KEEPALIVE unconditionally, having it in {mic_arg} leads to unnecessary IPC overhead and memory use.
2021-09-09 net_reader: do not set "SSL" fields for non-TLS
This will save a little bit of memory and IPC I/O for users connecting to localhost and the majority of Tor .onions.
2021-09-06 net_reader: don't approve/reject credentials w/o "fill"
Credentials sourced via ~/.netrc should not be written to git-credential.
2021-09-03 lei: fix read/write IMAP access
xt/net_writer-imap.t was completely broken in recent months and I completely forgot this test. net->add_url still only accepts bare scalars (and not scalar refs), so we must set that up properly. Furthermore, our changes to do FLAGS-only synchronization of old messages in lei were causing us to not handle FLAGS properly for the test.
2021-07-03 lei import: increase flags search batch size, display progress
IMAP flag-only synchronization doesn't fetch entire messages, so we can safely bump the batch size iff a user specified one for full messages to 10000 times that. Since I sometimes wonder why nothing happens for several seconds after starting "lei import $URL", we'll also show some progress during the flag synchronization phase.
2021-06-14 lei index+import: reject keywords from R/O IMAP
Since users can't set IMAP flags in read-only IMAP folders, we won't clobber local flags when importing from IMAP. This also enables the local_blob fallback used for lei-index to be used for index deduplication.
2021-06-13 net_reader: canonicalize URL args on add_url
This fixes cases when users specify an IMAP or NNTP URL with standard port numbers explicitly. In other words, this allows users to use "lei ls-mail-source nntps://public-inbox.org:563/" and "lei ls-mail-source imaps://public-inbox.org:993/" without hitting "BUG:" errors.
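Canonicalization of this kind amounts to dropping a port that equals the scheme's default, so that both spellings map to the same key. A minimal Python sketch follows, with `canonicalize` and `DEFAULT_PORTS` as illustrative names; the module itself relies on Perl URI handling, which is not shown here.

```python
from urllib.parse import urlsplit, urlunsplit

# Standard ports for the schemes mentioned above.
DEFAULT_PORTS = {"imap": 143, "imaps": 993, "nntp": 119, "nntps": 563}


def canonicalize(url):
    """Drop an explicit port when it is the default for the URL scheme."""
    parts = urlsplit(url)
    netloc = parts.netloc
    default = DEFAULT_PORTS.get(parts.scheme)
    if default is not None and parts.port == default:
        netloc = netloc.rsplit(":", 1)[0]  # strip the trailing ":port"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


nntps_url = canonicalize("nntps://public-inbox.org:563/")
imaps_url = canonicalize("imaps://public-inbox.org:993/")
```

Non-default ports are left alone, since they are part of what identifies the server.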
2021-06-12 lei ls-mail-source: list IMAP folders and NNTP groups
While other tools can provide the same functionality, having integration with git-credential is convenient, here. Caching and completion will be implemented separately.
2021-06-09 lei prune-mail-sync: new command to prune invalid sync data
This will be invoked automatically by "lei import" eventually, but it may make sense to expose as a separate command.
2021-06-03 lei import: speed up kw updates for old IMAP messages
On a 4-core CPU, this speeds up "lei import" on a largish IMAP inbox with 75K messages from ~21 minutes down to 40s. Parallelizing with the new LeiImportKw WQ worker class gives a near-linear speedup and brought the runtime down to ~5:40. The new idx_fid_uid index on the "fid" and "uid" columns of blob2num in mail_sync.sqlite3 brought us the final speedup. An additional index on over.sqlite3#xref3(oidbin) did not help, since idx_nntp already exists and speeds up the new ->oidbin_exists internal API.
I initially experimented with a separate "lei import-kw" command but decided against it since it's useless outside of IMAP+JMAP and would require extra cognitive overhead for both users and hackers. So LeiImportKw is just a WQ worker used by "lei import" and not its own user-visible command.
v2: fix ikw_done_wait arg handling (ugh, confusing API :x)
2021-06-01 lei import: reduce writes to lei/store on IMAP sync
We don't need to write VMD changes to lei/store if local keywords are unchanged.
2021-05-30 lei import: import IMAP flag changes from old messages
This makes "lei import" behavior with IMAP folders more consistent with that with Maildir. Opening IMAP folders read-write with "SELECT" (instead of read-only with "EXAMINE") was necessary, since it lets an IMAP server communicate to us as to whether or not it's worth refetching IMAP flags of previously imported messages. Fetching UID+FLAGS only is one of the fastest IMAP operations with dovecot, our -imapd and presumably other common IMAP servers. It is issued by common MUAs such as mutt after every SELECT. Users may now rely on "lei import" exclusively to merge mail and keywords into lei/store, and "lei export-kw" to propagate keyword changes back to IMAP servers.

A sticks-and-stones workflow for personal mailboxes is currently:

    lei import imaps://$MY_PERSONAL_INBOX
    lei q --mua=$MUA -o /tmp/results SEARCH TERMS...
    # do stuff from within $MUA to /tmp/results
    lei import /tmp/results # read keyword changes from MUA
    lei export-kw imaps://$MY_PERSONAL_INBOX
    # repeat when new stuff shows up in personal inbox

The next goal is to automate repeated imports + export-kw commands with inotify and IMAP IDLE.
2021-05-30 lei import|lcat: improve+fix single message IMAP support
lcat can now dump the memoized contents of entire IMAP folders, not just a single UID. It's now parallelized and pipelined for multiple lei2mail workers. Furthermore, various forms of JSON output work consistently with blob-only output, now. While working on this, I noticed NetReader was passing UID URLs to imap_each callbacks, which was causing mail_sync.sqlite3 to store UIDs in `folders'. That was clearly wrong, so it's now fixed.
2021-05-28 lei: handle a single IMAP message in most places
"lei import" can now import a single IMAP message via <imaps://example.com/MAILBOX/;UID=$UID>. Likewise, "lei inspect" can show the blob information for UID URLs and "lei lcat" can display the blob without network access if imported. "lei lcat" also gets rid of some unused code and supports "blob:$OIDHEX" syntax as described in the comments (and used by our "text" output format).
v2: enforce UID in URL, fail without
v3: fix error reporting (s/fail/child_error/)
2021-05-23 net_reader|net_writer: pass URI refs deeper into callbacks
This will give us more flexibility in the future w.r.t. dealing with UIDVALIDITY and AUTH= info with IMAP. The LoC reduction is welcome, too.
2021-05-23 lei import: store IMAP user+auth in mail_sync folder URI
Just having UIDVALIDITY in the URI isn't enough, since a single lei user may have multiple IMAP logins on the same server. This leads to compatibility problems and forces a reimport for the few users already using this lei functionality, but it's not stable nor released, yet.
2021-05-04 lei: fix mail_sync.sqlite3 folder names for NNTP
We should not have "SCALAR(XXXXXXX)" showing up in SQLite DBs because we passed a SCALAR ref instead of a non-ref SCALAR.