Date Commit message
2021-09-21 t/lei-up: use '-q' to silence non-redirected test
We could redirect, too, but just use -q since we don't care for the output with run_mode => 0.
2021-09-21 lei q: improve --limit behavior and progress
Avoid slurping gigantic (e.g. 100000) result sets into a single response if a giant limit is specified, and instead use 10000 as a window for the mset with a given offset. We'll also warn and hint about the --limit= switch when the estimated result set is larger than the default limit.
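The windowing described above can be sketched roughly as follows. This is only an illustration in Python, not the project's Perl code: `fetch(offset, count)` is a hypothetical stand-in for whatever returns one slice of the result set, and only the 10000 window size comes from the commit message.

```python
from typing import Callable, Iterator, List

WINDOW = 10000  # window size named in the commit message


def iter_results(fetch: Callable[[int, int], List[str]], limit: int,
                 window: int = WINDOW) -> Iterator[str]:
    """Yield up to `limit` results, requesting at most `window` at a time.

    `fetch(offset, count)` is a hypothetical callback, not a real API.
    """
    offset = 0
    while offset < limit:
        count = min(window, limit - offset)
        batch = fetch(offset, count)
        if not batch:
            break  # result set exhausted before reaching the limit
        yield from batch
        offset += len(batch)
```

A caller asking for 100000 results would issue up to ten requests of 10000 each instead of one giant request.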
2021-09-21 lei q: update messages to reflect --save default
I wanted to try --dedupe=none for something, but it failed since I forgot --no-save :x So hint users towards --no-save if necessary.
2021-09-21 search: drop reopen retry message
It's needless noise in syslogs for daemons and unnecessarily alarming to users on the command-line.
2021-09-21 lei q: show progress on >1s preparation phase
Overwriting existing destinations is safe (but slow) by default, so show a progress message noting what we're doing while a user waits.
2021-09-21 lei: various completion improvements
"lei export-kw" no longer completes for anonymous sources. More commands use "lei refresh-mail-sync" as a basis for their completion work, as well. ";AUTH=ANONYMOUS@" is stripped from completions since it was preventing bash completion from working on AUTH=ANONYMOUS IMAP URLs. I'm not sure if there's a better way, but all of our code works fine without specifying AUTH=ANONYMOUS as a command-line arg. Finally, we fall back to using more candidates if none can be found, allowing multiple URLs to be completed.
2021-09-21 lei lcat: support NNTP URLs
NNTP URLs are probably more prevalent in public message archives than IMAP URLs.
2021-09-21 doc: lei-security: section for WIP auth methods
Lots of stuff out there that becomes a pain to set up configuration for and test...
2021-09-21 lei lcat: use single queue for ordering
If lcat-ing multiple argument types (blobs vs folders), maintain the original order of the arguments instead of dumping all blobs before folder contents.
2021-09-21 lei: simplify internal arg2folder usage
We can set opt->{quiet} for (internal) 'note-event' command to quiet ->qerr, since we use ->qerr everywhere else. And we'll just die() instead of setting a ->{fail} message, since eval + die are more in line with the rest of our Perl code.
2021-09-21 lei_mail_sync: account for non-unique cases
NNTP servers, IMAP servers, and various MUAs may recycle "unique" identifiers due to software bugs or careless BOFHs. Warn about them, but always be prepared to account for them.
2021-09-21 lei inspect: support NNTP URLs
No reason not to support them, since there are more public-inbox-nntpd instances than -imapd instances, currently.
2021-09-21 lei inspect: convert to WQ worker
Xapian and SQLite access can be slow when a DB is large and/or on high-latency storage.
2021-09-20 gcf2: fix loading at runtime
We need to waitpid synchronously on pkg-config to use $?. When loading Gcf2 inside the event loop, implicit dwaitpid done by PublicInbox::ProcessPipe would not call waitpid in time to zero $?. This was causing one of my -httpd to occasionally fall back to git(1) instead of using Gcf2. This was noted in:
Link: https://public-inbox.org/meta/20210914085322.25517-1-e@80x24.org/
2021-09-19 net_reader: NNTP: remove article numbers from mail_sync folders
NNTP article numbers are stored separately from folder names in mail_sync.sqlite3. Recovering from this is optional; the worst case is wasting bandwidth refetching some messages. To (optionally) recover from this, use:
    lei forget-mail-sync $URL_WITH_ARTNUMS
Some articles will be refetched on the next import, but duplicate data won't be indexed in Xapian.
2021-09-19 doc: lei-config: document various knobs
It's still a work-in-progress, but the basic debug knob comes in handy for new users; as does proxy support.
2021-09-19 net_reader: disallow imap.fetchBatchSize=0
A batch size of zero is nonsensical and causes infinite loops.
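To illustrate why a zero batch size loops forever, here is a minimal Python sketch of a batching loop with the guard in place. The function name and error text are made up for illustration; only the rule (reject a batch size of zero) comes from the commit.

```python
from typing import Iterator, List, Sequence


def batches(uids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Split `uids` into consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        # Without this check, `start += batch_size` below would never
        # advance when batch_size == 0, and the loop would spin forever.
        raise ValueError("batch size must be at least 1")
    start = 0
    while start < len(uids):
        yield list(uids[start:start + batch_size])
        start += batch_size
```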
2021-09-19 lei config --edit: use controlling terminal
As with "lei edit-search", "lei config --edit" may spawn an interactive editor which works best from the terminal running script/lei. So implement LeiConfig as a superclass of LeiEditSearch so the two commands can share the same verification hooks and retry logic.
2021-09-19 net_reader: no STARTTLS for IMAP localhost or onions
At least not by default, to match existing NNTP behavior. Tor .onions are already encrypted, and there's no point in encrypting traffic on localhost outside of testing.
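The default described above could be expressed as a small predicate. This Python sketch is an assumption about the shape of the check, not the project's code: how "localhost" is actually detected is not stated in the commit, so the loopback tests below are guesses.

```python
def wants_starttls(host: str) -> bool:
    """Return False for hosts where STARTTLS adds nothing by default."""
    host = host.lower().rstrip(".")
    if host.endswith(".onion"):
        return False  # Tor already encrypts traffic to .onion services
    # Assumed notion of "localhost"; the real check may differ.
    if host == "localhost" or host == "::1" or host.startswith("127."):
        return False
    return True
```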
2021-09-19 watch: use net_reader->mic_new wrapper for SOCKS+TLS
This brings -watch up to feature parity with lei with SOCKS support.
2021-09-19 xt: add fsck script over over.sqlite3
I'm not sure what caused it, but I've noticed two missing messages that failed from "lei up" on an https:// external; and I've also seen some duplicates in the past (which I think I fixed...).
2021-09-19 net_reader: fix single NNTP article fetch, test ranges
While NNTP ranges were already working, fetching a single message was broken. We'll also simplify the code a bit and ensure incremental synchronization is ignored when ranges are specified.
2021-09-19 lei ls-mail-source: pretty JSON support
As with other commands, we enable pretty JSON by default if stdout is a terminal or if --pretty is specified. While the ->pretty JSON output has excessive vertical whitespace, too many lines is preferable to having everything on one line.
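The "pretty if stdout is a terminal or --pretty is given" rule can be sketched as below. This is an illustrative Python version; the function name, the `out` parameter and the indent width are assumptions, not anything taken from lei.

```python
import json
import sys


def render(obj, pretty: bool = False, out=sys.stdout) -> str:
    """Serialize `obj`, indenting when asked to or when `out` is a terminal."""
    if pretty or out.isatty():
        return json.dumps(obj, indent=2, sort_keys=True)
    return json.dumps(obj, sort_keys=True)
```

Piping the output to another program thus keeps the compact one-line form unless the user explicitly asks for pretty output.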
2021-09-19 lei ls-mail-source: use "high"/"low" for NNTP
The meanings of "hwm" and "lwm" may not be obvious abbreviations for (high|low) water mark descriptions used by RFC 3977. "high" and "low" should be obvious to anyone.
2021-09-19 lei: clamp internal worker processes to 4
"All" my CPUs is only 4, but it's probably ridiculous for somebody with a 16-core system to have 16 processes for accessing SQLite DBs. We do the same thing in Pmdir for parallel Maildir access (and V2Writable).
2021-09-19 ipc: drop dynamic WQ process counts
In retrospect, I don't think it's needed; and trying to wire up a user interface for lei to manage process counts doesn't seem worthwhile. It could be resurrected for public-facing daemon use in the future, but that's what version control systems are for. This also lets us automatically avoid setting up broadcast sockets.
Followup-to: 7b7939d47b336fb7 ("lei: lock worker counts")
2021-09-19 lei_xsearch: drop Data::Dumper use
We're not using Data::Dumper for JSON output.
2021-09-19 lei: simplify sto_done_request
With the switch from pipes to sockets for lei-daemon => lei/store IPC, we can send the script/lei client socket to the lei/store process and rely on reference counting in both Perl and the kernel to persist the script/lei.
2021-09-19 lei/store: use SOCK_SEQPACKET rather than pipe
This has several advantages:
* no need to use ipc.lock to protect a pipe for non-atomic writes
* ability to pass FDs. In another commit, this will let us simplify lei->sto_done_request and pass newly-created sockets to lei/store directly.
disadvantages:
- an extra pipe is required for rare messages over several hundred KB, this is probably a non-issue, though
The performance delta is unknown, but I expect shards (which remain pipes) to be the primary bottleneck IPC-wise for lei/store.
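As a rough illustration of the FD-passing advantage, the Python sketch below sends one end of a pipe over an AF_UNIX SOCK_SEQPACKET socket pair and uses it on the receiving side. It assumes Linux (or another system with SOCK_SEQPACKET Unix sockets) and Python 3.9+ for `socket.send_fds`/`socket.recv_fds`; it is not how lei/store itself is written.

```python
import os
import socket


def pass_fd_demo() -> bytes:
    """Pass a pipe's write end across a SOCK_SEQPACKET socket pair."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    r, w = os.pipe()
    try:
        # Each send is one atomic, record-preserving message, so no
        # extra lock is needed to keep writers from interleaving.
        socket.send_fds(a, [b"here is a pipe"], [w])
        msg, fds, _flags, _addr = socket.recv_fds(b, 4096, 1)
        os.write(fds[0], b"hello")  # use the received descriptor
        os.close(fds[0])
        return msg + b" -> " + os.read(r, 5)
    finally:
        os.close(r)
        os.close(w)
        a.close()
        b.close()
```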
2021-09-19 ipc: allow disabling broadcast for wq_workers
Since some lei worker classes only use a single worker, there's no sense in having broadcast for those cases.
2021-09-19 ipc: wq_do: support synchronous waits and responses
This brings the wq_* SOCK_SEQPACKET API functionality on par with the ipc_do (pipe-based) API.
2021-09-19 doc: tuning: note git 2.33+, move libgit2 into Inline::C section
git 2.33+ contains important optimizations for the thousands-of-inboxes case. And combine the Inline::C stuff with libgit2, since our use of libgit2 requires Inline::C.
2021-09-19 t/lei-refresh-mail-sync: improve test reliability
We can't assume -imapd will be ready by the time we try to connect to it after restart when using "-l $ADDR". So recreate the (closed-for-testing) listen socket in the parent and hand it off to -imapd as we do normally.
2021-09-19 net_reader: quote URL properly for Tor .onion hint
The semicolon in ';AUTH=ANONYMOUS' requires quoting in Bourne shell.
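A quick way to see the quoting requirement is Python's `shlex.quote`, which wraps any word containing shell metacharacters such as ';' in single quotes. The URL below is a made-up example, and the helper name is hypothetical.

```python
import shlex


def shell_word(url: str) -> str:
    """Return `url` quoted so a Bourne shell treats it as one word."""
    return shlex.quote(url)


# Hypothetical URL; unquoted, the shell would end the command at ';'.
EXAMPLE_URL = "imap://;AUTH=ANONYMOUS@example.onion/INBOX"
```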
2021-09-19 t/config: extra test for imap_url with imaps://
I configured this for public-inbox.org, but wasn't 100% sure it worked. This test ensures it stays working :>
2021-09-18 lei up: automatically use dt: for remote externals
Since we can't use maxuid for remote externals, automatically maintaining the last time we got results and appending a dt: range to the query will prevent HTTP(S) responses from getting too big.
We could be using "rt:", but no stable release of public-inbox supports it, yet, so we'll use dt:, instead.
By default, there's a two day fudge factor to account for MTA downtime and delays; which is hopefully enough. The fudge factor may be changed per-invocation with the --remote-fudge-factor=INTERVAL option.
Since different externals can have different message transport routes, "lastresult" entries are stored on a per-external basis.
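The fudge-factor arithmetic might look like the Python sketch below. The two-day default comes from the commit message; the function name and the exact `dt:YYYYMMDDhhmmss..` spelling of an open-ended range are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_FUDGE = timedelta(days=2)  # default stated in the commit message


def dt_range(last_result: datetime, fudge: timedelta = DEFAULT_FUDGE) -> str:
    """Build an open-ended dt: term starting `fudge` before `last_result`.

    `last_result` must be timezone-aware. The output format is an
    assumption, not taken from the lei source.
    """
    start = last_result.astimezone(timezone.utc) - fudge
    return "dt:" + start.strftime("%Y%m%d%H%M%S") + ".."
```

Because "lastresult" is kept per external, a caller would compute one such term for each remote external rather than a single global one.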
2021-09-18 net_reader: set SO_KEEPALIVE on all Net::NNTP sockets
SO_KEEPALIVE can prevent stuck processes and is safe to enable unconditionally on all TCP sockets (like git, and the rest of public-inbox does). Verified via strace on both NNTP and NNTPS with and without nntp.proxy=socks5h://...
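For reference, enabling the option is a single setsockopt call. A minimal Python equivalent (the helper name is illustrative) would be:

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Turn on TCP keepalive probes so dead peers are eventually noticed."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
```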
2021-09-18 net_reader: support imaps:// w/ socks5h:// proxy
While non-TLS IMAP worked perfectly with IO::Socket::Socks and Mail::IMAPClient, we need to wrap the IO::Socket::Socks object with IO::Socket::SSL before handing it to Mail::IMAPClient.
2021-09-18 net_reader: detect IMAP failures earlier
A Mail::IMAPClient object may be returned even on connection failure, so use IsConnected to check for it. This ensures git-credential will no longer prompt for passwords when there's no connection.
2021-09-18 net_reader: tie SocksDebug to {imap,nntp}.Debug
I think tying IO::Socket::Socks debugging to existing debug switches is enough, and there's no need to introduce a separate socks.Debug parameter.
2021-09-18 ds: support add unique timers
A common pattern we use is to arm a timer once and prevent it from being armed until it fires. We'll be using it more to do polling for saved searches and imports.
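The "arm once, ignore re-arming until it fires" pattern can be sketched as follows. This Python class is a toy model with an explicit clock, not the PublicInbox::DS API; all names here are hypothetical.

```python
class UniqueTimers:
    """Toy timer table where each key can be armed at most once at a time."""

    def __init__(self):
        self._armed = {}  # key -> (due, callback)

    def add_unique(self, key, due, callback) -> bool:
        """Arm `key` unless it is already pending; return True if armed."""
        if key in self._armed:
            return False
        self._armed[key] = (due, callback)
        return True

    def run_due(self, now) -> int:
        """Fire every timer with due <= now; return how many fired."""
        ready = [k for k, (due, _cb) in self._armed.items() if due <= now]
        for key in ready:
            _due, callback = self._armed.pop(key)  # disarm before firing
            callback()
        return len(ready)
```

Disarming before the callback runs lets the callback re-arm the same key, which is what a polling loop would do.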
2021-09-18 lei_mail_sync: set nodatacow on btrfs
As with other SQLite3 databases, copy-on-write with files experiencing random writes leads to write amplification and low performance.
2021-09-18 lei_mail_sync: rely on flock(2), avoid IPC
Since 44917fdd24a8bec1 ("lei_mail_sync: do not use transactions"), relying on lei/store to serialize access was a pointless endeavor. Rely on flock(2) to serialize multiple writers since (in my experience) it's the easiest way to deal with parallel writers when using SQLite. This allows us to simplify existing callers while speeding up 'lei refresh-mail-sync --all=local' by 5% or so.
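Serializing SQLite writers with flock(2) can be sketched like this in Python. The lock-file path, helper name and table used in the usage note are assumptions for illustration; the real code is Perl and may lock a different file.

```python
import fcntl
import sqlite3
from contextlib import contextmanager


@contextmanager
def locked_db(db_path: str, lock_path: str):
    """Hold an exclusive flock on `lock_path` while using the database."""
    with open(lock_path, "a") as lock_fh:
        fcntl.flock(lock_fh, fcntl.LOCK_EX)  # blocks until other writers finish
        conn = sqlite3.connect(db_path)
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()
        # leaving the `with` block closes lock_fh, which releases the lock
```

Each writer wraps its updates in `with locked_db(db, lock) as conn:`, so parallel processes take turns without a separate coordinating process.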
2021-09-18 lei: lock worker counts
It doesn't seem worthwhile to change worker counts dynamically on a per-command-basis with lei, and I don't know how such an interface would even work...
2021-09-18 doc: lei-lcat: document --stdin behavior
This is another feature I've found immensely useful, but I also wonder if I'm the only one who uses it.
2021-09-17 git_http_backend: forward HTTP_GIT_PROTOCOL in request headers
It looks like git-http-backend(1) will support HTTP_GIT_PROTOCOL, soon, and we won't have to add GIT_PROTOCOL support to support newer versions of the git protocol, either.
Link: https://public-inbox.org/git/YTiXEEEs36NCEr9S@coredump.intra.peff.net/
2021-09-17 doc: add lei-security(7) manpage
It seems like a good idea to have a manpage where somebody can quickly look up and address their concerns as to what to put on an encrypted device/filesystem. And I probably would've designed lei around make(1) for parallelization if I didn't have to keep credentials off the FS :P
2021-09-17 script/lei: umask(077) before execve
While my MUA also runs umask(077) unconditionally, not all MUAs do. Additionally, pagers may support writing their buffers to disk, so ensure anything else we spawn has umask(077).
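The effect of setting the umask before spawning anything can be demonstrated with a short Python sketch. The helper name is illustrative; the point is only that child processes inherit the parent's umask, so files they create are not readable by group or others.

```python
import os
import subprocess


def spawn_private(argv):
    """Run `argv` with umask 077 so files it creates are owner-only."""
    os.umask(0o077)  # inherited by every process spawned afterwards
    return subprocess.run(argv, check=True)
```

A pager or MUA started this way that saves a buffer to disk would create the file with mode 0600 at most.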
2021-09-17 fetch: ignore non-writable epoch dirs
This will eventually be useful for maintaining partial mirrors. Keeping in line with the original public-inbox-fetch philosophy, there are no additional config files to manage: the user merely needs to remove write permissions to an $N.git directory to prevent it from being updated. Re-enabling updates just requires restoring write permission.
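The permission-based opt-out could be modelled as a simple filter over epoch directories. This Python sketch is an assumption about the shape of the check; the function name and the injectable `writable` predicate are hypothetical, and the real implementation is in Perl.

```python
import os
from typing import Callable, Iterable, List


def _is_writable(path: str) -> bool:
    # Note: os.access() reports True for root regardless of mode bits.
    return os.access(path, os.W_OK)


def epochs_to_update(epoch_dirs: Iterable[str],
                     writable: Callable[[str], bool] = _is_writable) -> List[str]:
    """Keep only the epoch directories the user left writable."""
    return [d for d in epoch_dirs if writable(d)]
```

With this model, `chmod a-w 0.git` is enough to freeze epoch 0, and `chmod u+w 0.git` resumes updates.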
2021-09-17 search: fix rt: w/ approxidate when TZ != UTC
While git respects a user's local timezone and returns seconds-since-the-Epoch, we were unnecessarily and incorrectly calling gmtime+strftime on its result. So skip calling gmtime+strftime when the strftime format is "%s", and just feed the output time from git directly to Xapian. This is mainly for lei, which will likely run in a variety of timezones. While we're at it, add a recommendation to use TZ=UTC in public-inbox-httpd, in case there are (misguided :P) sysadmins who set a non-UTC TZ.
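The skew this fixes can be reproduced in a few lines of Python on a Unix system. Here `time.mktime` stands in for the local-time interpretation that a "%s" conversion applies to broken-down time; the function names and the POSIX-style TZ strings are chosen only for this demonstration.

```python
import os
import time


def buggy_epoch(epoch: int) -> int:
    """Break the time down as UTC, then reinterpret the fields as local."""
    return int(time.mktime(time.gmtime(epoch)))


def skew_under_tz(tz: str, epoch: int) -> int:
    """Return buggy_epoch(epoch) - epoch with TZ temporarily set to `tz`."""
    old = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()  # Unix-only
    try:
        return buggy_epoch(epoch) - epoch
    finally:
        if old is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = old
        time.tzset()
```

Under TZ=UTC0 the round trip is harmless, which is why the bug only shows up when TZ is not UTC; passing the epoch seconds through untouched avoids the problem entirely.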