path: root/lib/PublicInbox/LEI.pm
Date Commit message
2021-11-02 lei: simplify common LeiInput users with ->wq1_start
This method replaces a common pattern of starting workers, preparing internal auth ops, and asynchronously waiting for command completion. It also adds missing LeiAuth support to rediff and rm, which rarely need auth.
2021-11-01 treewide: kill problematic "$h->{k} //= do {" assignments
As stated in the previous change, conditional hash assignments which trigger other hash assignments seem problematic, at times. So replace:
	$h->{k} //= do { $h->{x} = ...; $val };
with:
	$h->{k} // do { $h->{x} = ...; $h->{k} = $val };
"||=" is affected the same way, and some instances of "||=" are now replaced with "//=" or "// do {".
2021-10-30 lei: do not access {sock} after SIGPIPE
It's possible for this to break out of the event loop if note_sigpipe fires via PktOp in the same iteration.
2021-10-27 lei mail-diff: support more inputs, split newlines
Support --in-format like the rest of LeiInput users, and don't default to .eml if a per-input format was specified. In any case, I saved a bunch of messages from mutt which uses mboxcl2. We'll also split newlines for diff, since it's a pain to read diffs with escaped "\n" characters in them.
2021-10-26 lei p2q: use LeiInput for multi-patch series
The LeiInput backend now allows p2q to work like any other command which reads .eml, .patch, mbox*, Maildir, IMAP, and NNTP input. Running "git format-patch --stdout -1 $COMMIT" remains supported. This is intended to allow lower memory use while parsing "git log --pretty=mboxrd -p" output. Previously, the entire output of "git log" would be slurped into memory at once. The intended use is to allow easy(-ish :P) searching for unapplied patches as documented in the new example in the manpage.
2021-10-26 lei: add net getopt spec to various commands
All of these commands should support --proxy, at least, if not other curl options.
2021-10-26 lei p2q: document --uri, add examples
This is useful for users lacking in local storage. Also, referencing lei-add-external(1) seems to make less sense than referencing lei-q(1). We'll also start dropping years from the copyright statement to reduce future churn.
2021-10-22 lei forget-search: support --prune=<local|remote>
Instead of:
	lei forget-search $OUTPUT && rm -r $OUTPUT
we'll also allow a user to do:
	rm -r $OUTPUT && lei forget-search --prune
This gives users flexibility to choose whatever flow is most natural to them.
2021-10-22 lei: no Perl FileHandle for `undef' w/ ECONNRESET
Error reporting for recv_cmd4 methods is a bit wonky.
2021-10-22 dir_idle: treat IN_MOVED_FROM as a gone event
Whether an MUA uses rename(2) or a link(2)+unlink(2) combination should not matter to us. We should be able to handle both cases.
2021-10-19 lei: remove unused ->busy time arg
Our graceful shutdown doesn't time out clients.
2021-10-19 lei up: support --exclude=, --no-(external|remote|local)
These can be used to temporarily disable using certain externals in case of temporary network failure or mount point unavailability.
2021-10-19 lei: conditionally add "\n" to error messages
Some error messages already include "\n" (w/ file+line info), so don't add another one. (`warn' will automatically add its caller location unless there's a final "\n").
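The rule described above can be sketched in a few lines. This is a Python illustration of the idea, not lei's Perl code, and `format_error` is a hypothetical name:

```python
def format_error(msg: str) -> str:
    """Return msg terminated by a newline, adding one only if it is missing."""
    # Messages that already carry file+line info end in "\n" and are kept as-is.
    return msg if msg.endswith("\n") else msg + "\n"
```

A caller would pass every message through this helper before writing it to stderr, so no message gains a second newline.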
2021-10-16 lei sockets: favor level-triggered epoll for fairness
Sigfd->event_step needs priority over script/lei clients, LeiSelfSocket, and everything else.
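For readers unfamiliar with the two epoll modes, the following Linux-only Python sketch shows the semantic difference the subject line refers to. It does not reproduce lei's event loop; `ready_twice` is a hypothetical helper that polls an undrained pipe twice:

```python
import os
import select

def ready_twice(edge_triggered: bool):
    """Poll a readable pipe twice without draining it; return both results."""
    r, w = os.pipe()
    ep = select.epoll()
    try:
        os.write(w, b"x")  # data arrives once and is never read
        flags = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
        ep.register(r, flags)
        first = bool(ep.poll(0))   # both modes report the pending data
        second = bool(ep.poll(0))  # only level-triggered reports it again
        return first, second
    finally:
        ep.close()
        os.close(r)
        os.close(w)
```

With level-triggered registration, a descriptor that still has pending data keeps showing up in every poll, so the loop can service other descriptors first and come back to it later.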
2021-10-16 lei_overview: die rather than lei->fail
This will make our code more flexible in case it gets used in non-lei things.
2021-10-16 lei: more eval guards for die on failure
Relying on $lei->fail is unsustainable since there'll always be parts of our code and dependencies which can trigger die() and break the event loop.
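The guard pattern is roughly the following, shown here with Python's try/except standing in for Perl's eval + die. The function and parameter names are hypothetical:

```python
import sys

def run_event_callbacks(callbacks, report=None):
    """Run every callback; report exceptions instead of letting them escape."""
    if report is None:
        report = lambda e: print(f"E: {e}", file=sys.stderr)
    for cb in callbacks:
        try:
            cb()
        except Exception as e:
            # A failure in one callback must not break the surrounding loop.
            report(e)
```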
2021-10-16 lei: always keep cwd fd {3} for ->fchdir
The extra FD shouldn't cause noticeable overhead in short-lived workers, and it lets us simplify lei->rel2abs. Get rid of a 2-argument form of open() while we're at it, since it's been considered for warning+deprecation by Perl for safety reasons.
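A rough Python sketch of why a retained directory descriptor simplifies path resolution. The `Client` class and its attribute names are assumptions for illustration; lei's real rel2abs is Perl and may differ:

```python
import os

class Client:
    """Hypothetical stand-in for a client that remembers its starting cwd."""

    def __init__(self):
        # Keep a descriptor for the directory the client was started in.
        self.cwd_fd = os.open(".", os.O_RDONLY | os.O_DIRECTORY)

    def rel2abs(self, path: str) -> str:
        if os.path.isabs(path):
            return path
        # fchdir(2) returns to the saved directory even if it was renamed.
        # Note: this changes the cwd of the whole process.
        os.fchdir(self.cwd_fd)
        return os.path.join(os.getcwd(), path)

    def close(self):
        os.close(self.cwd_fd)
```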
2021-10-16 lei: golf PATH2CFG cleanup
More code means more bugs.
2021-10-16 dir_idle: do not add watches in ->new
There's no savings in having two ways to add watches to an inotify or kqueue descriptor.
2021-10-15 lei forget-search: support multiple args
I've been testing a lot of searches which I don't want to keep around, so make it easy to remove a bunch at once. We'll behave like rm(1) and keep going in the face of failure.
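The rm(1)-like behavior amounts to a loop that records failures and keeps going. A minimal Python sketch, where `forget_all` and the `forget_one` callback are hypothetical names:

```python
import sys

def forget_all(names, forget_one):
    """Try to forget every name; return the ones that failed."""
    failed = []
    for name in names:
        try:
            forget_one(name)
        except OSError as e:
            # Report and continue, as rm(1) does with multiple operands.
            print(f"forget-search: {name}: {e}", file=sys.stderr)
            failed.append(name)
    return failed
```

A non-empty return value would translate into a non-zero exit status once all arguments have been processed.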
2021-10-15 lei + ipc: simplify process reaping
Simplify our APIs and force dwaitpid() to work in async mode for all lei workers. This avoids having lingering zombies for parallel searches if one worker finishes soon before another. The old distinction between "old" and "new" workers was needlessly complex, error-prone, and embarrassingly bad. We also never handled v2:// writers properly before on Ctrl-C/Ctrl-Z (SIGINT/SIGTSTP), so add them to @WQ_KEYS to ensure they get handled by $lei when appropriate.
2021-10-15 lei up --all: send signals to workers, receive errors
The redispatch mechanism wasn't routing signals and messages between redispatched workers and script/lei properly. We now rely on PktOp to do bidirectional message forwarding while carefully avoiding circular references.
2021-10-15 lei: TSTP affects all curl and related subprocesses
By relying more on pgroups for the remaining processes, this lets us pause all curl+tail subprocesses with a single kill(2) to avoid cluttering stderr. We won't bother pausing the pigz/gzip/bzip2/xz compressor processes nor cat-file processes, though, since those don't write to the terminal and they idle soon after the workers react to SIGSTOP. AutoReap is hoisted out from TestCommon.pm. CLONE_SKIP is gone since we won't be using Perl threads any time soon (they're discouraged by the maintainers of Perl).
2021-10-15 lei: give workers their own process group
This lets users Ctrl-Z from their terminal to pause an entire git-clone process hierarchy.
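The mechanism can be sketched in Python as follows. `spawn_in_own_pgroup` and `signal_group` are hypothetical helpers, and lei's actual worker setup is Perl:

```python
import os
import signal
import subprocess

def spawn_in_own_pgroup(argv):
    """Start argv as the leader of a new process group."""
    # setpgrp() runs in the child before exec, so descendants inherit the group.
    return subprocess.Popen(argv, preexec_fn=os.setpgrp)

def signal_group(proc, sig):
    """Deliver sig to the child and everything still in its process group."""
    os.killpg(os.getpgid(proc.pid), sig)
```

Calling `signal_group(proc, signal.SIGSTOP)` and later `signal_group(proc, signal.SIGCONT)` pauses and resumes the whole hierarchy with one call each.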
2021-10-14 lei: -d (--dir) and -O (only) shortcuts
`-d' seems like a no-brainer for --dir with inspect. I find myself using `--only' a bit, too, and `-O' seems like a reasonable shortcut for it.
2021-10-14 lei add-external --mirror: respect client umask
While lei is intended for non-public mail and runs umask(077) by default, externals are one area which can safely defer to the user's umask. Instead of sending it unconditionally with every command, only have lei-daemon request it when necessary.
2021-10-13 lei: use standard warn() in more places
warn() is easier to augment with context information, and frankly unavoidable in the presence of 3rd-party libraries we don't control.
2021-10-13 lei up --all: show output for warnings
This helps users make sense of which saved searches some warnings were coming from. Since I often create and discard externals, some warnings from saved searches were confusing to me without output context:
	"`$FOO' is unknown"
	"$FOO not indexed by Xapian"
2021-10-02 lei mail-diff: diagnostic command to diff mail contents
This is useful in finding the cause of deduplication bugs, and possibly the cause of missing threads reported by Konstantin in <20211001130527.z7eivotlgqbgetzz@meerkat.local>
usage:
	u=https://yhbt.net/lore/all/87czop5j33.fsf@tynnyri.adurom.net/raw
	lei mail-diff $u
2021-10-01 ds: simplify signalfd use
Since signalfd is often combined with our event loop, give it a convenient API and reduce the code duplication required to use it. EventLoop is replaced with ::event_loop to allow consistent parameter passing and avoid needlessly passing the package name on stack. We also avoid exporting SFD_NONBLOCK since it's the only flag we support. There's no sense in having the memory overhead of a constant function when it's in cold code.
2021-09-27 config: get_1: use full parameter name
Instead of passing the prefix section and key separately, pass them together as is commonly done with git-config(1) usage as well as our ->get_all API. This inconsistency in the get_1 API is a needless footgun and confused me a bit while working on "lei up" the other week.
2021-09-27 lei rediff: add --drq and --dequote-only
More switches which can be useful for users who pipe from text editors. --drq can be helpful while writing patch review email replies, and perhaps --dequote-only, too.
2021-09-26 lei: ensure refresh_watches isn't called from workers
Only the top-level lei-daemon will do inotify/kevent.
2021-09-25 lei: make pkt_op easier-to-use and understand
Since switching to SOCK_SEQPACKET, we no longer have to use fixed-width records to guarantee atomic reads. Thus we can maintain more human-readable/searchable PktOp opcodes. Furthermore, we can infer the subroutine name in many cases to avoid repeating ourselves by specifying a command-name twice, e.g. $ops->{CMD} => [ \&CMD, $obj ]; can now simply be written as: $ops->{CMD} => [ $obj ] if CMD is a method of $obj.
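The name inference can be illustrated with a small Python dispatch table. This is only an analogy to the Perl `$ops` hash; `dispatch` is a hypothetical function and the real PktOp implementation may differ:

```python
def dispatch(ops, cmd, *args):
    """Invoke the handler registered for cmd in ops.

    ops[cmd] is either [function, bound_args...] (explicit form) or
    [obj, bound_args...] (short form, where the method name is inferred
    from the opcode itself).
    """
    first, *bound = ops[cmd]
    if callable(first):
        # Explicit form: the handler is named a second time by the caller.
        return first(*bound, *args)
    # Short form: look up a method named after the opcode on the object.
    return getattr(first, cmd)(*bound, *args)
```

With this shape, registering `{"note": [worker]}` is equivalent to `{"note": [Worker.note, worker]}` as long as `worker` has a `note` method.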
2021-09-25 lei: restore old sigmask before daemon exit
If the event loop fails, we want blocking waitpid (wait4) calls to be interruptible with SIGTERM via "kill $PID" rather than SIGKILL. Though a failing event loop is something we should avoid...
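A hedged Python sketch of the save-and-restore pattern, assuming the loop runs with some signals blocked. `run_with_blocked` is a hypothetical name and the set of signals lei blocks is not shown here:

```python
import signal

def run_with_blocked(signals, loop):
    """Run loop() with signals blocked, restoring the old mask afterwards."""
    old = signal.pthread_sigmask(signal.SIG_BLOCK, signals)
    try:
        loop()
    finally:
        # Restore even if the loop fails, so later blocking waits can
        # still be interrupted by the previously blocked signals.
        signal.pthread_sigmask(signal.SIG_SETMASK, old)
```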
2021-09-24 clone|--mirror: support --epoch=RANGE for partial clones
Partial (v2) clones should be a useful addition for users wanting to conserve storage while having fast access to recent messages. Continuing work started in 876e74283ff3 (fetch: ignore non-writable epoch dirs, 2021-09-17), this creates bare, read-only epoch git repos. These git repos have the remotes pre-configured, but do not fetch any objects. The goal is to allow users to set the writable bit on a previously-skipped epoch and start fetching it. Shell completion support may not be necessary given how short the epoch ranges are, here.
Cc: Luis Chamberlain <mcgrof@kernel.org>
Link: https://public-inbox.org/meta/20210917002204.GA13112@dcvr/T/#u
2021-09-23 lei: common --all[=remote|local] help message
It helps to be consistent and reduce the learning curve, here.
2021-09-22 lei up: avoid excessively parallel --all
We shouldn't dispatch all outputs right away since they can be expensive CPU-wise. Instead, rely on DESTROY to trigger further redispatches. This also fixes a circular reference bug for the single-output case that could lead to a leftover script/lei after MUA exit. I'm not sure how --jobs/-j should work when the actual xsearch and lei2mail have their own parallelism ("--jobs=$X,$M"), but it's better than having thousands of subtasks running.
Fixes: b34a267efff7b831 ("lei up: fix --mua with single output")
2021-09-22 lei: dclose: do not close unnecessarily
The bit about reap_compress is no longer true since LeiXSearch->query_done triggers it, instead. I only noticed this while working on "lei up".
2021-09-21 lei: umask(077) before opening errors.log
There's a chance some sensitive information (e.g. folder names) can end up in errors.log, though $XDG_RUNTIME_DIR or /tmp/lei-$UID/ will have 0700 permissions, anyways.
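The effect of setting the umask before the open can be sketched in Python. `open_private_log` is a hypothetical helper; unlike lei-daemon, which keeps umask(077) by default, this sketch restores the caller's umask afterwards:

```python
import os

def open_private_log(path):
    """Open path for appending so that a newly created file is mode 0600."""
    old = os.umask(0o077)  # strip all group/other permission bits
    try:
        return open(path, "a")
    finally:
        os.umask(old)  # assumption: the caller wants its own umask back
```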
2021-09-21 script/lei: handle SIGTSTP and SIGCONT
Sometimes it's useful to pause an expensive query or refresh-mail-sync to do something else. While lei-daemon and lei/store can't be paused since they're shared across clients, per-invocation WQ workers can be paused safely using the unblockable SIGSTOP. While we're at it, drop the ETOOMANYREFS hint since it hasn't been a problem since we drastically reduced FD passing early in development.
2021-09-21 lei: simplify internal arg2folder usage
We can set opt->{quiet} for (internal) 'note-event' command to quiet ->qerr, since we use ->qerr everywhere else. And we'll just die() instead of setting a ->{fail} message, since eval + die are more in line with the rest of our Perl code.
2021-09-19 lei config --edit: use controlling terminal
As with "lei edit-search", "lei config --edit" may spawn an interactive editor which works best from the terminal running script/lei. So implement LeiConfig as a superclass of LeiEditSearch so the two commands can share the same verification hooks and retry logic.
2021-09-19 lei ls-mail-source: pretty JSON support
As with other commands, we enable pretty JSON by default if stdout is a terminal or if --pretty is specified. While the ->pretty JSON output has excessive vertical whitespace, too many lines is preferable to having everything on one line.
2021-09-19 ipc: drop dynamic WQ process counts
In retrospect, I don't think it's needed; and trying to wire up a user interface for lei to manage process counts doesn't seem worthwhile. It could be resurrected for public-facing daemon use in the future, but that's what version control systems are for. This also lets us automatically avoid setting up broadcast sockets.
Followup-to: 7b7939d47b336fb7 ("lei: lock worker counts")
2021-09-19 lei: simplify sto_done_request
With the switch from pipes to sockets for lei-daemon => lei/store IPC, we can send the script/lei client socket to the lei/store process and rely on reference counting in both Perl and the kernel to persist the script/lei.
2021-09-19 lei/store: use SOCK_SEQPACKET rather than pipe
This has several advantages:
* no need to use ipc.lock to protect a pipe for non-atomic writes
* ability to pass FDs. In another commit, this will let us simplify lei->sto_done_request and pass newly-created sockets to lei/store directly.
Disadvantages:
- an extra pipe is required for rare messages over several hundred KB, this is probably a non-issue, though
The performance delta is unknown, but I expect shards (which remain pipes) to be the primary bottleneck IPC-wise for lei/store.
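The record-boundary property that makes the pipe lock unnecessary can be seen in a short Python sketch. It is an illustration only, assumes a platform with AF_UNIX SOCK_SEQPACKET support (such as Linux), and `roundtrip` is a hypothetical helper:

```python
import socket

def roundtrip(records):
    """Send each record over a SOCK_SEQPACKET pair; return what is received."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    try:
        for rec in records:
            a.send(rec)  # each send() is delivered as one whole record
        # Each recv() returns exactly one record, never a merged stream.
        return [b.recv(65536) for _ in records]
    finally:
        a.close()
        b.close()
```

Over a pipe, the two writes could arrive as a single merged read, which is why writers previously needed a lock and framing.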
2021-09-18 lei up: automatically use dt: for remote externals
Since we can't use maxuid for remote externals, automatically maintaining the last time we got results and appending a dt: range to the query will prevent HTTP(S) responses from getting too big. We could be using "rt:", but no stable release of public-inbox supports it, yet, so we'll use dt:, instead. By default, there's a two day fudge factor to account for MTA downtime and delays; which is hopefully enough. The fudge factor may be changed per-invocation with the --remote-fudge-factor=INTERVAL option. Since different externals can have different message transport routes, "lastresult" entries are stored on a per-external basis.
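The arithmetic behind the range is simple: subtract the fudge factor from the last result time and format the result as an open-ended range. The Python sketch below assumes the `dt:YYYYMMDDhhmmss..` form in UTC; the exact string lei appends and the name `dt_range` are assumptions:

```python
from datetime import datetime, timedelta, timezone

def dt_range(lastresult_epoch: int, fudge: timedelta = timedelta(days=2)) -> str:
    """Build an open-ended dt: range starting `fudge` before the last result."""
    start = datetime.fromtimestamp(lastresult_epoch, tz=timezone.utc) - fudge
    # Assumed format: dt:YYYYMMDDhhmmss.. with no upper bound
    return "dt:" + start.strftime("%Y%m%d%H%M%S") + ".."
```

For example, a last result at 2021-09-18 00:00:00 UTC with the default two-day fudge yields a range starting at 2021-09-16 00:00:00 UTC.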
2021-09-18 lei_mail_sync: rely on flock(2), avoid IPC
Since 44917fdd24a8bec1 ("lei_mail_sync: do not use transactions"), relying on lei/store to serialize access was a pointless endeavor. Rely on flock(2) to serialize multiple writers since (in my experience) it's the easiest way to deal with parallel writers when using SQLite. This allows us to simplify existing callers while speeding up 'lei refresh-mail-sync --all=local' by 5% or so.
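Serializing SQLite writers with flock(2) looks roughly like this in Python. The lock file path, the function name `locked_write` and the schema are hypothetical; lei_mail_sync itself is Perl:

```python
import fcntl
import sqlite3

def locked_write(db_path, lock_path, sql, params=()):
    """Run one write statement while holding an exclusive flock(2)."""
    with open(lock_path, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until other writers finish
        try:
            con = sqlite3.connect(db_path)
            try:
                con.execute(sql, params)
                con.commit()
            finally:
                con.close()
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Because every writer takes the same lock before touching the database, no process in the middle is needed to order the writes.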
2021-09-18 lei: lock worker counts
It doesn't seem worthwhile to change worker counts dynamically on a per-command-basis with lei, and I don't know how such an interface would even work...