path: root/lib/PublicInbox/LEI.pm
Date Commit message
2024-04-17 lei: use ->barrier to commit to lei/store
barrier (synchronous checkpoint) is better than ->done with parallel lei commands being issued (via '&' or different terminals), since repeatedly stopping and restarting processes doesn't play nicely with expensive tasks like `lei reindex'. This introduces a slight regression in maintaining more processes (and thus resource use) when lei is idle, but that'll be fixed in the next commit.
2024-04-13 lei: remove leftover debugging message
Noticed while working on other things... Fixes: 299aac294ec3 (lei: do label/keyword parsing in optparse, 2023-10-02)
2024-04-12 lei q: support --thread-id=$MSGID || -T $MSGID
This adds support for the "POST /$INBOX/$MSGID/?x=m&q=..." endpoint added last year to support per-thread searches in 764035c83 (www: support POST /$INBOX/$MSGID/?x=m&q=, 2023-03-30). This only supports instances of public-inbox since 764035c83, but unfortunately there hasn't been a release since then.
2024-04-03 treewide: avoid getpid() for OnDestroy checks
getpid() isn't cached by glibc nowadays and system calls are more expensive due to CPU vulnerability mitigations. To ensure we switch to the new semantics properly, introduce a new `on_destroy' function to simplify callers. Furthermore, most OnDestroy correctness is often tied to the process which creates it, so make the new API default to guarding against running in subprocesses. For cases which require running in all children, a new PublicInbox::OnDestroy::all call is provided.
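The guarded-by-default behaviour described in this commit can be sketched in Python. This is only a rough analogue and not the Perl implementation: the names `on_destroy`, `on_destroy_all` and `_cached_pid` are hypothetical, and refreshing a cached PID through `os.register_at_fork` is an assumption about one way to avoid a getpid() call on every check.

```python
import os

# Cached once at import; refreshed in every forked child, so the
# destroy-time check never needs its own getpid() call.
_cached_pid = os.getpid()


def _refresh_cached_pid():
    global _cached_pid
    _cached_pid = os.getpid()


os.register_at_fork(after_in_child=_refresh_cached_pid)


class OnDestroy:
    """Run a callback when the guard object is destroyed (illustrative)."""

    def __init__(self, callback, args, owner_pid):
        self._callback = callback
        self._args = args
        self._owner_pid = owner_pid  # None means "fire in every process"

    def cancel(self):
        """Disarm the guard so nothing runs at destruction."""
        self._callback = None

    def __del__(self):
        if self._callback is None:
            return
        # Guarded guards only fire in the process which created them.
        if self._owner_pid is None or self._owner_pid == _cached_pid:
            self._callback(*self._args)


def on_destroy(callback, *args):
    """Default: fire only in the creating process."""
    return OnDestroy(callback, args, _cached_pid)


def on_destroy_all(callback, *args):
    """Opt-in: fire in the creating process and in all children."""
    return OnDestroy(callback, args, None)
```

A guard created with `on_destroy` stays silent when a forked child drops its copy, while one created with `on_destroy_all` fires in both processes.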
2024-01-04 lei: MH: support inotify to detect updates
This should help us deal with MH sequence number packing and invalidating mail_sync.sqlite3.
2023-12-30 lei: support reading MH for convert+import+index
The MH format is widely-supported and used by various MUAs such as mutt and sylpheed, and a MH-like format is used by mlmmj for archives, as well. Locking implementations for writes are inconsistent, so this commit doesn't support writes, yet. inotify|EVFILT_VNODE watches aren't supported, yet, but that'll have to come since MH allows packing unused integers and renaming files.
2023-12-16 lei index: support +L: labels
`lei index' should be capable of indexing the same way `lei import' does, but without the indexing. I only noticed this omission while developing a new feature.
2023-11-29 lei q: fix --no-import-before completion + docs
--no-import-before skips importing entire messages, not just keywords, so it can cause permanent data loss if -o is pointed to precious data.
2023-11-22 lei_to_mail: don't close STDOUT unless it is a mbox* output
We only care about error checking when stdout is an mbox output pointed to a pathname. This is noticeable with `lei up' with multiple non-mbox* destinations. We'll also ensure throwing exceptions to trigger lei->x_it from lei->do_env results in the epoll/kqueue watch being discarded, otherwise commands may never terminate (leading to stuck tests).
2023-11-16 lei: fix idempotent STDERR redirect in workers
This is needed to support forking from already-forked lei workers and $lei->{2} is already STDERR. Fixes: e015c3742f91 (lei: use autodie where appropriate, 2023-10-17)
2023-11-15 treewide: more autodie safety fixes for older Perl
Avoid mixing autodie use in different scopes since it's likely to cause problems like it did in Gcf2. While none of these fix known problems with test cases, it's likely worthwhile to avoid it anyways to avoid future surprises. For Process::IO, we'll add some additional tests in t/io.t to ensure we don't get unintended exceptions for try_cat.
2023-11-15 lei: use -signal numbers for old Perl
Unlike modern Perls, Perl 5.16.3 on CentOS doesn't accept negative string signals like "-TERM". This only became a problem since commit b231d91f42d7 (treewide: enable warnings in all exec-ed processes) made our code stricter by enabling more warnings. In both cases, the kill is probably unnecessary and safe to remove since we can rely on closing sockets to drop processes.
2023-11-13 lei: don't read --stdin terminals from daemon
We must use a foreground process to read from terminals on stdin, otherwise weird things like lost keystrokes and EIO can happen. So take advantage of ->send_exec_cmd to spawn `cat' in the same way we spawn MUAs, pagers, `git config --edit' and `git credential' from script/lei.
2023-11-09 lei: reuse FDs atfork and close explicitly
We'll avoid having a redundant STDERR FD open in lei workers, and some explicit close() on `lei up' sockets reduces the likelihood of inadvertently open FDs causing processes to linger.
2023-11-09 lei: use cached $daemon_pid when possible
->lei_daemon_pid can only be called in the top-level daemon process when $daemon_pid is valid, so avoid a getpid(2) syscall in those cases.
2023-11-03 treewide: use ->close to call ProcessIO->CLOSE
This will open the door for us to drop `tie' usage from ProcessIO completely in favor of OO method dispatch. While OO method dispatches (e.g. `$fh->close') are slower than normal subroutine calls, it hardly matters in this case since process teardown is a fairly rare operation and we continue to use `close($fh)' for Maildir writes.
2023-10-28 treewide: use run_qx where appropriate
This saves us some code, and is a small step towards getting ProcessIO working with stat, fcntl and other perlops that don't work with tied handles.
2023-10-27 lei: don't exit lei-daemon on ovv_begin failure
When ->ovv_begin is called in LeiXSearch->do_query in the top-level lei-daemon process, $lei->{pkt_op_p} still exists. We must make sure we're exiting the correct process since lei->out can call lei->fail and lei->fail calls lei->x_it. As to avoiding how I caused ->ovv_begin failures to begin with, that's for a much bigger change...
2023-10-25 qspawn: introduce new psgi_yield API
This is intended to replace psgi_return and HTTPD/Async entirely, hopefully making our code less convoluted while maintaining the ability to handle slow clients on memory-constrained systems. This was made possible by the philosophy shift in commit 21a539a2df0c (httpd/async: switch to buffering-as-fast-as-possible, 2019-06-28). We'll still support generic PSGI via the `pull' model with a GetlineResponse class which is similar to the old GetlineBody.
2023-10-19 lei: simplify startq/au_done wakeup notifications
We only need to write one byte at MUA start instead of a byte for every LeiXSearch worker. Also, make sure it succeeds by enabling autodie for syswrite. When reading, we can rely on `:perlio' layer `read' semantics to retry on EINTR to avoid looping and other error checking.
2023-10-18 lei: use autodie where appropriate
This makes us a bit harsher with misbehaving clients, but we only have one client implementation at the moment.
2023-10-17 lei: consolidate stdin slurp, fix warnings
We can share more code amongst stdin slurper (not streaming) commands. This also fixes uninitialized variable warnings when feeding an empty stdin to these commands.
2023-10-12 lei: quiet excessive write/seen messages
We don't want to end up dumping nr_seen/nr_write when progress is disabled, nor do we want forked off `lei note-event' workers to dump them when DS->Reset is called on fork.
2023-10-11 lei import|tag|rm: support --commit-delay=SECONDS
Delayed commits allow users to trade off immediate safety for throughput and reduced storage wear when running multiple discrete commands. This feature is currently useful for providing a way to make t/lei-store-fail.t reliable and for ensuring `lei blob' can retrieve messages which have not yet been committed. In the future, it'll also be useful for the FUSE layer to batch git activity.
2023-10-08 lei: always use async `done' requests to store
It's safer against deadlocks and we still get proper error reporting by passing stderr across in addition to the lei socket.
2023-10-08 lei: fix implicit stdin support for pipes
Eric Wong <e@80x24.org> wrote: > +++ b/t/lei-store-fail.t > + my $cmd = [ qw(lei import -q -F mboxrd) ]; > + my $tp = start_script($cmd, undef, $opt); Of course the lack of `-' or `--stdin' only worked on Linux and NetBSD, but not other BSDs. -------8<------ Subject: [PATCH] lei: fix implicit stdin support for pipes st_mode permission bits can't be used to determine if a file or pipe we have on stdin is readable or not. Writable regular files can be opened O_RDONLY, and permission bits for pipes are inconsistent across platforms. On FreeBSD, OpenBSD, and Dragonfly, only the S_IFIFO bit is set in st_mode with none of the permission bits set. Linux and NetBSD have both the read and write permission bits set for both ends of the pipe, so they're just as inaccurate but allowed the feature to work before this change. For now, we'll just assume our users know that stdin is intended for input and consider any pipe or regular file to be readable. If we were to be pedantic, we'd check O_RDONLY or O_RDWR description flags via the F_GETFL fcntl(2) op to determine if a pipe or socket is readable. However, I don't think it's worth the code to do so.
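The two checks discussed in this commit message can be compared in a short Python sketch. It is an illustration of the idea, not the code in LEI.pm: the function names `looks_like_input` and `is_open_for_reading` are hypothetical, and the second one shows the "pedantic" F_GETFL variant which the commit explicitly chose not to implement.

```python
import fcntl
import os
import stat

# Mask covering the access-mode bits of the file status flags.
ACCMODE = os.O_RDONLY | os.O_WRONLY | os.O_RDWR


def looks_like_input(fd):
    """Loose check: any pipe or regular file counts as readable.

    Permission bits in st_mode are ignored on purpose, since they are
    inconsistent for pipes across platforms.
    """
    mode = os.fstat(fd).st_mode
    return stat.S_ISFIFO(mode) or stat.S_ISREG(mode)


def is_open_for_reading(fd):
    """Pedantic check: inspect the access mode via the F_GETFL fcntl op."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    return (flags & ACCMODE) in (os.O_RDONLY, os.O_RDWR)
```

The loose check accepts both ends of a pipe, which is the accepted trade-off: users are assumed to know that stdin is meant for input.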
2023-10-08 lei: do not issue sto->done if socket is inactive
This fixes attempts to use an undefined value as an ARRAY reference in PublicInbox::IPC::wq_io_do
2023-10-06 ipc: lower-level send_cmd/recv_cmd handle EINTR directly
This ensures script/lei $send_cmd usage is EINTR-safe (since I prefer to avoid loading PublicInbox::IPC for startup time). Overall, it saves us some code, too.
2023-10-04 ds: make %AWAIT_PIDS a hash, not hashref
This is more persistent than some of the others and we don't swap it on use (unlike $nextq or $ToClose). In other words, it's helpful for communicating that its lifetime expectancy is close to %DescriptorMap and not like queue-type things such as $ToClose.
2023-10-04 lei: document and local-ize $OPT hashref
This variable needs to be visible to a callback running inside Getopt::Long, but we don't need to keep it around after LEI->optparse runs.
2023-10-04 treewide: use PublicInbox::Lock->new
This gets rid of a few bare bless statements and helps ensure we properly load Lock.pm before using it.
2023-10-04 lei: keep signals blocked on daemon shutdown
Since we completely shut down all workers before exiting, we no longer have to care about missing SIGCHLD wakeups during shutdown.
2023-10-04 lei: reuse PublicInbox::Config::noop
No need to define our own empty `noop' sub when PublicInbox::Config already has one and is loaded anyways.
2023-10-04 lei: get rid of l2m_progress PktOp callback
We already have an ->incr callback we can enhance to support multiple counters with a single request. Furthermore, we can just flatten the object graph by storing counters directly in the $lei object itself to reduce hash lookups.
2023-10-04 lei: do_env combines fchdir and local
This will make switching $lei contexts less error-prone and hopefully save us from some surprising bugs in the future. Followup-to: 759885e60e59 (lei: ensure --stdin sets %ENV and $current_lei, 2023-09-14)
2023-10-04 lei: close DirIdle (inotify) early at daemon shutdown
We don't want FS activity to delay lei-daemon shutdown.
2023-10-04 move all non-test @post_loop_do into named subs
Compared to Danga::Socket, our @post_loop_do API is designed to make it easier to avoid anonymous subs (and their potential for leaks in buggy old versions of Perl).
2023-10-04 ds: don't pass FD map to post_loop_do callback
It's not used by any post_loop_do callbacks anymore, and the underlying FD map is a global `our' variable accessible from anywhere, anyways.
2023-10-04 ds: hoist out close_non_busy
It's shared by both lei and public-facing daemons in using the ->busy callback.
2023-10-04 lei: drop stores explicitly at daemon shutdown
This will allow us to avoid unblocking signals during shutdown to simplify our code.
2023-10-03 lei: workers exit after they tell lei-daemon
We don't want workers continuing after their stdout has triggered EPIPE or some other write error. This fixes xt/lei-onion-convert.t to ensure the quit_waiter_pipe is fully-closed at daemon teardown during tests. Using the `exit' perlop still ensures OnDestroy callbacks will fire.
2023-10-02 lei: do label/keyword parsing in optparse
Calling vmd_mod_extract after optparse causes the implicit stdin-as-input functionality to fail, as the implicit stdin requires a lack of inputs remaining in argv after option parsing (along with a regular file or pipe as stdin). This allows commands such as `lei import -F eml +kw:seen' to work without `--stdin', `-' or any path names when importing a single message. This also ensures commands like `lei import +kw:seen' without any inputs/locations will fail reliably, as the extra +kw: arg won't be a false-positive.
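Why the ordering matters can be shown with a small Python sketch. It is a simplified model and not the Perl option parser: `split_vmd_args` and `wants_implicit_stdin` are hypothetical names, and only the `+kw:` and `+L:` forms mentioned in this log are handled.

```python
import re

# Matches the `+kw:NAME` and `+L:NAME` modifiers mentioned in the log.
_VMD_RE = re.compile(r"^\+(kw|L):(.+)$")


def split_vmd_args(args):
    """Separate keyword/label modifiers from real inputs (illustrative)."""
    vmd = {"kw": [], "L": []}
    inputs = []
    for arg in args:
        match = _VMD_RE.match(arg)
        if match:
            vmd[match.group(1)].append(match.group(2))
        else:
            inputs.append(arg)
    return vmd, inputs


def wants_implicit_stdin(args, stdin_is_file_or_pipe):
    """Implicit stdin needs zero remaining inputs plus a file or pipe."""
    _, inputs = split_vmd_args(args)
    return not inputs and stdin_is_file_or_pipe
```

If the modifiers were still sitting in the argument list when the "no inputs remain" test ran, `+kw:seen` would count as an input and implicit stdin would never trigger.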
2023-10-01 lei: deal with clients with blocked stderr
lei/store can get stuck if lei-daemon is blocked, and lei-daemon can get stuck when a client's stderr is redirected to a pager that isn't consumed. So start relying on Time::HiRes::alarm to generate SIGALRM to break out of the `print' perlop. Unfortunately, this isn't easy since Perl auto-restarts all writes, so we dup(2) the destination FD and close the copy in the SIGALRM handler to force `print' to return. Most programs (MUAs, editors, etc.) aren't equipped to deal with non-blocking STDERR, so we can't make the stderr file description non-blocking. Another way to solve this problem would be to have script/lei send a non-blocking pipe to lei-daemon in the {2} slot and make script/lei splice messages from the pipe to stderr. Unfortunately, that requires more work and forces more complexity into script/lei and slows down normal cases where stderr doesn't get blocked.
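The dup-then-close trick translates to Python, which also retries interrupted writes automatically (PEP 475). The sketch below is an analogue under stated assumptions, not the lei-daemon code: `write_with_timeout` is a hypothetical helper, it must run in the main thread because it installs a signal handler, and it uses `signal.setitimer` in place of Time::HiRes::alarm.

```python
import errno
import os
import signal


def write_with_timeout(fd, data, timeout):
    """Write to a blocking fd, giving up after `timeout` seconds.

    The write goes to a dup(2) of `fd`; the SIGALRM handler closes that
    copy, so the automatically retried write fails with EBADF while the
    original descriptor stays open and blocking.
    """
    copy = os.dup(fd)
    closed = False

    def on_alarm(signum, frame):
        nonlocal closed
        os.close(copy)
        closed = True

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        os.write(copy, data)
        return True
    except OSError as exc:
        if closed and exc.errno == errno.EBADF:
            return False  # timed out; the copy was closed under us
        raise
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel a pending alarm
        signal.signal(signal.SIGALRM, old_handler)
        if not closed:
            os.close(copy)
```

Only the duplicate is ever closed, so other programs sharing the original file description never see it switch to non-blocking mode.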
2023-10-01 lei: ->fail only allows integer exit codes
We can't use floating point numbers nor Inf/-Inf as exit codes; but we can allow `-1' as shorthand for 255.
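The rule stated here can be written down as a small validation helper. This Python sketch is illustrative only: `normalize_exit_code` is a hypothetical name, and limiting other values to the 0..255 range is an assumption that the log does not spell out.

```python
def normalize_exit_code(code):
    """Return an exit status in 0..255, accepting -1 as shorthand for 255.

    Floats (including inf/-inf) and other non-integers are rejected.
    The 0..255 bound for other values is an assumption of this sketch.
    """
    if isinstance(code, bool) or not isinstance(code, int):
        raise ValueError(f"exit code must be an integer: {code!r}")
    if code == -1:
        return 255
    if 0 <= code <= 255:
        return code
    raise ValueError(f"exit code out of range: {code!r}")
```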
2023-10-01 lei: correct exit signal
The first argument passed to Perl signal handlers is a signal name (e.g. "TERM") and not an integer that can be passed to the `exit' perlop. Thus we must look up the integer value from the POSIX module.
2023-10-01 lei rediff: `git diff -O<order-file>' support
We can't use the `-O' switch since it conflicts with --only|-O= to specify externals. Thus we'll introduce a more verbose `--order-file=FILE' option when running `git diff'.
2023-09-28 lei: don't gzip --rsyncable by default for mbox*
Using and memoizing the usability of `--rsyncable' is unsafe since pigz (or GNU gzip) can be uninstalled and leave a user with a non-rsync-aware gzip implementation in the long-running daemon. So we stop passing --rsyncable by default to pigz/gzip and no longer attempt to check for it (since it was a TOCTTOU error, anyways). Specifying --rsyncable explicitly didn't work, either, and ended up passing `1' to the gzip/pigz argv :x Finally, we now test --rsyncable on the CLI by adding support for it in `lei convert' and testing it in t/lei-convert.t
2023-09-26 spawn: add run_wait to simplify spawn+waitpid use
It's basically the `system' perlop with support for env overrides, redirects, chdir, rlimits, and setpgid.
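A comparable helper is easy to sketch in Python on top of `subprocess.run`. It only mirrors the idea and is not the PublicInbox::Spawn API: the `run_wait` signature below is hypothetical, and rlimits and setpgid are left out to keep the sketch short.

```python
import os
import subprocess
import sys


def run_wait(cmd, env=None, cwd=None, stdout=None, stderr=None):
    """Spawn `cmd`, wait for it, and return its exit status.

    `env` holds overrides merged on top of the current environment,
    `cwd` changes the child's directory, and `stdout`/`stderr` accept
    open files or descriptors as redirect targets.
    """
    merged_env = dict(os.environ)
    if env:
        merged_env.update(env)
    proc = subprocess.run(cmd, env=merged_env, cwd=cwd,
                          stdout=stdout, stderr=stderr)
    return proc.returncode


if __name__ == "__main__":
    # Usage: run a child with one extra variable and report its status.
    status = run_wait([sys.executable, "-c", "raise SystemExit(3)"],
                      env={"DEMO_VAR": "1"})
    print(status)
```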
2023-09-25 ds: force event_loop wakeup on final child death
Reaping children needs to keep the event_loop spinning another round when the @post_loop_do callback may be used to check on process exit during shutdown. This allows us to get rid of the hacky SetLoopTimeout calls in lei-daemon and XapHelper.pm during process shutdown if we're trying to wait for all PIDs to exit before leaving the event loop.
2023-09-24 lei: use scalar %SIG assignment
Perl v5.16.3 (and possibly some later versions) complain about this, but newer (v5.32.1) are fine with it. Fixes: e281363ba937 ("lei: ensure we run DESTROY|END at daemon exit w/ kqueue")