path: root/script
Date Commit message
2021-02-08 lei: drop BSD::Resource usage
It's no longer necessary with the changes to stop doing FD passing in our backend. cf. commits 5180ed0a1cd65139 and 7d440bf3667b8ef5 ("lei q: eliminate $not_done temporary git dir hack") ("lei q: reorder internals to reduce FD passing")
2021-02-08 lei q: SIGWINCH process group with the terminal
While using utime on the destination Maildir is enough for mutt to eventually notice new mail, "eventually" isn't good enough. Send a SIGWINCH to wake mutt (and likely other MUAs) immediately. This is more portable than relying on MUAs to support inotify or EVFILT_VNODE.
2021-02-07 script/lei: avoid waitpid(-1, ...) to keep tests fast
We only spawn one process to be reaped at the moment. Tests will run the contents of script/* in the same process if possible, so any test scripts which spawn -httpd or other read-only daemons can cause us to stall with waitpid(-1, ...).
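A minimal Python sketch of the idea, not the Perl code from this commit (`spawn_exit` and `reap_child` are hypothetical names): waiting on one specific PID collects only the child we started, whereas waiting on -1 answers for whichever child of the process changes state first.

```python
import os
import sys

def spawn_exit(code):
    """Start a child Python process that exits with `code`; return its PID."""
    return os.spawnv(os.P_NOWAIT, sys.executable,
                     [sys.executable, "-c", f"import sys; sys.exit({code})"])

def reap_child(pid):
    """Block until *this* child exits and return its exit code.

    os.waitpid(-1, 0) would instead report whichever child exits first,
    including children started by unrelated code in the same process.
    """
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

Reaping by PID keeps the caller independent of any other children that happen to share the process, which is the situation the test suite creates.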
2021-02-07 init: lowercase -j for --jobs
This is taken from common implementations of make(1) and only affected people using the command-line help output.
2021-02-04 lei: use sleep(1) loop for infinite sleep
Perl may internally race and miss signals due to a lack of self-pipe / eventfd / signalfd / EVFILT_SIGNAL usage. While our event loop paths avoid these problems by using signalfd or EVFILT_SIGNAL, these sleep() calls are not within the event loop.
2021-02-01 lei: avoid ETOOMANYREFS, cleanup imports
As with PublicInbox::IPC, we'll attempt to bump RLIMIT_NOFILE and transparently work around ETOOMANYREFS. If that fails, we'll give the user a hint to bump RLIMIT_NOFILE since ETOOMANYREFS is an uncommon error which users may be unfamiliar with. Found while stress testing for segfaults.
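As a rough Python illustration of "bumping RLIMIT_NOFILE" (the commit itself is Perl; `bump_nofile` is a hypothetical helper), the soft limit can be raised to the hard limit without special privileges:

```python
import resource

def bump_nofile():
    """Try to raise the soft RLIMIT_NOFILE to the hard limit.

    Returns the soft limit in effect afterwards; on failure the limit is
    simply left unchanged.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            pass  # some platforms reject the hard limit as a soft limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

If the bump does not help, the only remaining option is the one the commit describes: tell the user which limit to raise.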
2021-02-01 lei: increase initial timeout
PublicInbox::Listener unconditionally sets O_NONBLOCK upon accept(), so we need a larger timeout under heavy load since there's no "dataready" accept filter on the listener. With O_NONBLOCK already set, we don't have to set it at ->event_step_init
2021-01-24 ipc: get rid of wq_set_recv_modes
Just open every FD as read/write. Perl (or any non-broken runtime) won't care and won't attempt to use F_SETFL to alter file description flags, since attempting to change those would lead to unpleasant side effects if the file description is shared with another process.
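The side effect mentioned above can be seen in a small Python sketch (an illustration only; `demo_shared_flags` is a hypothetical name). Flags set with F_SETFL live on the open file description, so every descriptor referring to it, whether duplicated, inherited, or passed to another process, observes the change:

```python
import fcntl
import os

def is_nonblocking(fd):
    """Report whether O_NONBLOCK is set on fd's open file description."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)

def demo_shared_flags():
    r, w = os.pipe()
    r2 = os.dup(r)  # r and r2 refer to the same open file description
    try:
        before = is_nonblocking(r2)
        flags = fcntl.fcntl(r, fcntl.F_GETFL)
        fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        after = is_nonblocking(r2)  # set through r, visible through r2
        return before, after
    finally:
        for fd in (r, r2, w):
            os.close(fd)
```

Here dup() stands in for a descriptor shared with another process; the sharing semantics are the same.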
2021-01-23 lei: support remote externals
Via curl(1), since that lets us easily use tor on a per-connection basis via LD_PRELOAD (torsocks) or proxy. We'll eventually support more curl options which can allow users to get past firewalls and deal with other odd network configurations.
2021-01-22 lei: remove INT/QUIT/TERM handlers, fix daemon EOF
The signal handlers on the client side were unnecessary; all we need is to handle socket EOF properly in the daemon by killing xsearch and l2m workers.
2021-01-15 lei: pass FD to CWD via cmsg, use fchdir on server
Perl chdir() automatically does fchdir(2) if given a file or directory handle since 5.8.8/5.10.0, so we can safely rely on it given our 5.10.1+ requirement. This means we no longer have to waste several milliseconds loading the Cwd.so and making stat() calls to ensure ENV{PWD} is correct and usable in the server. It also lets us work in directories that are no longer accessible via pathname.
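A short Python sketch of why entering a directory by descriptor is more robust than by pathname (an analogue of the Perl behaviour; `chdir_to_fd` and `demo` are hypothetical names, and the rename stands in for a directory that is "no longer accessible via pathname"):

```python
import os
import tempfile

def chdir_to_fd(dir_fd):
    """Server side: enter the client's directory without using a pathname."""
    os.fchdir(dir_fd)

def demo():
    old_cwd = os.open(".", os.O_RDONLY)
    with tempfile.TemporaryDirectory() as top:
        a, b = os.path.join(top, "a"), os.path.join(top, "b")
        os.mkdir(a)
        fd = os.open(a, os.O_RDONLY)   # "client" opens its working directory
        os.rename(a, b)                # the old pathname stops working
        try:
            chdir_to_fd(fd)            # "server" enters it by descriptor
            return os.path.samefile(".", b), os.path.exists(a)
        finally:
            os.fchdir(old_cwd)         # leave before the temp dir is removed
            os.close(fd)
            os.close(old_cwd)
```

The descriptor keeps working after the rename because it refers to the directory itself rather than to its name.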
2021-01-14 daemon+watch: fix localization of %SIG for non-signalfd users
It turns out "local" did not take effect in the way we used it: http://nntp.perl.org/group/perl.perl5.porters/258784 <CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com> Fortunately, none of the old use cases seem affected, unlike the previous lei change to ensure consistent SIGPIPE handling.
2021-01-14 lei: test SIGPIPE, stop xsearch workers on client abort
The new test ensures consistency between oneshot and client/daemon users. Cancelling an in-progress result now also stops xsearch workers to avoid wasted CPU and I/O. Note the lei->atfork_child_wq usage changes; they work around a bug in Perl 5: http://nntp.perl.org/group/perl.perl5.porters/258784 <CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com> This switches the internal protocol to SOCK_SEQPACKET AF_UNIX sockets so that messages from the daemon to the client (to run the pager and to kill/exit the client script) cannot be merged.
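The boundary-preserving property of SOCK_SEQPACKET can be sketched in Python as follows (Linux assumed; the message contents are made up for illustration and are not lei's actual wire format):

```python
import socket

def demo_seqpacket():
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    with a, b:
        a.send(b"pager")
        a.send(b"exit 0")
        # Each recv() returns exactly one message, even though both were
        # already queued. A SOCK_STREAM socket could return b"pagerexit 0".
        return b.recv(4096), b.recv(4096)
```

With record boundaries kept by the kernel, the receiver needs no delimiters or length prefixes to tell consecutive commands apart.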
2021-01-12 lei_xsearch: transfer 4 FDs internally, drop IO::FDPass
It's easier to make the code more generic by transferring all four FDs (std(in|out|err) + socket) instead of omitting stdin. We'll be reading from stdin on some imports, and possibly outputting to stdout, so omitting stdin now would needlessly complicate things. The differences with IO::FDPass "1" code paths and the "4" code paths used by Inline::C and Socket::MsgHdr are far too much to support and test at the moment.
2021-01-12 lei: run pager in client script
While most single keystrokes work fine when the pager is launched from the background daemon, Ctrl-C and WINCH can cause strangeness when connected to the wrong terminal.
2021-01-12 lei: get rid of client {pid} field
Using kill(2) is too dangerous since extremely long queries may mean the original PID of the aborted lei(1) client process gets recycled by a new process. It would be bad if the lei_xsearch worker process issued a kill on the wrong process. So just rely on sending the exit message via socket.
2021-01-12 ipc: start supporting sending/receiving more than 3 FDs
Actually, sending 4 FDs will be useful for lei internal xsearch work once we start accepting input from stdin. It won't be used with the lightweight lei(1) client, however. For WWW (eventually), a single FD may be enough.
2021-01-12 cmd_ipc: send FDs with buffer payload
For another step in syscall reduction, we'll support transferring 3 FDs and a buffer with a single sendmsg/recvmsg syscall using Socket::MsgHdr if available. Beyond script/lei itself, this will be used for internal IPC between search backends (perhaps with SOCK_SEQPACKET). There's a chance this could make it to the public-facing daemons, too. This adds an optional dependency on the Socket::MsgHdr package, available as libsocket-msghdr-perl on Debian-based distros (but not CentOS 7.x and FreeBSD 11.x, at least). Our Inline::C version in PublicInbox::Spawn remains the last choice for script/lei due to the high startup time, and IO::FDPass remains supported for non-Debian distros. Since the socket name prefix changes from 3 to 4, we'll also take this opportunity to make the argv+env buffer transfer less error-prone by relying on argc instead of designated delimiters.
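For readers unfamiliar with the mechanism, here is a Python sketch of sending descriptors and a data buffer in one sendmsg(2)/recvmsg(2) pair via SCM_RIGHTS. It is not the Socket::MsgHdr or Inline::C code; the function names and the payload are hypothetical.

```python
import array
import os
import socket

def send_fds_with_buf(sock, buf, fds):
    """Send `buf` and the descriptors `fds` in a single sendmsg(2) call."""
    anc = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    return sock.sendmsg([buf], anc)

def recv_fds_with_buf(sock, bufsize, maxfds):
    """Receive a buffer and up to `maxfds` descriptors in one recvmsg(2)."""
    fds = array.array("i")
    buf, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_LEN(maxfds * fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # ignore any truncated trailing integer
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return buf, list(fds)

def demo():
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    with a, b:
        send_fds_with_buf(a, b"argv+env", [r, w])
        buf, (r2, w2) = recv_fds_with_buf(b, 4096, 4)
    os.write(w2, b"hi")    # write through the received copy
    data = os.read(r, 2)   # read it back through the original
    for fd in (r, w, r2, w2):
        os.close(fd)
    return buf, data
```

The received descriptors are new numbers referring to the same open files, which is what lets a daemon write to the client's terminal.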
2021-01-12 ds: block signals when reaping
This lets us call dwaitpid long before a process exits and not have to wait around for it. This is advantageous for lei where we can run dwaitpid on the pager as soon as we spawn it, instead of waiting for a client socket to go away on DESTROY.
2021-01-04 lei: prefer IO::FDPass over our Inline::C recv_3fds
While our recv_3fds() implementation is more efficient syscall-wise, loading Inline takes nearly 50ms on my machine even after Inline::C memoizes the build. The current ~20ms in the fast path is barely acceptable to me, and 50ms would be unusable. Eventually, script/lei may invoke tcc(1) or cc(1) directly in the fast path, but it needs @INC for the slow path, at least. We'll encode the number of FDs into the socket name to allow parallel installations, for now.
2021-01-03 send and receive all 3 FDs at once
We'll always be transferring stdin, stdout, and stderr together for lei. Perhaps I lack imagination or foresight, but I can't think of a reason to send more or fewer FDs.
2021-01-03 spawn: support send_fd+recv_fd w/o IO::FDPass
IO::FDPass may be an extra installation burden I don't want to impose on users. We only support Linux and *BSDs, however.
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2021-01-01 on_destroy: support PID owner guard
Since we'll be forking for Xapian indexing and maybe other places, having a simple guard in place to ensure OnDestroy doesn't unexpectedly unlink files or similar is a safer option.
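A Python sketch of a PID owner guard (an analogue of the idea only; `OwnedCleanup` is a hypothetical class, not PublicInbox::OnDestroy): the cleanup callback records the PID that created it and refuses to run in any forked child.

```python
import os
import tempfile

class OwnedCleanup:
    """Run `callback` at most once, and only in the creating process."""

    def __init__(self, callback, *args):
        self.owner_pid = os.getpid()
        self.callback = callback
        self.args = args

    def run(self):
        if self.callback is not None and os.getpid() == self.owner_pid:
            cb, self.callback = self.callback, None
            cb(*self.args)

    def __del__(self):
        self.run()

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    guard = OwnedCleanup(os.unlink, path)
    pid = os.fork()
    if pid == 0:           # child: must not unlink the parent's file
        guard.run()
        os._exit(0)
    os.waitpid(pid, 0)
    survived_child = os.path.exists(path)
    guard.run()            # owner: the cleanup actually happens
    return survived_child, os.path.exists(path)
```

Without the PID check, the first child to drop its copy of the guard would delete a file the parent still needs.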
2021-01-01 lei: avoid Spawn package when starting daemon
Spawn was designed to speed up process spawning inside long-lived daemons with largish memory usage. It does not help for short-lived scripts which only exist to start and connect to a daemon. This change actually speeds up initial lei startup from ~190ms to ~140ms(!). Normal usage once the daemon is running is unaffected, at <20ms for help text. While we're in the area, simplify Cwd error message generation, too.
2021-01-01 syscall: SFD_NONBLOCK can be a constant, again
Since Perl exposes O_NONBLOCK as a constant, we can safely make SFD_NONBLOCK a constant, too. This is not the case for SFD_CLOEXEC, since O_CLOEXEC is not exposed by Perl despite being used internally in the interpreter.
2021-01-01 init: remove embedded UnlinkMe package
PublicInbox::OnDestroy can do the same thing
2021-01-01 spawn: move run_die here from PublicInbox::Import
It seems like a more logical place for it, but we'll favor the newly-added xsys_e() in tests for BAIL_OUT use.
2020-12-31 Merge remote-tracking branch 'origin/master' into lorelei
* origin/master: (58 commits)
  ds: flatten + reuse @events, epoll_wait style fixes
  ds: simplify EventLoop implementation
  check defined return value for localized slurp errors
  import: check for git->qx errors, clearer return values
  git: qx: avoid extra "local" for scalar context case
  search: remove {mset} option for ->mset method
  search: remove pointless {relevance} setting
  miscsearch: take reopen from Search and use it
  extsearch: unconditionally reopen on access
  extindex: allow using --all without EXTINDEX_DIR
  extindex: add undocumented --no-scan switch
  extindex: enable autoflush on STDOUT/STDERR
  extindex: various --watch signal handling fixes
  extindex: --watch for inotify-based updates
  eml: fix undefined vars on <Perl 5.28
  t/config: test --get-urlmatch for git <2.26
  default to CORE::warn in $SIG{__WARN__} handlers
  inbox: name variable for values loop iterator
  inboxidle: avoid needless syscalls on refresh
  inboxidle: clue users into resolving ENOSPC from inotify
  ...
2020-12-28 check defined return value for localized slurp errors
Reading from regular files (even on STDIN) can fail when dealing with flakey storage.
2020-12-27 extindex: allow using --all without EXTINDEX_DIR
If "--all" is specified to index all inboxes, implicitly choose the configured [extindex "all"] external index since "--all" is incompatible with specifying inbox directories on the command-line.
2020-12-27 extindex: add undocumented --no-scan switch
This makes diagnosing --watch problems easier when there are 50K inboxes by avoiding the lengthy scan (which is the reason --watch exists in the first place).
2020-12-27 extindex: enable autoflush on STDOUT/STDERR
With --watch, the output may be redirected to a pipe or socket which Perl may decide to buffer. Ensure Perl doesn't buffer these outputs since they can provide real-time status updates in response to signals or FS activity.
2020-12-27 extindex: --watch for inotify-based updates
This reuses existing InboxIdle infrastructure to update external indices based on per-inbox updates. This is an alternative to auto-updating external indices via the -index command and also works with existing uses of -mda and public-inbox-watch. Using inotify (or EVFILT_VNODE) allows watching thousands of inboxes without having to scan every single one at every invocation. This is especially beneficial in cases where an external index is not writable to the users writing to per-inbox indices.
2020-12-26 index: filter out indexlevel=basic from extindex
extindex users will likely want to use indexlevel=basic for per-inbox indices, however extindex itself doesn't support basic index level (yet?). Let's ensure we don't trip up extindex users who specify "-L basic" on the -index command-line.
2020-12-26 index: fix --no-fsync flag propagation to extindex
Negation in flag names is confusing, but trying to deviate from the DB_NO_SYNC name used by Xapian is also confusing.
2020-12-26 index: do not attach inbox to extindex unless updated
We'll count the number of log changes (regardless of index or unindex) and only attach inboxes to ExtSearchIdx objects when they get new work. We'll also reduce lock bouncing and only update external indices after all per-inbox indexing is done. This also updates existing v2 indexing/unindexing callers to be more consistent and ensures unindex log entries update per-inbox last commit information.
2020-12-26 index: disable --fast-noop on --reindex
These options make no sense when used together; just inform the user and move on since it's probably harmless to continue.
2020-12-26 init: use the return value of rel2abs_collapsed
Fixes: 9fcce78e40b0a7c6 ("script/public-inbox-*: favor caller-provided pathnames")
2020-12-25 index: support --fast-noop / -F switch
Note: I'm not sure if it's worth documenting and supporting this long-term. We can avoid taking locks for invocations of "index --all" and rely on high-resolution ctime (struct timespec st_ctim) comparisons of msgmap.sqlite3 and the packed-refs + refs/heads directory of the newest epoch. This cuts public-inbox-index invocations with "--all --no-update-extindex -L basic" down from 0.92s to 0.31s. The change with "-L medium" or "-L full" and (default) non-zero jobs is even more drastic, reducing a 12-13s no-op invocation down to the same 0.31s.
2020-12-25 inboxwritable: delay umask_prepare calls
This simplifies all ->with_umask callers and opens the door for further optimizations to delay/elide process spawning.
2020-12-24 index: update [extindex "all"] by default, support -E
In most cases, this ensures users will only have to opt-in to using -extindex once and won't have to issue extra commands to keep external indices up-to-date when using public-inbox-index. Since we support arbitrary numbers of external indices for ease-of-development, we'll support repeating "-E" ("--update-extindex=") in case users want to test changes in parallel.
2020-12-21 use rel2abs_collapsed when loading Inbox objects
We need to canonicalize paths for inboxes which do not have a newsgroup defined, otherwise ->eidx_key matches can fail in unexpected ways.
2020-12-20 script/public-inbox-*: favor caller-provided pathnames
We'll try to avoid calling Cwd::abs_path and use File::Spec->rel2abs instead, since abs_path will resolve symlinks the user specified on the command-line. Unfortunately, ->rel2abs still leaves "/.." and "/../" uncollapsed, so we still need to fall back to Cwd::abs_path in those cases. While we are at it, we'll also resolve inboxdir from deep inside v2 directories instead of misdetecting them as v1 bare git repos. In any case, stop matching directories by name and instead rely on the unique combination of st_dev + st_ino on stat() as we started doing in the extindex code.
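The distinction between keeping the caller's pathname and resolving it, and the st_dev + st_ino identity check, can be sketched in Python (an illustration with hypothetical names; note that Python's os.path.abspath also collapses ".." textually, which File::Spec->rel2abs does not):

```python
import os
import tempfile

def same_dir(a, b):
    """Compare directories by identity (st_dev + st_ino), not by name."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)

def demo():
    with tempfile.TemporaryDirectory() as top:
        top = os.path.realpath(top)
        real = os.path.join(top, "real")
        link = os.path.join(top, "link")
        os.mkdir(real)
        os.symlink(real, link)
        kept = os.path.abspath(link)       # keeps the caller-provided name
        resolved = os.path.realpath(link)  # follows the symlink
        return kept == link, resolved == real, same_dir(kept, resolved)
```

Two different names for one directory compare equal by device and inode, so matching on stat() results avoids treating the symlink and its target as separate inboxes.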
2020-12-19 lei: drop $SIG{__DIE__}, add oneshot fallbacks
We'll force stdout+stderr to be a pipe the spawning client controls, thus there's no need to lose error reporting by prematurely redirecting stdout+stderr to /dev/null. We can now rely exclusively on OnDestroy to write to syslog() on uncaught die failures. Also support falling back to oneshot mode on socket and cwd failures, since some commands may still be useful if the current working directory goes missing :P
2020-12-19 lei: micro-optimize startup time
We'll use lower-level Socket and avoid IO::Socket::UNIX, use Cwd::fastcwd(*), avoid IO::Handle->autoflush by using the select operator, and reuse buffer for reading the socket while avoiding unnecessary $/ localization in a tiny script. All these things add up to ~5-10 ms savings on my loaded system. (*) caveats about fastcwd won't apply since lei won't work in removed directories.
2020-12-19 rename LeiDaemon package to PublicInbox::LEI
"LEI" is an acronym, and ALL CAPS is consistent with existing PublicInbox::{IMAP,HTTP,NNTP,WWW} naming for top-level modules, 3 of the 4 old ones dealing directly with sockets and requests.
2020-12-19 lei: refine help/option parsing, implement "init"
There's a bunch of work in here as the foundations are being fleshed out. One of the UI/UX goals is to make it easy to keep built-in help and shell completions consistent.
2020-12-19 lei: use spawn (vfork + execve) for lazy start
This allows us to rely on FD_CLOEXEC being set on pipes from prove(1), so forgetting `daemon-stop' won't cause tests to hang. Unfortunately, daemon tests will be slower with this.
2020-12-19 lei: FD-passing and IPC basics
The start of lei, a Local Email Interface. It'll support a daemon via FD passing to avoid startup time penalties if IO::FDPass is installed, but fall back to a slow one-shot mode if not. Compared to a traditional socket daemon, FD passing should allow us to eventually do stuff like run "git show" and still have proper terminal support for pager and color.