path: root/lib
Date Commit message
2021-02-08 search: use one git-rev-parse process for all dates
This is necessary to avoid slowdowns with pathological cases with many dates in the query, since each rev-parse invocation takes ~5ms. This is immeasurably slower with one open-ended range, but already faster with any closed range featuring two dates which require parsing via git.
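The idea of resolving every date with a single process can be sketched as follows. This is a Python illustration rather than the project's Perl, and the function names are hypothetical. It relies only on the documented behavior that `git rev-parse --since=<date>` prints a `--max-age=<epoch>` line; the subprocess call needs to run inside a git repository.

```python
import subprocess


def parse_max_age_lines(output):
    """Turn lines like '--max-age=1612742400' into integer epoch seconds."""
    prefix = "--max-age="
    return [int(line[len(prefix):])
            for line in output.splitlines()
            if line.startswith(prefix)]


def git_date_parse(dates, git="git"):
    """Resolve all dates with ONE git process instead of one per date.

    Hypothetical helper: must be run from inside a git repository.
    """
    argv = [git, "rev-parse"] + ["--since=" + d for d in dates]
    out = subprocess.run(argv, check=True, capture_output=True,
                         text=True).stdout
    return parse_max_age_lines(out)
```

With N dates in a query, this pays the process-spawn cost once rather than N times, which is the trade-off the commit message describes.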
2021-02-08 lei q: use git approxidate with d:, dt: and rt: ranges
Instead of having --(sent|received)-(before|after)=s command-line switches, we'll just try to make sense of argv so it's usable within parenthesized statements and such. Given the negligible performance penalty with Inline::C process spawning, we'll probably wire this up to the WWW interface, too. "d:" is for mairix compatibility. I don't know if "dt:" and "rt:" will be too useful, but they exist because of IMAP (and JMAP).
2021-02-08 git: implement date_parse method
Users are expected to be familiar with git's "approxidate" functionality for parsing dates, so we'll expose that in our UIs. Xapian itself has limited date parsing functionality and I can't expect users to learn it. This takes around 4-5ms on my aging workstation, so it'll probably be made acceptable for the WWW UI, even. libgit2 has a git__date_parse function which I expect to have less overhead, but it's only for internal use at the moment.
2021-02-08 lei: drop BSD::Resource usage
It's no longer necessary with the changes to stop doing FD passing in our backend. cf. commits 5180ed0a1cd65139 ("lei q: eliminate $not_done temporary git dir hack") and 7d440bf3667b8ef5 ("lei q: reorder internals to reduce FD passing")
2021-02-08 lei: avoid racing on unlink + bind + listen
When multiple lei(1) processes are starting in parallel without lei-daemon already running, it's possible for them to trample each other's socket path while trying to start lei-daemon. Lock errors.log before unlink/bind/listen. We'll add an extra connect(2) attempt to check if the starter lost the race. Without this change, a stress script like the following could easily cause problems:

	lei q -o ~/tmp/a foo ... &
	lei q -o ~/tmp/b bar ... &
	lei q -o ~/tmp/c quux ... &
	lei q -o ~/tmp/d baz ... &
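A minimal sketch of the lock-then-recheck pattern, written in Python rather than the project's Perl. The function name and return convention are hypothetical; only the ordering (take the lock, retry connect(2), then unlink/bind/listen) follows the commit message.

```python
import fcntl
import os
import socket


def listen_or_connect(sock_path, lock_path):
    """Return ("listener", sock) if we won the race, else ("client", sock)."""
    with open(lock_path, "a") as lock:
        # Serialize all would-be starters on one lock file.
        fcntl.flock(lock, fcntl.LOCK_EX)

        # Another starter may have finished while we waited for the lock.
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(sock_path)
            return "client", client
        except (FileNotFoundError, ConnectionRefusedError):
            client.close()

        # Missing or stale socket: safe to replace while holding the lock.
        try:
            os.unlink(sock_path)
        except FileNotFoundError:
            pass
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(sock_path)
        listener.listen(128)
        return "listener", listener
    # The lock is released when the file is closed on leaving the with-block.
```

Without the lock, two starters can both see a failed connect and one can unlink the path the other has just bound.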
2021-02-08 lei: start_pager: drop COLUMNS default
It shouldn't be needed since none of our subcommands will care or attempt to format output. Once "lei show" is implemented, we'll run "git show" directly on the result.
2021-02-08 ds: improve add_timer usability
Packing args into an arrayref is awkward and we may be using this API more in lei.
2021-02-08 tests: favor IPv6
IPv4 gets plenty of real-world coverage, and apparently there are Debian buildd hosts which lack IPv4(*). So ensure everything can work on IPv6 and not cause problems for odd setups. (*) https://bugs.debian.org/979432
2021-02-08 lei q: support --alert=CMD for early MUA users
For --mua users writing to lock-free -o MFOLDER destinations, we'll keep -WINCH and send an ASCII terminal bell when results are complete. This is intended to let early MUA spawners know when lei2mail is done writing results. We'll also support running arbitrary commands. It may be used to run play(1) (from SoX), handle pipelines+redirects (e.g. "/bin/sh -c 'echo search done | wall'") or other commands.
2021-02-08 lei q: SIGWINCH process group with the terminal
While using utime on the destination Maildir is enough for mutt to eventually notice new mail, "eventually" isn't good enough. Send a SIGWINCH to wake mutt (and likely other MUAs) immediately. This is more portable than relying on MUAs to support inotify or EVFILT_VNODE.
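The two wake-up mechanisms mentioned here can be sketched in Python (not the project's Perl). Both function names are hypothetical, and touching the "new" and "cur" subdirectories is an assumption about which paths an MUA watches; the commit only says utime is used on the destination Maildir.

```python
import os
import signal


def poke_maildir(maildir):
    """Bump timestamps so a polling MUA eventually notices new mail.

    Assumption: touching new/ and cur/ is enough for the MUA in question.
    """
    for sub in ("new", "cur"):
        os.utime(os.path.join(maildir, sub))  # times=None means "now"


def wake_process_group(pgid):
    """Send SIGWINCH to every process in the group, e.g. a waiting MUA.

    SIGWINCH is ignored by default, so processes that do not handle it
    are unaffected.
    """
    os.killpg(pgid, signal.SIGWINCH)
```

A caller holding a terminal file descriptor could obtain the foreground process group with `os.tcgetpgrp(fd)` and pass it to `wake_process_group`.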
2021-02-08 lei_xsearch: quiet Eml warnings from remote mboxrds
This will probably cover full Atom/HTML feed generation or any outputs which are order-dependent, but those aren't prioritized at the moment.
2021-02-08 lei q: improve remote mboxrd UX + MUA
For early MUA spawners using lock-free outputs, we need to wait on the startq pipe to silence progress reporting. For --augment users, we can start the MUA even earlier by creating Maildirs in the pre-augment phase. To improve progress reporting for non-MUA (or late-MUA) spawners, we'll no longer blindly append "--compressed" to the curl(1) command when POST-ing for the gzipped mboxrd. Furthermore, we'll overload stringify ('""') in LeiCurl to ensure the empty -d '' string shows up properly. v2: fix startq waiting with --threads. mset_progress is never shown with early MUA spawning. The plan is to still show progress when augmenting and deduping. This fixes all local search cases. A leftover debug bit is dropped, too.
2021-02-08 lei q: fix arbitrary --mua command handling
Perl doesn't seem to warn for shadowed variables, here :x
2021-02-08 lei import: support Maildirs
It seems to be working trivially, though I'm probably going to split out Maildir reading into a separate package rather than using LeiToMail.
2021-02-07 httpd/async: avoid unnecessary on-stack delete
While this doesn't fix a known problem, this was a risky construct in case somebody uses confess/longmess inside the user-supplied callback. cf. commit 0795b0906cc81f40 ("ds: guard against stack-not-refcounted quirk of Perl 5")
2021-02-07 imap: avoid unnecessary on-stack delete
None of the Content-Type attributes are long-lived (and unlikely to be memory intensive). While these callsites won't trigger $DB::args segfaults via confess or longmess, it'll make future code audits easier. cf. commit 0795b0906cc81f40 ("ds: guard against stack-not-refcounted quirk of Perl 5")
2021-02-07 lei: replace --thread with --threads
Nobody is expected to use long options, but for consistency with mairix(1), we'll use the pluralized option throughout (including existing PublicInbox::{Search,SearchView}). Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
2021-02-07 lei: remove --mua-cmd alias for --mua
While "mua-cmd" may be more accurate, nobody is expected to type 4 extra characters. It's a needless ambiguity with no precedent or prior art to follow. Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
2021-02-07 lei: more consistent IPC exit and error handling
We're able to propagate $? from wq_workers in a consistent manner, now.
2021-02-07 ipc: wq_do => wq_io_do
We will have a ->wq_do that doesn't pass FDs for I/O.
2021-02-07 Revert "ipc: add support for asynchronous callbacks"
This reverts commit a7e6a8cd68fb6d700337d8dbc7ee2c65ff3d2fc1. It turns out to be unworkable in the face of multiple producer processes, since the lock we make has no effect when calculating pipe capacity.
2021-02-07 xapcmd: avoid potential die surprise in children
Make some notes about sub usage, this may be converted to use workqueues once the cmsg dependency is dropped.
2021-02-07 ipc: trim down the Storable checks
It's distributed with Perl and our Makefile.PL even declares a dependency on it, just like Encode and all the Compress::* stuff.
2021-02-07 ipc: do not die inside wq_worker child process
die() in a child zips up the stack into the parent, which is undesirable behavior. We're going to exit anyways, just warn and let exit(1) happen due to $@ being set.
2021-02-07 spawn_pp: die more consistently in child
The default $SIG{__DIE__} inside a forked child doesn't actually do what we want it to do. We don't want it to zip up the stack the parent used, but instead want to exit the child process after warning.
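The same hazard exists in any language where a forked child inherits the parent's call stack. As a rough Python analogue (not the project's Perl; `run_in_child` is a hypothetical name), the child catches everything, warns, and exits without ever unwinding into frames that belong to the parent:

```python
import os
import sys
import traceback


def run_in_child(fn):
    """Run fn() in a forked child and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            fn()
            status = 0
        except BaseException:
            # Warn only; propagating would unwind the copy of the
            # parent's stack that the child inherited from fork().
            traceback.print_exc()
            sys.stderr.flush()
        finally:
            # _exit skips atexit handlers and buffers owned by the parent.
            os._exit(status)
    _, wstatus = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wstatus) if os.WIFEXITED(wstatus) else -1
```

The parent sees only an exit status, which is what the two commits above aim for with warn plus exit(1).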
2021-02-07 lei add-external: handle interrupts with --mirror
This also updates lei_xsearch to follow the same pattern for stopping curl(1) and tail(1) processes it spawns.
2021-02-07 spawn: pi_fork_exec: support "pgid"
We'll be using this to allow the "git clone" process hierarchy to be killed via Ctrl-C. This also fixes a long-standing bug in error reporting for the Inline::C version, because we're actually testing for errors, now! n.b. strlen(3) is officially async-signal-safe as of POSIX.1-2016, but I can't think of a reason any earlier implementation wouldn't be.
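Why a separate process group helps can be sketched in Python (this is not pi_fork_exec itself, and both function names are hypothetical): the child becomes its own group leader, so one killpg(2) call reaches it and everything it spawns.

```python
import os
import signal
import subprocess


def spawn_in_own_pgrp(argv):
    """Start argv as the leader of a new process group (setpgid(0, 0))."""
    return subprocess.Popen(argv, preexec_fn=os.setpgrp)


def kill_hierarchy(proc, sig=signal.SIGTERM):
    """Signal the whole group led by proc, then reap the leader."""
    os.killpg(proc.pid, sig)
    return proc.wait()
```

A SIGINT handler in the parent could call `kill_hierarchy(proc, signal.SIGINT)` to forward Ctrl-C to a long-running command and its helpers.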
2021-02-07 spawn: pi_fork_exec: restore parent sigmask in child
We continue to unblock SIGCHLD unconditionally, but also any signals not blocked by the parent (wq_worker). This will allow Ctrl-C (SIGINT) to stop "git clone" and allow git-clone cleanup to be performed and other long-running processes when pi_fork_exec supports setpgid(2). This won't affect existing daemons on systems with signalfd(2) or EVFILT_SIGNAL at all, since those run with signals blocked anyways.
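A small Python sketch of the mask handling described above, under the assumption that the caller has saved the signal mask it wants the child to run with. `spawn_restoring_mask` is a hypothetical name and this is not the Inline::C code.

```python
import signal
import subprocess


def spawn_restoring_mask(argv, parent_mask, **popen_kwargs):
    """Spawn argv with parent_mask as its signal mask, minus SIGCHLD.

    parent_mask: the set of signals the (non-worker) parent had blocked.
    SIGCHLD is always unblocked in the child.
    """
    child_mask = set(parent_mask) - {signal.SIGCHLD}

    def restore():
        # Runs in the child between fork() and exec(); the mask set here
        # is preserved across exec().
        signal.pthread_sigmask(signal.SIG_SETMASK, child_mask)

    return subprocess.Popen(argv, preexec_fn=restore, **popen_kwargs)
```

A worker that blocks signals for its own event loop can thus still start children that respond to SIGINT normally.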
2021-02-07 lei: remove short switch support for curl(1) options
In particular, -U and -u switches may conflict with diff(1) options we may need for "lei show" which will use solver remotely or locally.
2021-02-07 lei_curl: replace -K/--config with --curl-config
Seeing --config in the command-line for lei may mislead users into thinking we support config file overrides that way. Rename the option to --curl-config and drop the short switch for now.
2021-02-07 lei add-external: reject index and remote opts w/o mirror
Option combinations which make no sense should fail to prevent misunderstandings and avoid surprises.
2021-02-07 lei help: split out into separate file
We'll reword and improve formatting with non-breaking spaces ("\xa0") which is only replaced with SP after wrapping. Some terminology is shortened (e.g. "URL_OR_PATHNAME" => "LOCATION") to improve formatting. This also enables completion for -h/--help and lets us prioritize favored switch names while attempting to satisfy users relying on muscle memory from other tools.
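The non-breaking-space trick can be illustrated in Python (the project does this in Perl; `wrap_help` is a hypothetical name). Python's textwrap only breaks on ASCII whitespace, so words joined by "\xa0" stay on one line and the placeholder is swapped for a plain space afterwards.

```python
import textwrap

NBSP = "\xa0"


def wrap_help(text, width=72):
    """Wrap text, keeping NBSP-joined words (e.g. a switch and its
    argument) together, then turn each NBSP into a regular space."""
    wrapped = textwrap.fill(text, width=width, break_on_hyphens=False)
    return wrapped.replace(NBSP, " ")


# A switch and its argument stay on the same output line:
print(wrap_help("-o" + NBSP + "MFOLDER write results", width=10))
```

Replacing the placeholder before wrapping would defeat the purpose, since the wrapper would then be free to break between the switch and its argument.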
2021-02-07 lei: add-external --mirror support
This can be useful for users who want to clone and mirror an existing public-inbox. This doesn't have update support, yet, so users will need to run "git fetch && public-inbox-index" for now.
2021-02-07 treewide: replace confess with croak
The PublicInbox::Eml (and previously Email::MIME) use of confess was the primary (or only) culprit behind the lei2mail segfaults fixed by commit 0795b0906cc81f40 ("ds: guard against stack-not-refcounted quirk of Perl 5"). We never care about a backtrace when dealing with Eml objects anyways, so it was just a worthless waste of CPU cycles. We can also drop confess in a few other places. Since we only use Perl and Inline::C, users will never be without source and can replace s/croak/Carp::confess/ on a per-callsite basis to help report problems. It's also possible to use PERL5OPT=-MCarp=verbose in the environment, though that is still potentially risky. Link: https://public-inbox.org/meta/20210201082833.3293-1-e@80x24.org/
2021-02-07 tests: split out lei-daemon.t from lei.t
This makes it easier for hackers to find daemon-specific tests and forces us to always test both daemon and oneshot mode.
2021-02-07 t/tests: split out setup_public_inboxes sub
We'll probably use this in many more existing places and likely change non-lei tests to use it.
2021-02-07 tests: add test_lei wrapper, split out t/lei-import.t
This will make it easier to maintain and test lei going forward; we need to be testing against existing read-only daemons. We'll also save ourselves some boilerplate by exporting all the Test::More methods directly in TestCommon. We'll start using this by splitting out the latest "lei import" tests into their own file.
2021-02-07 lei_query: trim curl options
Get rid of short options which will or may conflict with some of our own. We may switch over to "git -c http.*" options since we need to run "git clone" and "git fetch" anyways.
2021-02-07 lei: abort lei_import worker on client abort
We'll stuff all the common wq key fields into the @WQ_KEYS array so it's easier to keep track of what to kill or reap.
2021-02-07 lei: fix completion of --no-kw / --no-keywords
We did not complete --no-* flags properly when multiple options are allowed.
2021-02-07 lei: favor "keywords" over "flags", test --no-kw
JMAP brain says "keywords", IMAP brain says "flags"; JMAP brain wins today. Since "keywords" is a bit long, support "kw" as a shortcut since there's no conflict and "kw:" will be our search prefix for looking up messages by keyword.
2021-02-07 lei_overview: drop unnecessary autoflush call
This was actually causing xt/lei-sigpipe.t failures, presumably due to reused/recycled workers with many externals.
2021-02-05 httpd/async: set O_NONBLOCK correctly
While Perl tie is nice for some things, getting IO::Handle->blocking to work transparently with it doesn't seem possible at the moment. Add some examples in t/spawn.t for future hackers. Fixes: 22e51bd9da476fa9 ("qspawn: switch to ProcessPipe via popen_rd")
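Setting the flag on the underlying descriptor, rather than through a wrapper object, looks like this in Python. This is a generic POSIX sketch with a hypothetical function name, not the code from t/spawn.t.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to the file status flags of a raw descriptor."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


r, w = os.pipe()
set_nonblocking(r)
try:
    os.read(r, 1)  # nothing written yet
except BlockingIOError:
    pass  # EAGAIN: the read returned immediately instead of blocking
```

Reading the current flags first and OR-ing in O_NONBLOCK preserves any other status flags already set on the descriptor.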
2021-02-05 lei import: initial implementation
Only tested with .eml files so far, but Maildir + IMAP will be supported.
2021-02-05 lei_xsearch: drop unused imports
Reaping is handled by the parent PublicInbox::IPC, and we have no business using PublicInbox::Import since LeiXSearch won't write to git directly (it will write via LeiStore).
2021-02-05 lei_query: remove unneeded dwaitpid import
All process management is handled elsewhere.
2021-02-05 lei q: eliminate $not_done temporary git dir hack
Another step towards simplifying lei internals. None of our current uses of ->wq_do involve FD passing, and the plan is to only rely on FD passing between lei-daemon and lei(1). Internally, it ought to be possible for lei-daemon internal bits to be ordered properly to not need FD passing.
2021-02-05 eml: handle warning ignores for lei
There's nothing we can do about bad emails in our search results, so quiet things down and don't fight the MUA for the terminal.
2021-02-05 lei q: reinstate early MUA spawn for Maildir
Once all files are written, we can use utime() to poke Maildirs to wake up MUAs that fail to account for nanosecond timestamp resolution.
2021-02-05 lei q: only start pager if output is to stdout
No need to be starting a pager if we're writing to a regular file.