path: root/lib/PublicInbox/LeiQuery.pm
Date Commit message
2023-10-17 lei: consolidate stdin slurp, fix warnings
We can share more code amongst stdin slurper (not streaming) commands. This also fixes uninitialized variable warnings when feeding an empty stdin to these commands.
2023-10-04 lei: do_env combines fchdir and local
This will make switching $lei contexts less error-prone and hopefully save us from some surprising bugs in the future. Followup-to: 759885e60e59 (lei: ensure --stdin sets %ENV and $current_lei, 2023-09-14)
2023-09-15 lei: ensure --stdin sets %ENV and $current_lei
--stdin usage means the current request can be delayed indefinitely while other requests with different %ENV come in. So make sure our warnings and %ENV can match non-stdin behavior. This probably fixes segfaults during process cleanup on OpenBSD since _lei_atfork_child uses non-localized assignment of $current_lei. But it could be another red herring. Either way, it's the right thing to do from an environment replication perspective.
2023-07-27 clone: allow running without DBI / DBD::SQLite
For historical reasons, LeiQuery.pm gets loaded with LEI.pm and -clone depends on LEI. So delay loading any DBI-dependent modules until querying is actually required.
2023-03-25 lei: improve bash completion involving colons
This fixes completions of labels (`+L:' for `lei import' and `L:' for `lei q') so they can appear anywhere in the command-line. I mainly wanted this for `lei import $URL +L:label', but this also fixes `lei forget-external' completions for URLs (which involve colons).
2022-12-02 lei_saved_search: expand only/include/exclude to absolute paths
While users may specify relative paths for convenience on the command-line, absolute paths are required for `lei up' since that (especially `lei up --all') could run from anywhere. Note that we need to do this when parsing the command-line options, since shortcuts for URL matching on URL path components are allowed for `lei q', and those same shortcuts may remain in effect across to `lei up' as the underlying external may be moved to a different URI host.
2022-11-16 lei q|up: limit default write --jobs for IMAP(S)
Eric Wong <e@80x24.org> wrote:
> Thanks for confirming things work as intended. I think the
> default should be clamped, though... 15 seems a bit high for
> smaller IMAP servers *shrug*
--------8<-------
Subject: [PATCH] lei q|up: limit default write --jobs for IMAP(S)
IMAP(S) servers often limit per-user connections, so avoid bumping into limits to improve the out-of-the-box experience. 4 seems like a conservative default, since we already chose that number for remote HTTP(S) endpoints.
Link: https://public-inbox.org/meta/20220910201958.GA12212@dcvr/
2022-10-01 lei: force --jobs=1,1 for SQLite < 3.8.3
SQLite prior to 3.8.3 did not reset its PRNG for generating unique temporary file names, so it would barf on t/lei-up.t occasionally due to O_EXCL -> EEXIST conflicts. This fixes occasional test failures under CentOS 7.x which ships SQLite 3.7.17.
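The version gate described above could look roughly like the following. This is an illustrative Python sketch, not the Perl code from the commit: the helper name `effective_jobs` and the reading of --jobs=1,1 as a (search, write) pair are assumptions made for the example.

```python
import sqlite3

# First SQLite release assumed safe for parallel jobs, per the commit message.
MIN_PARALLEL_SQLITE = (3, 8, 3)


def effective_jobs(requested, sqlite_version=sqlite3.sqlite_version_info):
    """Return a (search_jobs, write_jobs) pair (hypothetical helper).

    The pair is clamped to (1, 1) when the SQLite library is older than
    3.8.3; otherwise the requested pair is returned unchanged.
    """
    if tuple(sqlite_version) < MIN_PARALLEL_SQLITE:
        return (1, 1)
    return tuple(requested)
```

Passing the version explicitly, e.g. `effective_jobs((4, 2), (3, 7, 17))`, makes it possible to exercise the CentOS 7.x case without having that SQLite build installed.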
2022-07-01 tree-wide: Fix typo likelyhood
This was pointed out by the Debian package linter "lintian".
2021-11-10 lei q: disallow "\n" in argv[] elements
I don't expect this to be hit in real-world use via normal interactive shells. However, somebody could accidentally add "\n" in languages (e.g. Perl, C) where it's easy to pass "\n" in argv[].
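A check of this kind is small. The sketch below shows the idea in Python rather than Perl; the function name `check_argv` and the error wording are made up for illustration.

```python
def check_argv(argv):
    """Reject any argument that contains a newline (hypothetical helper).

    Returns the list unchanged when every element is acceptable.
    """
    for i, arg in enumerate(argv):
        if "\n" in arg:
            raise ValueError(f"argv[{i}] contains a newline")
    return argv
```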
2021-10-30 lei_to_mail: limit workers for text, reply and v2 outputs
"text" and "reply" outputs are intended for the pager, so parallelizing them is a waste of resources. v2 has shards, of course, so parallelizing writes to it is also a waste since the deduplication work is a bit more complex.
2021-10-25 lei_to_mail: write directly to mail_sync.sqlite3
No need to go through the lei/store process when we write mail_sync.sqlite3. This ought to reduce ENOBUFS errors (and the sleep workaround) on RAM-starved systems.
2021-10-19 lei up: support --exclude=, --no-(external|remote|local)
These can be used to temporarily disable using certain externals in case of temporary network failure or mount point unavailability.
2021-10-19 lei: use die for external and query handling
This allows "lei up" to continue processing unrelated externals if one output fails.
2021-10-16 lei: more eval guards for die on failure
Relying on $lei->fail is unsustainable since there'll always be parts of our code and dependencies which can trigger die() and break the event loop.
2021-09-25 lei forget-external: split into separate file
This was written before we had auto-loading, and forget-external should be a rarely-used command that's not worth loading at startup. Do some golfing while we're in the area, too.
2021-09-21 lei q: improve --limit behavior and progress
Avoid slurping gigantic (e.g. 100000) result sets into a single response if a giant limit is specified, and instead use 10000 as a window for the mset with a given offset. We'll also warn and hint about the --limit= switch when the estimated result set is larger than the default limit.
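The windowing idea can be sketched as below. This is Python for illustration only and does not reflect how the Perl code iterates over a Xapian mset; the generator name `windows` is hypothetical, and only the 10000 window size comes from the commit message.

```python
WINDOW = 10000  # window size named in the commit message


def windows(limit, offset=0, window=WINDOW):
    """Yield (offset, count) pairs that cover `limit` results in chunks.

    Each pair describes one bounded request, so no single response has to
    hold more than `window` results.
    """
    remaining = limit
    while remaining > 0:
        count = min(window, remaining)
        yield (offset, count)
        offset += count
        remaining -= count
```

With a limit of 25000, this produces three requests: two full windows and one of 5000.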
2021-09-10 lei_query: fix comment about %lei2curl commands
Just a typo.
2021-08-14 lei: hexdigest mocks account for unwanted headers
PublicInbox::Import never imports @UNWANTED_HEADERS, so ensure our mock blob OIDs do the same. This ought to prevent duplicates if the PSGI mboxrd download starts setting "X-Status: F" like "lei q -tt .."
2021-08-11 lei: attempt to canonicalize away "/../" pathnames
As documented, File::Spec->canonpath does not canonicalize "/../". While we want to do our best to preserve symlinks in pathnames, leaving "/../" can mislead our inotify|kqueue usage.
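One way to handle this trade-off is to resolve a path fully only when it contains a ".." component and to leave every other path untouched, so symlinks are preserved where possible. The Python sketch below illustrates that approach under that assumption; it is not the Perl implementation, and the function name `canonicalize` is hypothetical.

```python
import os


def canonicalize(path):
    """Remove ".." components from an existing path (hypothetical helper).

    Paths without ".." are returned as-is so that symlinks in them are
    preserved. Paths with ".." are resolved through the filesystem, because
    collapsing ".." lexically can give the wrong answer when a parent
    component is a symlink.
    """
    has_dotdot = ".." in path.split(os.sep)
    if has_dotdot and os.path.exists(path):
        return os.path.realpath(path)
    return path
```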
2021-05-30 lei q: --sort and --save|v2 are incompatible
Saved searches rely on (reverse) docid ordering for efficient incremental results, and sorting any other way prevents that. Update comment description in LeiQuery while we're at it: "ls-query" and "rm-query" are "ls-search" and "forget-search", respectively, and "mv-query" is implicit with "edit-search"
2021-05-28 lei: restore working directory in more places
Every tick of the event loop can change the working directory, so we need to restore it for every client if they operate in different directories. This would be easier if we had openat(2) and friends in Perl; but Inline::C is practically required for lei, now.
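The general pattern of switching into a client's directory through a held directory descriptor, then switching back, can be sketched in Python as follows. This is only an illustration of the fchdir technique on a Unix-like system; the context manager name `in_directory` is hypothetical and the daemon's actual Perl code is not shown here.

```python
import os
from contextlib import contextmanager


@contextmanager
def in_directory(dir_fd):
    """Run a block with the working directory set to `dir_fd`.

    The previous working directory is held open as a descriptor and
    restored afterwards, even if the block raises.
    """
    saved = os.open(".", os.O_RDONLY)
    try:
        os.fchdir(dir_fd)
        yield
    finally:
        os.fchdir(saved)
        os.close(saved)
```

A caller that keeps one open directory descriptor per client could wrap each unit of work in `with in_directory(client_fd): ...` so that relative pathnames resolve against that client's directory.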
2021-05-03 lei <q|up>: writes to Maildirs and IMAP use mail-sync
This will allow keyword updates from other folders to propagate to folders where search results may be duplicated.
2021-04-20 lei_query: avoid POSIX::lround for older Perls
POSIX.pm shipped with Perl 5.16.3 did not support lround, at least. So just rely on built-in core functions.
2021-04-20 lei up: support --all=local
Users may wish to update several saved searches at once. We can support parallel updates in lei-daemon so users won't have to do it themselves via xargs or similar. Supporting IMAP outputs would be significantly more involved since we'd have to pre-authenticate for every single IMAP output before entering the redispatch loop.
2021-04-17 lei_query: fix relative path handling on --stdin
Since --stdin could be waiting on user keyboard input or something else slow, we handle it in the event loop. That means other commands can change the working directory of lei-daemon while a query is being trickled to us via stdin. Rearranging query handling internals to delay opening the --output destination in commit 26e0fe73de93f451 meant another command could throw off our --output pathname if it is relative. Fixes: 26e0fe73de93f451 ("lei_query: rearrange internals to capture query early")
2021-04-16 lei q: --save preserves relative time queries
Somebody may want a saved search which consistently asks for messages within a rolling time period window. In other words, we want to support using "lei q --save dt:last.week.." and keep the "dt:last.week.." relative to whenever "lei up" is run. This ensures relative date-time specifications get used in the future rather than converting into an absolute date-time from the initial "lei q" invocation.
2021-04-13 lei: add "lei up" to complement "lei q --save"
The command isn't finalized, yet, but it's intended to update an existing saved search.
2021-04-13 lei q: start wiring up saved search
This will have an over.sqlite3 for content-based deduplication. It may exhibit ibxish methods, so serving a read-only (or even R/W) IMAP instance or displaying HTML isn't outside the realm of possibility.
2021-04-13 lei_query: rearrange internals to capture query early
To support saved search, we need the query string available to us before we set up LeiDedupe via (LeiOverview || LeiToMail).
2021-04-01 lei_query: remove unnecessary V2Writable require
AFAIK that was only used for nproc detection, and nproc is handled by PublicInbox::IPC, nowadays.
2021-03-30 lei q: avoid redundant default setting for sort with l2m
No point in munging user-supplied $lei->{opt} when %mset_opt exists. We'll be depending on docid being in descending order for saved search support.
2021-03-27 lei_query: hoist out lxs_prepare
We'll be reusing it for "lei blob", as it makes sense to keep handling of --only, --include, etc. switches consistent.
2021-03-26 lei q: skip lei/store->write_prepare for JSON outputs
JSON outputs won't write to lei/store at all, so there's no point in forking the store worker if it's not already running. LeiSearch object ($lse) is also fork-safe until it opens a persistent FD for Xapian/SQLite so we can unconditionally carry it across fork.
2021-03-24 lei: improve management around short-lived workers
Instead of creating a short-lived circular reference, ensure they don't exist in the first place. Note the following changes to hold an extra ref to $sto:
    - $self->_lei_store(1)->write_prepare($self);
    + my $sto = $self->_lei_store(1);
    + $sto->write_prepare($self);
I'm not a perlguts expert, but I actually wanted to switch to the one-line version for LeiImport, but xt/lei-auth-fail.t was getting stuck for some reason. It seems the extra ref to the LeiStore ($sto) object is necessary.
2021-03-21 lei: tie ALE lifetime to config file
This should make a future change to "lei import" work more nicely, since we'll be needing ALE to vivify external-only messages upon explicit "lei import".
2021-03-21 lei: All Local Externals: bare git dir for alternates
This will be used for keyword (and label) storage for externals. We'll be using this to ensure we don't redundantly auto-import messages into lei/store if they're already in a local external (they can still be imported explicitly via "lei import").
2021-03-19 lei q: -I/--include overrides --no-(external|local|remote)
Assume that anybody using -I/--include for external locations will want to override --no-$FOO if they're explicitly including a location. With some effort, we could make it order-dependent (e.g. "-I $LOCATION --no-$FOO" and "--no-$FOO -I $LOCATION" behave differently). However that's not straightforward when using Getopt::Long to parse command-line options into a hashref. I'm also not sure if order-dependent switches are a desirable UI/UX quality.
2021-03-05 lei q: fix --import-before default and FIFO output
commit 6c551bffd75afb41d9b5e4774068abe7e06ed0e7 ("lei q: --import-augment for mbox and mbox.gz") added a check in _pre_augment_mbox for the option being a ref() to distinguish between default values and user-supplied values (which are non-ref SCALARs from Getopt::Long). However, LeiQuery failed to use a SCALAR ref as the default value, making the check in _pre_augment_mbox useless. We now update LeiQuery to use \1 instead of 1 as the default value so "lei q -f mboxrd ..." to stdout works once again. Unfortunately, testing with redirects pointed to regular files didn't trigger the code paths being updated. Testing with a FIFO revealed further bugs in the FIFO handling code which are also fixed in this commit. We'll also update the $lei->out error message to be less-specific about "stdout" and use the term "output", instead, since LeiToMail replaces stdout for all mbox outputs.
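The trick of marking a default by its type, so it can be told apart from a user-supplied value that happens to be equal, can be illustrated outside Perl too. The Python sketch below uses a small wrapper class in place of a SCALAR ref; the names `Default`, `is_user_supplied` and `opt_value` are hypothetical and exist only for this example.

```python
class Default:
    """Wraps a built-in default, playing the role of \\1 versus 1."""

    def __init__(self, value):
        self.value = value


def is_user_supplied(opt):
    """True when the option came from the command line, not a default."""
    return not isinstance(opt, Default)


def opt_value(opt):
    """Return the plain value regardless of where it came from."""
    return opt.value if isinstance(opt, Default) else opt


# The default is wrapped; anything parsed from argv would be a bare value.
opts = {"import-before": Default(1)}
```

If the default were stored as a bare `1`, `is_user_supplied` could not tell it apart from an explicit user choice, which is the kind of bug the commit describes.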
2021-03-04 lei q: s/import-augment/import-before/g
Since this importing of keywords is active even when --augment isn't specified, calling it --import-before seems more appropriate. In the future, this will likely default to adding unseen emails to lei/store, not just updating keywords. Link: https://public-inbox.org/meta/20210303222930.GA18597@dcvr/T/
2021-03-04 lei q: import flags when clobbering/augmenting Maildirs
This will eventually be supported for other mail stores, but Maildir is the easiest to test and support, here. This lets us avoid a situation where flag changes get lost between search results.
2021-02-25 lei q: auto-memoize remote messages into lei/store
This lets users avoid network traffic on subsequent searches at the expense of local disk space. --no-import-remote may be specified to reverse this trade-off for users with little storage.
2021-02-23 lei q: reduce default lei2mail workers
While disk I/O is typically buffered for good scheduling, git blob decoding uses a non-trivial amount of CPU time and it helps to leave some CPU available for it.
2021-02-22 lei_auth: trim and remove leftover worker code
LeiAuth is no longer a separate worker process. Instead, it's used directly by LeiToMail and LeiImport for sharing auth info from the first worker to the rest of the workers, using lei-daemon as a message router. So drop the old code to reduce human cognitive load and interpreter memory overhead.
2021-02-22 lei q: reduce wasted IMAP connection for auth
We can rework the first lei2mail worker to authenticate, and then share auth info with the rest of the lei2mail workers. As with "lei import", this uses PktOp and lei-daemon to share updated credentials between the first and subsequent l2m workers.
2021-02-21 ipc: support setting a locked number of WQ workers
We can use this to ensure sharded work doesn't do unexpected things if workers are added/removed. We currently don't increase/decrease workers once a workqueue is started, but non-lei code (-httpd/imapd) may start doing so. This also fixes a bug where lei2mail workers could not be adjusted via --jobs on the command-line.
2021-02-21 lei q: support IMAP/IMAPS --output destinations
Augment (and dedupe) aren't parallel, yet, so it's more sensitive to high-latency networks.
2021-02-11 search: use git approxidate in WWW and "lei q --stdin"
This greatly improves the usability of d:, dt:, and rt: search prefixes for users already familiar with git's "approxidate" feature. That is, users familiar with the --(since|after|until|before)= options in git-log(1) and similar commands will be able to use those dates in the WWW UI.
2021-02-08 lei q: use git approxidate with d:, dt: and rt: ranges
Instead of having --(sent|received)-(before|after)=s command-line switches, we'll just try to make sense of argv so it's usable within parenthesized statements and such. Given the negligible performance penalty with Inline::C process spawning, we'll probably wire this up to the WWW interface, too. "d:" is for mairix compatibility. I don't know if "dt:" and "rt:" will be too useful, but they exist because of IMAP (and JMAP).
2021-02-07 lei: replace --thread with --threads
Nobody is expected to use long options, but for consistency with mairix(1), we'll use the pluralized option throughout (including existing PublicInbox::{Search,SearchView}). Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/