path: root/script
Date Commit message
2021-11-02 init: respect umask when creating description
I noticed a description for a new inbox had st_mode=0600.
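A minimal sketch of the general idea, in Python rather than the project's Perl, and not the actual patch: requesting mode 0666 at creation time lets the kernel apply the caller's umask, instead of the file ending up with a fixed mode such as 0600. The helper name `write_description` is hypothetical.

```python
import os
import stat
import tempfile


def write_description(path: str, text: str) -> int:
    """Create `path` asking for mode 0666; the kernel masks it with the umask.

    Returns the permission bits the file ended up with.
    (Hypothetical helper for illustration only.)
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    with os.fdopen(fd, "w") as fh:
        fh.write(text + "\n")
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    old = os.umask(0o022)  # a common default umask
    try:
        with tempfile.TemporaryDirectory() as tmpdir:
            mode = write_description(os.path.join(tmpdir, "description"),
                                     "example inbox")
            print(oct(mode))
    finally:
        os.umask(old)  # restore the caller's umask
```

The mode argument only applies when the file is newly created, which is why the sketch writes into a fresh temporary directory.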
2021-10-15 lei: use send() perlop for signals
This may save us a small bit of startup time since there's fewer args and opcodes should be smaller.
2021-10-14 lei add-external --mirror: respect client umask
While lei is intended for non-public mail and runs umask(077) by default, externals are one area which can safely defer to the user's umask. Instead of sending it unconditionally with every command, only have lei-daemon request it when necessary.
2021-10-13 fetch: support --try-remote/-T for alternate remote names
This allows -fetch to work out-of-the-box with the grokmirror 2.x default of "_grokmirror".
2021-10-09 extindex: support --reindex --fast
This mode only checks history for missed/stale messages and doesn't attempt to reindex messages which are already indexed.
2021-10-05 index: --reindex w/ --{since,until,before,after}
This lets administrators reindex specific time ranges according to git "approxidate" formats. These arguments are passed directly to underlying git-log(1) invocations and may still reach into old epochs. Since these options rely on git committer dates (which we infer from the most recent Received: header), they are not guaranteed to be strictly tied to git history and it's possible to over/under-reindex some messages. It's probably not a major problem in practice, though; reindexing a few extra messages is generally harmless aside from some extra device wear. Since this currently relies on git-log, these options do not affect -extindex, yet.
2021-10-01 ds: simplify signalfd use
Since signalfd is often combined with our event loop, give it a convenient API and reduce the code duplication required to use it. EventLoop is replaced with ::event_loop to allow consistent parameter passing and avoid needlessly passing the package name on stack. We also avoid exporting SFD_NONBLOCK since it's the only flag we support. There's no sense in having the memory overhead of a constant function when it's in cold code.
2021-09-24 clone|--mirror: support --epoch=RANGE for partial clones
Partial (v2) clones should be a useful addition for users wanting to conserve storage while having fast access to recent messages. Continuing work started in 876e74283ff3 (fetch: ignore non-writable epoch dirs, 2021-09-17), this creates bare, read-only epoch git repos. These git repos have the remotes pre-configured, but do not fetch any objects. The goal is to allow users to set the writable bit on a previously-skipped epoch and start fetching it. Shell completion support may not be necessary given how short the epoch ranges are, here. Cc: Luis Chamberlain <mcgrof@kernel.org> Link: https://public-inbox.org/meta/20210917002204.GA13112@dcvr/T/#u
2021-09-22 lei: drop redundant WQ EOF callbacks
Redundant code is noise and therefore confusing :<
2021-09-22 script/lei: describe purpose of sleep loop
It looks dumb, but I'm not about to take a runtime penalty to use signalfd|EVFILT_SIGNAL, here, either.
2021-09-21 script/lei: handle SIGTSTP and SIGCONT
Sometimes it's useful to pause an expensive query or refresh-mail-sync to do something else. While lei-daemon and lei/store can't be paused since they're shared across clients, per-invocation WQ workers can be paused safely using the unblockable SIGSTOP. While we're at it, drop the ETOOMANYREFS hint since it hasn't been a problem since we drastically reduced FD passing early in development.
2021-09-17 script/lei: umask(077) before execve
While my MUA also runs umask(077) unconditionally, not all MUAs do. Additionally, pagers may support writing their buffers to disk, so ensure anything else we spawn has umask(077).
2021-09-15 support -C (chdir) for most non-daemon commands
Because make(1), git(1), tar(1) all support -C in this form, as do our newer commands such as lei, public-inbox-{clone,fetch}.
2021-09-15 fetch: support --exit-code switch
As noted in the new manpage entry, this is useful for avoiding public-inbox-index invocations when there's nothing to update. We use 127 to match "grok-pull", and also because it doesn't conflict with any of the current curl(1) exit codes.
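As a rough illustration of how a wrapper might use this switch, the Python sketch below skips indexing when the fetch reports nothing new. It assumes, from the message above, that exit status 127 means "nothing to update" and that `-C` selects the inbox directory; the function name, the return strings, and the injectable `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Exit status assumed (from the commit message) to mean "nothing to update".
NOTHING_TO_UPDATE = 127


def fetch_then_index(inbox_dir: str,
                     run: Callable[[Sequence[str]], int] = subprocess.call) -> str:
    """Fetch an inbox and only index it when the fetch brought in something new.

    `run` is injectable so the control flow can be exercised without
    public-inbox installed.
    """
    rc = run(["public-inbox-fetch", "--exit-code", "-C", inbox_dir])
    if rc == NOTHING_TO_UPDATE:
        return "up-to-date"      # skip the indexing step entirely
    if rc != 0:
        return "fetch-failed"    # some other error; do not index
    rc = run(["public-inbox-index", inbox_dir])
    return "indexed" if rc == 0 else "index-failed"
```

Keeping the process runner as a parameter is only a convenience for testing; a cron job could call `fetch_then_index("/path/to/inbox")` directly.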
2021-09-15 multi_git: hoist out common epoch/alternates handling
IMHO, this greatly improves code sharing and organization between v2, extindex, and lei/store. Common git-related logic for these is lightly-refactored and easier to reason about. The impetus for this big change was to ensure inboxes created+managed by public-inbox-{clone,fetch} could have alternates and configs set up properly without depending on SQLite (via V2Writable). This change does that while making old code shorter and better factored.
2021-09-12 init: set a useful description
"Unnamed repository" for v1 inboxes was misleading, and having a non-existent description for v2 was equally annoying, so set a short description based on the primary address. We remove descriptions when setting up new test inboxes to preserve the behavior of the t/lei-mirror.t test case.
2021-09-12 new public-inbox-{clone,fetch} commands
Setting up and maintaining git-only mirrors of v2 inboxes is complex since multiple commands are required to clone and fetch into epochs. Unlike grokmirror, these commands do not require any configuration. Instead, they rely on existing git config files and work like "git clone --mirror" and "git fetch", respectively. Like grokmirror, they use manifest.js.gz, but only on a per-inbox basis so users won't have to clone every inbox of a large instance nor edit config files to include/exclude inboxes they're interested in.
2021-09-11 lei: fix handling of broken lei.saved-search config files
lei shouldn't become unusable if a config file is invalid. Instead, show the "git config" stderr and attempt to continue gracefully. Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Link: https://public-inbox.org/meta/20210910141157.6u5adehpx7wftkor@meerkat.local/
2021-08-11 treewide: use *nix-specific dirname regexps
None of our code elsewhere accounts for non-*nix pathnames and it's not worth our time to start. So stop wasting CPU cycles giving the illusion that we'd care about non-*nix pathnames.
2021-08-08 httpd: set psgi.url_scheme to 'https' for TLS listeners
For users using the native TLS functionality of -httpd (instead of using nginx + Plack::Middleware::ReverseProxy), psgi.url_scheme=http was wrong and would lead to improper redirects.
2021-08-04 extindex: fix boost with partial runs
Boost relies on knowledge of all inboxes in a given config file to work properly. So while we support indexing a subset of inboxes, we must still account for boost in inboxes we're not indexing. So split internal inbox groups into "known" and "active", where previously we only cared for inboxes which were being actively indexed. Furthermore, boost checks need to be applied when a message arrives in different inboxes across multiple invocations. Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Link: https://public-inbox.org/meta/20210802204058.vscbxs5q7xyolyu2@nitro.local/
2021-07-31 extindex: -xcpdb and -compact support
Since extindex uses Xapian shards in a similar way to v2 inboxes, we'll support -xcpdb (reshard+upgrade) and -compact all the same to give admins tuning+upgrade options.
2021-07-28 lei: die on ECONNRESET
ECONNRESET should be rare on a private local socket, and if we hit it, it's because we're hitting the listen() limit.
2021-07-28 treewide: s/sequential_shard/sequential-shard/g
The underscore variant was never documented and maintaining the difference between the command-line and internal hash is not worth it.
2021-07-25 init: support git <2.30 for "-c KEY=VALUE" args
It turns out `--fixed-value' is a relatively new git-config(1) feature in git 2.30+ (December 2020). So use the quotemeta perlop for now since it seems compatible-enough for POSIX ERE used by git.
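The escaping idea can be sketched in Python, whose `re.escape` plays a role similar to Perl's quotemeta (the two do not escape exactly the same set of characters). The helper name `literal_pattern` and the `^...$` anchoring are assumptions made for illustration, not details taken from the patch.

```python
import re


def literal_pattern(value: str) -> str:
    """Build an anchored regex that matches `value` literally.

    Hypothetical helper: escaping metacharacters lets a regex-only
    interface behave like a fixed-string comparison.
    """
    return "^" + re.escape(value) + "$"


if __name__ == "__main__":
    pat = literal_pattern("a.b+c")
    print(bool(re.search(pat, "a.b+c")))  # the literal value matches
    print(bool(re.search(pat, "aXb+c")))  # "." no longer matches any character
```

Without the escaping step, a value containing `.` or `+` would be interpreted as a pattern and could match unrelated existing values.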
2021-07-25 extindex: support --dedupe[=MSGID]
Sometimes I just want to dedupe a single Message-ID to test something, and this lets me do it. This patch appears to do what it's supposed to. But it also appears to be finding duplicates that were previously missed. That's a good thing, but I wish I understood what seems to be fixed :x I'm not sure why the previous ExtSearchIdx.pm (blob 357312b8) was causing messages to be missed, even, and why this patch seems to fix it... And it's not infinite looping, either. Anyways, before this patch, "-extindex --dedupe" was taking ~5 min to no-op every message (after the initial full --dedupe run which took over a day to run). No-op --dedupes now take just under 2 hours to scan every single cross-posted message for a no-op dedupe. The initial dedupe took nearly 44 hours on my system for <https://yhbt.net/lore/all/> due to SATA-2 TLC SSD latency on 3 gigantic Xapian shards. Running --dedupe with this change seems to prevent /BUG\?.*?not deduplicated properly/ stderr messages from being triggered by View.pm. Current versions of -extindex do not seem susceptible to introducing duplicates.
2021-07-22 init: allow arbitrary key-values via -c KEY=VALUE
This won't blindly append identical key=values, but allows specifying multiple, different key=value pairs as long as the values are different.
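The behaviour described above can be modelled with a small Python sketch: each KEY=VALUE pair is appended to a multi-valued key only if that exact value is not already present. The function name `merge_config` and the in-memory dict representation are hypothetical; the real command operates on git config files.

```python
from typing import Dict, List


def merge_config(existing: Dict[str, List[str]],
                 pairs: List[str]) -> Dict[str, List[str]]:
    """Return a copy of `existing` with KEY=VALUE pairs added without duplicates.

    A key may hold several different values, but an identical
    key/value pair is never appended twice.
    """
    merged = {key: list(values) for key, values in existing.items()}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("expected KEY=VALUE, got %r" % pair)
        values = merged.setdefault(key, [])
        if value not in values:  # skip identical key=value pairs
            values.append(value)
    return merged
```

For example, `merge_config({"sec.key": ["a"]}, ["sec.key=a", "sec.key=b"])` keeps a single `"a"` and adds `"b"`, leaving the input dict untouched.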
2021-07-20 httpd: fix SIGHUP by invalidating cache on reload
Since we require separate PublicInbox::HTTPD instances for each listen socket address (in order to support {SERVER_<NAME|PORT>} for PSGI env), the old cache needed to be invalidated on rare app refreshes. SIGHUP has always been broken in -httpd (but not -imapd or -nntpd) due to this cache. Update the daemon documentation and 5.10.1-ize some bits while we're in the area.
2021-07-06 extindex: implement --dedupe to fix old extindices
This is intended to fix older indices that had deduplication bugs for matching content. It'll also make dealing with future changes to ContentHash easier since that's never guaranteed stable. It also supports --dry-run to print changes only without making them.
2021-05-28 script/lei: drop leftover message about fallback
Non-daemon lei isn't implemented, anymore.
2021-05-26 lei: require Socket::MsgHdr or Inline::C, drop oneshot
The cost of supporting separate code paths between oneshot and daemon isn't worth the trouble; especially if there are more users to support. The test suite time nearly doubles with oneshot, so that's hurting developer productivity. FD passing is currently required to work efficiently with remote HTTP(S) queries which return large messages, as seen in commit 708b182a57373172f5523f3dc297659d58e03b58 ("ipc: wq: handle >MAX_ARG_STRLEN && <EMSGSIZE case"). Additionally, upcoming support for IMAP IDLE and inotify-based monitoring of Maildirs cannot work properly without a background daemon.
2021-05-05 script/public-inbox-extindex: chmod +x
Everything else that's intended to be executable at some point has the executable bit set. Remove an inaccurate comment while we're at it.
2021-05-01 lei edit-search: support relocating lei.q.output
The contents of the old lei.q.output will not be removed, but will be converted into the new one.
2021-04-28 lei: simple WQ workers use {wq1} field
This lets us share more code and reduces cognitive overhead when it comes to picking names (because {lsss} was ridiculous). We'll need to ensure the first error set in lei is the actual error we exit with, otherwise things can get confusing and errors may get lost.
2021-04-22 lei: XDG_RUNTIME_DIR=/dev/null disables daemon mode
We'll support this mode of operation for now to quiet down testing of oneshot mode where the daemon doesn't persist.
2021-04-17 lei q: fix MUA spawn after reading query from stdin
Since "lei q" may read queries from stdin, we must reconnect a known terminal before spawning terminal MUAs. Attempt to use stdout as stdin for this purpose, since terminal MUAs tend to expect stdout to be a terminal. Reported-By: Kyle Meyer <kyle@kyleam.com> Link: https://public-inbox.org/meta/87v98klxg3.fsf@kyleam.com/
2021-04-05 script/lei: waitpid for git-credential and pager
We need to ensure we reap things we spawn.
2021-04-02 lei: fix git-credential handling
I completely forgot about git-credential prompting when making lei background the client process for MUA. Now it backgrounds itself only for the MUA when no FDs are passed, since the MUA is the final command run. Otherwise, it relies on FD passing as before. Fixes: c790a75439f3a1db ("script/lei: background ourselves on MUA/pager exec")
2021-04-01 script/lei: background ourselves on MUA/pager exec
This ought to give the MUA or pager exclusive access to the controlling terminal. The downside is we can only exec the pager or MUA once per invocation, but I can't imagine a valid case for running those things multiple times, either. Note: I'm no expert when it comes to terminal control matters, but this allows a Ctrl-Z-ed mutt instance to come back and is a nice code reduction, as well.
2021-03-28 treewide: shorten temporary filename
File::Temp only requires four 'X' characters (unlike mkstemp(3), which requires six). So only give it 4 to avoid an 80-column violation and maybe save metadata space on FSes.
2021-02-08 lei: drop BSD::Resource usage
It's no longer necessary with the changes to stop doing FD passing in our backend. cf. commits 5180ed0a1cd65139 ("lei q: eliminate $not_done temporary git dir hack") and 7d440bf3667b8ef5 ("lei q: reorder internals to reduce FD passing")
2021-02-08 lei q: SIGWINCH process group with the terminal
While using utime on the destination Maildir is enough for mutt to eventually notice new mail, "eventually" isn't good enough. Send a SIGWINCH to wake mutt (and likely other MUAs) immediately. This is more portable than relying on MUAs to support inotify or EVFILT_VNODE.
2021-02-07 script/lei: avoid waitpid(-1, ...) to keep tests fast
We only spawn one process to be reaped at the moment. Tests will run the contents of script/* in the same process if possible, so any test scripts which spawn -httpd or other read-only daemons can cause us to stall with waitpid(-1, ...).
2021-02-07 init: lowercase -j for --jobs
This is taken from common implementations of make(1) and only affected people using the command-line help output.
2021-02-04 lei: use sleep(1) loop for infinite sleep
Perl may internally race and miss signals due to a lack of self-pipe / eventfd / signalfd / EVFILT_SIGNAL usage. While our event loop paths avoid these problems by using signalfd or EVFILT_SIGNAL, these sleep() calls are not within the event loop.
2021-02-01 lei: avoid ETOOMANYREFS, cleanup imports
As with PublicInbox::IPC, we'll attempt to bump RLIMIT_NOFILE and transparently workaround ETOOMANYREFS. If that fails, we'll give the user a hint to bump RLIMIT_NOFILE since ETOOMANYREFS is an uncommon error which users may be unfamiliar with. Found while stress testing for segfaults.
2021-02-01 lei: increase initial timeout
PublicInbox::Listener unconditionally sets O_NONBLOCK upon accept(), so we need a larger timeout under heavy load since there's no "dataready" accept filter on the listener. With O_NONBLOCK already set, we don't have to set it at ->event_step_init.
2021-01-24 ipc: get rid of wq_set_recv_modes
Just open every FD as read/write. Perl (or any non-broken runtime) won't care and won't attempt to use F_SETFL to alter file description flags; as attempting to change those would lead to unpleasant side effects if the file description is shared with another process.
2021-01-23 lei: support remote externals
Via curl(1), since that lets us easily use tor on a per-connection basis via LD_PRELOAD (torsocks) or proxy. We'll eventually support more curl options which can allow users to get past firewalls and deal with other odd network configurations.
2021-01-22 lei: remove INT/QUIT/TERM handlers, fix daemon EOF
The signal handlers on the client side were unnecessary; all we need is to handle socket EOF properly in the daemon by killing xsearch and l2m workers.