about summary refs log tree commit homepage
path: root/t
DateCommit message (Collapse)
2021-08-11treewide: use *nix-specific dirname regexps
None of our code elsewhere accounts for non-*nix pathnames and it's not worth our time to start. So stop wasting CPU cycles giving the illusion that we'd care about non-*nix pathnames.
2021-08-08tests: fix test failures when Xapian is missing
We still support usage without Xapian, so ensure our tests work when Xapian bindings are missing
2021-08-08httpd: set psgi.url_scheme to 'https' for TLS listeners
For users using the native TLS functionality of -httpd (instead of using nginx + Plack::Middleware::ReverseProxy), psgi.url_scheme=http was wrong and would lead to improper redirects.
2021-08-04extindex: fix boost with partial runs
Boost relies on knowledge of all inboxes in a given config file to work properly. So while we support indexing a subset of inboxes, we must still account for boost in inboxes we're not indexing. So split internal inbox groups into "known" and "active", where previously we only cared for inboxes which were being actively indexed. Furthermore, boost checks need to be applied when a message arrives in different inboxes across multiple invocations. Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Link: https://public-inbox.org/meta/20210802204058.vscbxs5q7xyolyu2@nitro.local/
2021-07-31extindex: -xcpdb and -compact support
Since extindex uses Xapian shards in a similar way to v2 inboxes, we'll support -xcpdb (reshard+upgrade) and -compact all the same to give admins tuning+upgrade options.
2021-07-25extindex: support --jobs/-j properly on creation for shard count
This wasn't wired up properly, but Xapian appears to suffer from I/O amplification problems as DB shards get larger: https://lists.xapian.org/pipermail/xapian-discuss/2019-February/009727.html <23640.32170.703368.841021@y.dockes.com> Of course, we shouldn't have too many shards, either; because performance problems with too many shards was the entire reason extindex was created: https://lists.xapian.org/pipermail/xapian-discuss/2020-August/009823.html <20200826064728.GA32239@dcvr>
2021-07-25t/lei-watch.t: improve test reliability
On single CPU (and overloaded SMP) systems, we can't rely on inotify in lei-daemon firing before a "lei note-event done" client hits it. So force in a single tick() to ensure the scheduler can yield to lei-daemon and see the inotify wakeup before "lei note-event done" to commit the write.
2021-07-25lei_mail_sync: locations_for API uses oidbin for comparisons
Favor oidbin use internally to reduce internal memory traffic.
2021-07-25lei rm-watch: new command to support removing watches
Pretty trivial since it just invokes "git-config". It's mainly intended to make shell completion easier.
2021-07-25t/lei*: check error messages on failures
I just hit an unreproducible failure in t/lei-p2q.t and lacked $lei_err information to diagnose it. Hopefully this helps track down odd failures in the future.
2021-07-22t/solver_git: use like() to improve error reporting
I hit a test failure here, but haven't been able to reproduce it...
2021-07-22lei: auto-refresh watches in config, cancel missing
This makes behavior less surprising on restarts as we no longer lose state on restarts, so there's no need to manually run "lei add-watch" to re-enable watches. This also allows us to transparently handle changes if somebody edits the lei config file directly or via git-config(1).
2021-07-22lei: start implementing inotify Maildir support
This allows lei to automatically note keyword (message flag) changes made to a Maildir and propagate it into lei/store: lei add-watch --state=tag-ro /path/to/Maildir This doesn't persist across restarts, yet. In the future, it will be applied automatically to "lei q" output Maildirs by default (with an option to disable it). State values of tag-rw, index-<ro|rw>, import-<ro|rw> will all be supported for Maildir. This represents a fairly major internal change that's fairly intrusive, but the whole daemon-oriented design was to facilitate being able to automatically monitor (and propagate) Maildir/IMAP flag changes.
2021-07-22init: allow arbitrary key-values via -c KEY=VALUE
This won't blindly append identical key=values, but allows specifying multiple, different key=value pairs as long as the values are different.
2021-07-22extsearch: support publicinbox.*.boost parameter
This behaves identically the lei external "boost" parameter in prioritizing raw messages for extindex. Relying exclusively on the config file order doesn't work well for mirrors since it's impossible to guarantee config file ordering via grokmirror hooks. Config file ordering remains the default if boost is unconfigured, or in case of ties. Note: I chose the name "boost" rather than "priority" or "rank" since I always get confused by whether higher or lower numbers take precedence when it comes to kernel scheduling. "weight" is also a part of Xapian API terminology, which we currently do not expose to configuration (but may in the future).
2021-07-20httpd: fix SIGHUP by invalidating cache on reload
Since we require separate PublicInbox::HTTPD instances for each listen socket address (in order to support {SERVER_<NAME|PORT>} for PSGI env), the old cache needed to be invalidated on rare app refreshes. SIGHUP has always been broken in -httpd (but not -imapd or -nntpd) due to this cache. Update the daemon documentation and 5.10.1-ize some bits while we're in the area.
2021-07-06extindex: implement --dedupe to fix old extindices
This is intended to fix older indices that had deduplication bugs for matching content. It'll also make dealing with future changes to ContentHash easier since that's never guaranteed stable. It also supports --dry-run to print changes only without making them.
2021-06-29www: fix manifest.js.gz for default publicInbox.grokManifest
ManifestJsGz->response was not invoking the new "url_filter" method properly. Furthermore, fix url_filter for returning 404 responses. Reported-by: Kyle Meyer <kyle@kyleam.com> Link: https://public-inbox.org/meta/87fsx3128a.fsf@kyleam.com/ Fixes: 520be116e8a686cb ("www_listing: start updating for pagination + search")
2021-06-23www: do not warn on blank query parameters
Sometimes users (or bots) may lead queries with '&' and trigger uninitialized variable warnings, just ignore them and give consumers a $ctx->{qp}->{''} entry. While we're in the area, pass a regexp rather than scalar string to the `split' perlop to prevent Perl from recompiling the regexp on every call.
2021-06-23search: make xap_terms easier-to-use and use it more
This allows us to simplify callers throughout, and exceptions are can no longer be silently hidden. MiscSearch now uses xap_terms for looking up eidx_key terms for a code reduction. We also simplify LeiStore->_msg_kw for runtime use by moving the MsetIterator handling into t/lei_store.t test case.
2021-06-22t/hl_mod: accept "make" or "makefile" for Makefile
Version 4.0 of highlight has renamed the "make" language to "makefile". So just check the string starts with "make", to handle both 3.x and 4.x. I tested that public-inbox does actually work with highlight 4 -- it can highlight my Makefile fine. :)
2021-06-18t/sigfd: add diagnostic for occasional FreeBSD failure
Not 100% sure what's going on, here...
2021-06-14lei index+import: reject keywords from R/O IMAP
Since users can't set IMAP flags in read-only IMAP folders, we won't clobber local flags when importing from IMAP. This also enables the local_blob fallback used for lei-index to be used for index deduplication.
2021-06-14lei_input: allow keywords when importing 1 file from Maildir
This will eventually be useful for supporting inotify watches on Maildir. It will also allow users to script their own FS watchers more easily.
2021-06-13t/lei-import-http: quiet unnecessary diag message
Leftover while writing the test.
2021-06-12lei ls-mail-source: list IMAP folders and NNTP groups
While other tools can provide the same functionality, having integration with git-credential is convenient, here. Caching and completion will be implemented separately.
2021-06-09lei edit-search: fix and add a (weak) test
This broke recently and lacked an automated test, so rely on EDITOR=cat to ensure we have some coverage. Fixes: d2670108f71b1eff ("pkt_op: make pkt_do an OO method")
2021-06-08lei import: speed up repeated Maildir imports
On a 4-core CPU, this speeds up "lei import" on a largish Maildir inbox with 75K messages from ~8 minutes down to ~40s. Parallelizing alone did not bring any improvement and may even hurt performance slightly, depending on CPU availability. However, creating the index on the "fid" and "name" columns in blob2name yields us the same speedup we got. Parallelizing IMAP makes more sense due to the fact most IMAP stores are non-local and subject to network latency. Followup-to: bdecd7ed8e0dcf0b45491b947cd737ba8cfe38a3 ("lei import: speed up kw updates for old IMAP messages")
2021-05-30lei: support implicit stdin by default
This adds implicit stdin suppport for p2q and lcat, while rm and rediff no longer need explicit support for it.
2021-05-30lei lcat: allow IMAP folder URLs w/o UIDVALIDITY
Requiring UIDVALIDITY on the command-line is of course unreasonable.
2021-05-30lei import|lcat: improve+fix single message IMAP support
lcat can now dump the memoized contents of entire IMAP folders, not just a single UID. It's now parallelized and pipelined for multiple lei2mail workers. Furthemore, various forms of JSON output work consistently with blob-only output, now. While working on this, I noticed NetReader was passing UID URLs to imap_each callbacks, which was causing mail_sync.sqlite3 to store UIDs in `folders' and clearly wrong so it's now fixed.
2021-05-28lei q|up: support v2:/path/to/inboxdir destination
This allows "lei-managed pseudo mailing lists" as described by Konstantin. Alternates use is optional and can be enables via --shared. This doesn't manage or edit ~/.public-inbox/config; presumably there'll need to be some tweaking of search parameters before finalizing and making the inbox publicly accessible via HTTP/NNTP. Link: https://public-inbox.org/meta/20210426164454.5zd5kgugfhfwfkpo@nitro.local/T/
2021-05-28t/lei-*: better diagnostics for occasional failures
Some of these have been failing occasionally, not sure how, yet...
2021-05-28lei: handle a single IMAP message in most places
"lei import" can now import a single IMAP message via <imaps://example.com/MAILBOX/;UID=$UID> Likewise, "lei inspect" can show the blob information for UID URLs and "lei lcat" can display the blob without network access if imported. "lei lcat" also gets rid of some unused code and supports "blob:$OIDHEX" syntax as described in the comments (and used by our "text" output format). v2: enforce UID in URL, fail without v3: fix error reporting (s/fail/child_error/)
2021-05-27lei rm: new command to remove messages from index
This is similar to "public-inbox-learn rm", but it's possible to point an entire Maildir/IMAP/mbox*/newsgroup at it.
2021-05-26lei: require Socket::MsgHdr or Inline::C, drop oneshot
The cost of supporting separate code paths between oneshot and daemon isn't worth the trouble; especially if there are more users to support. The test suite time nearly doubles with oneshot, so that's hurting developer productivity. FD passing is currently required to work efficiently with remote HTTP(S) queries which return large messages, as seen in commit 708b182a57373172f5523f3dc297659d58e03b58 ("ipc: wq: handle >MAX_ARG_STRLEN && <EMSGSIZE case"). Additionally, upcoming support for IMAP IDLE and inotify-based monitoring of Maildirs cannot work properly without a background daemon.
2021-05-25ipc: wq: handle >MAX_ARG_STRLEN && <EMSGSIZE case
WQWorkers are limited roughly to MAX_ARG_STRLEN (the kernel limit of argv + environ) to avoid excessive memory growth. Occasionally, we need to send larger messages via workqueues that are too small to hit EMSGSIZE on the sender. This fixes "lei q" when using HTTP(S) externals, since that code path sends large Eml objects from lei_xsearch workers directly to lei2mail WQ workers.
2021-05-25lei forget-mail-sync: new command to drop sync information
Sometimes a user stops caring to sync an IMAP or Maildir folder, or wants to force a resync. Let them run this command to have lei forget all the sync information about the mail folder. This won't delete any stored messages in git, but will leave "lei index" users with dangling references.
2021-05-24lei inspect: use LeiMailSync->match_imap_url
Move match_imap_url into LeiMailSync so it can be used in more places, such as "lei inspect". Upcoming commands such as "lei forget-mail-sync" and {add,forget,pause,resume}-watch will also support relaxed IMAP matching rules since there's no reasonable way to expect users use ";UIDVALIDITY=" on the command-line.
2021-05-23lei <q|up>: set \Recent on non-empty mbox and Maildir
Despite JMAP not supporting the equivalent of the IMAP \Recent flag, it is useful for "lei q --augment", and "lei up" users to be able to distinguish new results from old-but-unread messages in an mbox or Maildir. For mbox family messages, we'll drop the "O" status flag when appending to mboxes, and we'll write to the "new" subdirectory of Maildirs. Behavior when writing to initially empty Maildirs and mboxes remains unchanged since there's no need to distinguish between new and old results in the initial case. Having users wait for a rename(2) storm or complete mbox rewrite hurts UX. With IMAP mailboxes, \Recent is already enforced by the IMAP server and IMAP clients have no way of changing it(*) (*) mutt uses the "Old" IMAP flag which isn't part of RFC 3501, other MUAs may do similar things.
2021-05-23lei import: store IMAP user+auth in mail_sync folder URI
Just having UIDVALIDITY in the URI isn't enough, since a single lei user may have multiple IMAP logins on the same server. This leads to compatibility problems and forces a reimport for the few users already using this lei functionality, but it's not stable nor released, yet.
2021-05-23uri_imap: support uid/auth/user as full accessors
We will need this for mail synchronization
2021-05-23lei export-kw: new command to export keywords to Maildirs
IMAP will eventually be supported.
2021-05-23lei tag: support tagging index-only messages
This will make some of our tests faster and allow users to try more features of lei without high storage requirements.
2021-05-23treewide: favor open(..., '+<&=', $fd)
Cut down on unnecessary imports of IO::Handle and method lookup + dispatch overhead.
2021-05-19lei: relax rules for "new" in Maildir
mbsync and offlineimap both use ":2," suffixes for filenames in "new/", however my interpretation of the Maildir spec at <https://cr.yp.to/proto/maildir.html> is that ":2," is only for files in "cur/". My interpretation also matches that of doveecot, but we'll allow what mbsync and offlineimap do given their popularity.
2021-05-15dir_idle: support IN_DELETE_SELF|IN_MOVE_SELF, too
We'll treat IN_MOVE_SELF as IN_DELETE_SELF since there doesn't seem to be a reliable way to distinguish them with FakeInotify, nor know the new name with kevent.
2021-05-15dir_idle: detect files which are gone
lei now makes use of this to clean up after unlinked sockets with less delay. This will also be used to maintain mail_sync.sqlite3.
2021-05-09git: fix numerous bugs in git_quote and git_unquote
git always quotes with leading zeros to ensure the octal representation is 3 characters long. We enforce that to match low ASCII characters (e.g. [x01-\x06]) that don't need the range provided by 3 characters. git_unquote now does a single pass so it won't get fooled by decoded backslashes into parsing a digit as an octal character. git_unquote is also capped to "\377" so we don't overflow a byte.
2021-05-06t/sigfd: use PublicInbox::DS::block_signals
We already use PublicInbox::DS in this test and I've always found the terminology of sig* APIs confusing :x