public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2021-09-25	lei ls-external: split into separate file
	This was written before we had auto-loading and is rarely used.
2021-09-25	lei add-external: split into separate file
	This was also written before we had auto-loading and is rarely used.
2021-09-25	lei forget-external: split into separate file
	This was written before we had auto-loading, and forget-external should be a rarely-used command that's not worth loading at startup. Do some golfing while we're in the area, too.
2021-09-25	doc: lei: manpages for export-kw and refresh-mail-sync
	Something is better than nothing.
2021-09-19	lei config --edit: use controlling terminal
	As with "lei edit-search", "lei config --edit" may spawn an interactive editor which works best from the terminal running script/lei. So implement LeiConfig as a superclass of LeiEditSearch so the two commands can share the same verification hooks and retry logic.
2021-09-19	xt: add fsck script over over.sqlite3
	I'm not sure what caused it, but I've noticed two missing messages that failed from "lei up" on an https:// external; and I've also seen some duplicates in the past (which I think I fixed...).
2021-09-17	doc: add lei-security(7) manpage
	It seems like a good idea to have a manpage where somebody can quickly look up and address their concerns as to what to put on an encrypted device/filesystem. And I probably would've designed lei around make(1) for parallelization if I didn't have to keep credentials off the FS :P
2021-09-17	lei refresh-mail-sync: replace prune-mail-sync
	Merely pruning mail synchronization information was insufficient for Maildir: renames are common in Maildir and we need to detect them after-the-fact when lei-daemon isn't running. Running this command could make "lei index" far more useful... v2: close R/O mail_sync.sqlite3 dbh before fork Keeping the DB file handle open across fork can cause bad things to happen even if we don't use it since sqlite3 itself still knows about it (but doesn't know Perl code doesn't know about it).
2021-09-15	multi_git: hoist out common epoch/alternates handling
	IMHO, this greatly improves code sharing and organization between v2, extindex, and lei/store. Common git-related logic for these is lightly-refactored and easier to reason about. The impetus for this big change was to ensure inboxes created+managed by public-inbox-{clone,fetch} could have alternates and configs set up properly without depending on SQLite (via V2Writable). This change does that while making old code shorter and better factored.
2021-09-12	new public-inbox-{clone,fetch} commands
	Setting up and maintaining git-only mirrors of v2 inboxes is complex since multiple commands are required to clone and fetch into epochs. Unlike grokmirror, these commands do not require any configuration. Instead, they rely on existing git config files and work like "git clone --mirror" and "git fetch", respectively. Like grokmirror, they use manifest.js.gz, but only on a per-inbox basis so users won't have to clone every inbox of a large instance nor edit config files to include/exclude inboxes they're interested in.
2021-09-10	doc: lei-index manpage
	It's a pretty incomplete command, so it's important to document its incompleteness.
2021-09-10	lei add-external --mirror: deduce paths for PSGI mount prefixes
	The current manifest.js.gz generation in WWW doesn't account for PSGI mount prefixes (and grokmirror 1.x appears to work fine). In other words, <https://yhbt.net/lore/lkml/manifest.js.gz> currently has keys like "/lkml/git/0.git" and not "/lore/lkml/git/0.git" where "/lore" is the PSGI mount prefix. This works fine with the prefix accounted for in my grokmirror (1.x) repos.conf like this:
	site = https://yhbt.net/lore/
	manifest = https://yhbt.net/lore/manifest.js.gz
	Adding the PSGI mount prefix in manifest.js.gz is probably not desirable since it would force the prefix into the locally cloned path by grokmirror, and all the cloned directories would have the remote PSGI mount prefix prepended to the toplevel.
	So, "lei add-external --mirror" needs to account for PSGI mount prefixes by deducing the prefix based on available keys in the manifest.js.gz hash table.
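
The prefix deduction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code lei uses: the function name `deduce_mount_prefix` is hypothetical, and it assumes every manifest key starts with the inbox path as seen without the mount prefix.

```python
from urllib.parse import urlsplit


def deduce_mount_prefix(inbox_url, manifest_keys):
    """Return the leading part of the URL path that the manifest keys omit.

    Hypothetical helper for illustration; returns "" when there is no
    mount prefix and None when no key matches the URL at all.
    """
    path = urlsplit(inbox_url).path.rstrip("/")  # e.g. "/lore/lkml"
    parts = path.split("/")[1:]                  # e.g. ["lore", "lkml"]
    # Drop 0, 1, 2, ... leading path components until some key matches.
    for i in range(len(parts)):
        candidate = "/" + "/".join(parts[i:]) + "/"
        if any(key.startswith(candidate) for key in manifest_keys):
            return "/" + "/".join(parts[:i]) if i else ""
    return None


keys = ["/lkml/git/0.git", "/lkml/git/1.git"]
prefix = deduce_mount_prefix("https://yhbt.net/lore/lkml/", keys)
print(prefix)
```

With the keys from the commit message, the full path "/lore/lkml/" matches nothing, while "/lkml/" matches, so the dropped component "/lore" is taken as the mount prefix.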
2021-09-09	lei-rm: add man page, support LeiInput args
	-F/--in-format and --lock=TYPE(S) are easily supported by all classes using LeiInput.
2021-09-03	lei up --all: avoid double-close on shared STDOUT
	This is merely to avoid perl setting errors internally which were not user visible. The double-close wasn't a problem in practice since we open a new file handle for the mbox or mbox.gz anyways, so the new t/lei-up.t test case shows no regressions nor fixes.
2021-09-02	lei: propagate keyword changes from lei/store
	This works with existing inotify/EVFILT_VNODE functionality to propagate changes made from one Maildir to another Maildir. I chose the lei/store worker process to handle this since propagating changes back into lei-daemon on a massive scale could lead to dead-locking while both processes are attempting to write to each other. Eliminating IPC overhead is a nice side effect, but could hurt performance if Maildirs are slow. The code for "lei export-kw" is significantly revamped to match the new code used in the "lei/store" daemon. It should be more correct w.r.t. corner-cases and stale entries, but perhaps better tests need to be written. squashed: t/lei-auto-watch: increase delay for FreeBSD kevent My FreeBSD VM seems to need longer for this test than inotify under Linux, likely because the kevent support code needs to be more complicated.
2021-08-25	lei up: improve --all=local stderr output
	The "# $NR written to $DEST ($total matches)" messages are arguably the most useful output of "lei up --all=local", but they get intermixed with progress messages from various workers. Queue up these finalization messages and only spit them out on ->DESTROY.
2021-08-04	lei: close inotify FD in forked child
	Linux::Inotify2 2.3+ includes an ->fh method to give us the ability to safely close an FD without hitting EBADF (and automatically use FD_CLOEXEC). We'll still need a new wrapper class (LI2Wrap) to handle it for users of old versions, though. Link: http://lists.schmorp.de/pipermail/perl/2021q3/thread.html
2021-07-25	lei rm-watch: new command to support removing watches
	Pretty trivial since it just invokes "git-config". It's mainly intended to make shell completion easier.
2021-07-22	lei: start implementing inotify Maildir support
	This allows lei to automatically note keyword (message flag) changes made to a Maildir and propagate it into lei/store:
	lei add-watch --state=tag-ro /path/to/Maildir
	This doesn't persist across restarts, yet. In the future, it will be applied automatically to "lei q" output Maildirs by default (with an option to disable it). State values of tag-rw, index-<ro|rw>, import-<ro|rw> will all be supported for Maildir.
	This represents a fairly major internal change that's fairly intrusive, but the whole daemon-oriented design was to facilitate being able to automatically monitor (and propagate) Maildir/IMAP flag changes.
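
For readers unfamiliar with how keywords are encoded in a Maildir, here is a minimal Python sketch. It relies on the standard Maildir convention of a ":2," suffix followed by single-letter flags. The keyword names in the mapping and the function name `maildir_keywords` are assumptions chosen for illustration and are not taken from lei's code.

```python
# Standard Maildir flag letters; the keyword names are illustrative assumptions.
MAILDIR_FLAG_TO_KEYWORD = {
    "D": "draft",
    "F": "flagged",
    "P": "forwarded",
    "R": "answered",
    "S": "seen",
}


def maildir_keywords(filename):
    """Return the set of keywords encoded after ":2," in a Maildir filename."""
    _base, sep, info = filename.rpartition(":2,")
    if not sep:
        return set()  # no info suffix (e.g. a file still in new/)
    return {MAILDIR_FLAG_TO_KEYWORD[c] for c in info if c in MAILDIR_FLAG_TO_KEYWORD}


print(sorted(maildir_keywords("1627000000.M1P2.host:2,RS")))
```

Because the flags live in the filename, a keyword change shows up as a rename, which is the kind of event inotify and EVFILT_VNODE can report.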
2021-06-20	scripts: add syscall-list tool for development
	We'll be supporting inotify directly as we do with epoll so Linux users won't have to deal with XS, extra DSOs or install Linux::Inotify2 (and common::sense) modules.
2021-06-12	lei ls-mail-source: list IMAP folders and NNTP groups
	While other tools can provide the same functionality, having integration with git-credential is convenient, here. Caching and completion will be implemented separately.
2021-06-09	lei prune-mail-sync: new command to prune invalid sync data
	This will be invoked automatically by "lei import" eventually, but it may make sense to expose as a separate command.
2021-06-08	lei import: speed up repeated Maildir imports
	On a 4-core CPU, this speeds up "lei import" on a largish Maildir inbox with 75K messages from ~8 minutes down to ~40s. Parallelizing alone did not bring any improvement and may even hurt performance slightly, depending on CPU availability. However, creating the index on the "fid" and "name" columns in blob2name yields us the same speedup we got. Parallelizing IMAP makes more sense due to the fact most IMAP stores are non-local and subject to network latency. Followup-to: bdecd7ed8e0dcf0b45491b947cd737ba8cfe38a3 ("lei import: speed up kw updates for old IMAP messages")
2021-06-03	lei import: speed up kw updates for old IMAP messages
	On a 4-core CPU, this speeds up "lei import" on a largish IMAP inbox with 75K messages from ~21 minutes down to 40s. Parallelizing with the new LeiImportKw WQ worker class gives a near-linear speedup and brought the runtime down to ~5:40. The new idx_fid_uid index on the "fid" and "uid" columns of blob2num in mail_sync.sqlite3 brought us the final speedup. An additional index on over.sqlite3#xref3(oidbin) did not help, since idx_nntp already exists and speeds up the new ->oidbin_exists internal API. I initially experimented with a separate "lei import-kw" command but decided against it since it's useless outside of IMAP+JMAP and would require extra cognitive overhead for both users and hackers. So LeiImportKw is just a WQ worker used by "lei import" and not its own user-visible command. v2: fix ikw_done_wait arg handling (ugh, confusing API :x)
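
The effect of an index such as idx_fid_uid can be illustrated with a small Python sketch using an in-memory SQLite database. The table layout below is a guess made for illustration and is not the actual mail_sync.sqlite3 schema. The helper `query_plan` is also hypothetical.

```python
import sqlite3


def query_plan(db, sql, params=()):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


db = sqlite3.connect(":memory:")
# Assumed columns for illustration only.
db.execute("CREATE TABLE blob2num (oidbin BLOB, fid INTEGER, uid INTEGER)")
db.executemany(
    "INSERT INTO blob2num VALUES (?, ?, ?)",
    ((uid.to_bytes(4, "big"), 1, uid) for uid in range(1000)),
)

lookup = "SELECT oidbin FROM blob2num WHERE fid = ? AND uid = ?"
plan_before = query_plan(db, lookup, (1, 42))  # full table scan

db.execute("CREATE INDEX idx_fid_uid ON blob2num(fid, uid)")
plan_after = query_plan(db, lookup, (1, 42))   # index search

print(plan_before)
print(plan_after)
```

Without the index, each per-message lookup scans the whole table, so the total cost grows roughly quadratically with the folder size. With the index, each lookup is a B-tree search.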
2021-05-27	lei rm: new command to remove messages from index
	This is similar to "public-inbox-learn rm", but it's possible to point an entire Maildir/IMAP/mbox*/newsgroup at it.
2021-05-25	lei forget-mail-sync: new command to drop sync information
	Sometimes a user stops caring to sync an IMAP or Maildir folder, or wants to force a resync. Let them run this command to have lei forget all the sync information about the mail folder. This won't delete any stored messages in git, but will leave "lei index" users with dangling references.
2021-05-23	lei export-kw: new command to export keywords to Maildirs
	IMAP will eventually be supported.
2021-05-17	doc lei: add manpages for new commands
	[ew: MANIFEST: s/lei-cat/lei-lcat/]
2021-05-17	doc lei: add manpage for convert
	Link: https://public-inbox.org/meta/20210429015738.GA30172@dcvr/
2021-05-05	lei rediff: capture and regenerate file modes
	Don't lose file mode information when regenerating a diff.
2021-05-05	lei rediff: regenerate diffs from stdin
	Sometimes a mailed patch is generated with non-ideal output (lacking context, noisy whitespace changes, etc.), or a user wants to use the same external diff viewer they've configured git to use. Since we have SolverGit to regenerate arbitrary blobs from patches, this new command allows us to regenerate a diff with different options using the blobs SolverGit gives us. The number of git-diff(1) options is mind-numbing, so it's likely I missed some favorites or botched the getopt spec translation. This also fixes Inbox::base_url to check psgi.url_scheme before attempting to generate URLs and avoid uninitialized variable warnings. Oddly, the "lei blob" tests did not trigger these uninitialized warnings. Note: this will automatically import+index the message(s) it's regenerating, because solver relies on being able to look up pre/postimage OIDs and read blobs.
2021-05-04	lei index: new command to index mail w/o git storage
	Since completely purging blobs from git is slow, users may wish to index messages in Maildirs (and eventually other local storage) without storing data in git. Much code from LeiImport and LeiInput is reused, and a new dummy FakeImport class supplies a non-storing $im->add and minimizes changes to LeiStore. The tricky part of this command is to support "lei import" after a message has gone through "lei index". Relying on $smsg->{bytes} == 0 (as we do for external-only vmd storage) does not work here, since it would break searching for "z:" byte-ranges when not using externals. This eventually required PublicInbox::Import::add to use a SharedKV to keep track of imported blobs and prevent duplication.
2021-05-01	lei: rename ls-sync to ls-mail-sync
	This allows tab-completion for "ls-search" to work with fewer characters ("ls-s<TAB>" instead of "ls-se<TAB>"), and I expect "ls-search" to be used more frequently than "ls-mail-sync". This also matches the --mail-sync switch of "lei import".
2021-05-01	xt/lei-onion-convert: test for NNTP+IMAP onions
	These tests require a running Tor instance (defaulting to 127.0.0.1:9050) and Internet connectivity, but otherwise work pretty well.
2021-04-30	net_reader: Net::NNTP --proxy=socks5h:// support
	Since Net::NNTP doesn't support Socket or RawSocket options/accessors like Mail::IMAPClient does, we must perform localized @ISA manipulation and massage Net::NNTP into using IO::Socket::Socks rather than IO::Socket::IP. This is a bit fragile, but Net::Cmd and Net::NNTP rarely change, and I keep an eye on them, anyways.
2021-04-27	lei q + lcat: support --format=text output
	This is mainly for "lei lcat" where it's the default, but I find it useful anyways compared to the JSON view. Colors are loaded from ~/.config/lei/config, and fall back to using diff colors from a normal git config (e.g. ~/.gitconfig).
2021-04-27	lei lcat: extract Message-IDs from URLs and show them
	It's a wrapper around "lei q" which extracts Message-IDs from URLs, "<$MSGID>", "id:$MSGID" and attempts to display the local version of the message. Its main purpose is to extract Message-IDs out of commonly-understood URLs to save users bandwidth and time by displaying the message locally. When reading from stdin, it will discard things it doesn't understand, so you can just pipe an entire "Link: $URL" line to it and it'll attempt to pluck the Message-ID out of the URL.
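
The extraction described above can be approximated in Python as follows. This is a rough sketch and not lei's implementation. The function name `extract_msgids` is hypothetical, and the rule that a URL path segment containing "@" is a Message-ID is a simplifying assumption.

```python
import re
from urllib.parse import unquote, urlsplit


def extract_msgids(text):
    """Pluck Message-IDs out of URLs, "<$MSGID>" and "id:$MSGID" tokens.

    Anything that is not understood is silently discarded.
    """
    found = []
    for token in text.split():
        bracketed = re.fullmatch(r"<([^<>\s]+@[^<>\s]+)>", token)
        if bracketed:
            found.append(bracketed.group(1))
        elif token.startswith("id:") and "@" in token:
            found.append(token[3:])
        elif token.startswith(("http://", "https://")):
            # Assumption: the first path segment containing "@" is the Message-ID.
            for segment in urlsplit(token).path.split("/"):
                segment = unquote(segment)
                if "@" in segment:
                    found.append(segment)
                    break
    return found


line = "Link: https://public-inbox.org/meta/20210429015738.GA30172@dcvr/"
print(extract_msgids(line))
```

The "Link:" token matches none of the three forms and is dropped, which mirrors the behavior of piping an entire "Link: $URL" line to the command.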
2021-04-27	lei: add "ls-sync" command for listing sync folders
	This will be useful, later.
2021-04-26	lei_input: support PublicInbox::WWW mboxrd URLs
	This gives "lei import", "lei tag", and similar commands the ability to use URLs recognized by our PSGI frontend directly. This is more convenient than an equivalent shell pipeline since "set -o pipefail" is not portable and errors may be lost.
2021-04-24	lei import: keep sync info for Maildir and IMAP folders
	We aren't using it, yet, but the plan is to be able to use this information to propagate keyword changes back to IMAP and Maildir folders using some to-be-implemented command. "lei inspect" is a half-baked new command to make testing this change easier. It will be updated to support more SQLite+Xapian introspection duties in the future, including public-inbox things independent of lei.
2021-04-24	lei_mail_sync: for bidirectional keyword sync
	We'll be using the new class to efficiently propagate keyword changes from lei/store back to Maildir or IMAP folders.
2021-04-21	doc: add lei_design_notes.txt and lei-store-format(5)
	lei itself is a somewhat weird design, but lei/store is a fairly normal hybrid of extindex with v2-style storage.
2021-04-20	lei edit-search: command to tweak search parameters
	This may be useful for users to tweak search parameters. This command is also the reason lei.saved-search is a git-config file rather than JSON.
2021-04-20	lei forget-search: new command to forget saved searches
	Readers may lose interest in subscription topics. This lets them avoid clutter by forgetting a saved search. This does not and will not destroy the contents of an --output mailbox. In other words, this is similar to unsubscribing from an Atom/RSS feed or NNTP group. I've also decided we won't support 'mv-search', since it'll probably be rarely used and "lei convert" can be used, instead.
2021-04-20	lei-sigpipe: update and move test from xt => t
	We have "lei import" and better test infrastructure for lei, now, so we can more easily test SIGPIPE without relying on an already-configured instance.
2021-04-18	lei ls-search: command to list saved searches
	Going forward, we'll probably support JSON for all the "ls-*" subcommands. This also provides the basis for "lei up" shell completion.
2021-04-13	lei: add "lei up" to complement "lei q --save"
	The command isn't finalized, yet, but it's intended to update an existing saved search.
2021-04-13	lei q: start wiring up saved search
	This will have an over.sqlite3 for content-based deduplication. It may exhibit ibxish methods, so serving a read-only (or even R/W) IMAP or instance or displaying HTML isn't outside the realm of possibility.
2021-04-11	www: do not obfuscate addresses in URLs
	As they are likely Message-IDs. If an email address ends up in a URL, then it's likely public, so there's even less reason to obfuscate that particular address.
	[km: add xt/perf-obfuscate.t]
	[ew: modernize perf test (5.10.1), use diag instead of print]
	This version of the patch avoids the massive slowdown noted by Kyle in <https://public-inbox.org/meta/87wnt9or6t.fsf@kyleam.com/>. Performance remains roughly the same, if not slightly faster (which may be due to me testing this on a busy server).
	Results from xt/perf-obfuscate.t against 6078 messages on a local mirror of <https://public-inbox.org/meta/>:
	before: 6.67 usr + 0.04 sys = 6.71 CPU
	after: 6.64 usr + 0.04 sys = 6.68 CPU
	Reported-by: Kyle Meyer <kyle@kyleam.com>
	Helped-by: Kyle Meyer <kyle@kyleam.com>
	Link: https://public-inbox.org/meta/87a6q8p5qa.fsf@kyleam.com/
2021-04-03	lei/store: (more) synchronous non-fatal error output
	Since every command that writes to lei/store calls ->done to commit its output, we can rely on that to return a pathname for a readable file with errors in it. Errors can still get crossed up if multiple lei commands are writing to the store at once, but this reduces the delay in seeing them and ensures they won't be seen when somebody is attempting to use shell completion.