path: root/lib
Date Commit message
2021-04-13 lei q: start wiring up saved search
This will have an over.sqlite3 for content-based deduplication. It may exhibit ibxish methods, so serving a read-only (or even R/W) IMAP instance or displaying HTML isn't outside the realm of possibility.
2021-04-13 lei_query: rearrange internals to capture query early
To support saved search, we need the query string available to us before we set up LeiDedupe via (LeiOverview || LeiToMail).
2021-04-13 lei_dedupe: adjust to prepare for saved searches
LeiSavedSearch will use a LeiDedupe-like internal API, so we won't have to make as many changes to callsites between saved and unsaved searches.
2021-04-13 lei_xsearch: use per-external queries when not sorting
We only need the combined mset query when we care about sort order. When writing to --output destinations intended for MUA consumption, sort order is irrelevant as MUAs are expected to offer their own sorting, so run queries to each external in parallel. This prepares us for docid-sort-based saved search support. It will also become faster than the combined mset query for users with many externals, because current Xapian performs poorly with many shards (the same reason -extindex exists).
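As a rough illustration of the trade-off described above (this is not lei's Perl implementation), the Python sketch below fans a query out to every external in parallel and hands back hits as each external finishes when order is irrelevant, and only collects and sorts everything when order matters. `EXTERNALS`, `query_external`, `search_unsorted`, and `search_sorted` are hypothetical names with in-memory stand-in data.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical stand-in data: external name -> list of (timestamp, subject)
EXTERNALS = {
    "ext-a": [(3, "a3"), (1, "a1")],
    "ext-b": [(2, "b2")],
}

def query_external(name, query):
    """Hypothetical per-external search: substring match on the subject."""
    return [hit for hit in EXTERNALS[name] if query in hit[1]]

def search_unsorted(query):
    """Yield hits as soon as any external finishes; overall order is arbitrary."""
    with ThreadPoolExecutor(max_workers=len(EXTERNALS)) as pool:
        futures = [pool.submit(query_external, n, query) for n in EXTERNALS]
        for fut in as_completed(futures):
            yield from fut.result()

def search_sorted(query):
    """Sorting needs the complete result set, so wait for every external first."""
    return sorted(search_unsorted(query), reverse=True)
```

The unsorted path can start writing output before the slowest external has answered, which is the property that matters when the MUA does its own sorting.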
2021-04-13 lei blob: quiet "git rev-parse --git-dir" stderr w/o --cwd
This seemed to be causing occasional "make check-run" failures with errors bleeding into other tests.
2021-04-11 www: do not obfuscate addresses in URLs
As they are likely Message-IDs. If an email address ends up in a URL, then it's likely public, so there's even less reason to obfuscate that particular address.
[km: add xt/perf-obfuscate.t]
[ew: modernize perf test (5.10.1), use diag instead of print]
This version of the patch avoids the massive slowdown noted by Kyle in <https://public-inbox.org/meta/87wnt9or6t.fsf@kyleam.com/>. Performance remains roughly the same, if not slightly faster (which may be due to me testing this on a busy server).
Results from xt/perf-obfuscate.t against 6078 messages on a local mirror of <https://public-inbox.org/meta/>:
before: 6.67 usr + 0.04 sys = 6.71 CPU
after: 6.64 usr + 0.04 sys = 6.68 CPU
Reported-by: Kyle Meyer <kyle@kyleam.com>
Helped-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87a6q8p5qa.fsf@kyleam.com/
2021-04-07 import: convert init.defaultBranch to fully qualified ref
init.defaultBranch expects a branch name, not a fully qualified ref. git-init prepends "refs/heads/" automatically and unconditionally. PublicInbox::Import::default_branch, however, incorrectly passes on the init.defaultBranch value as is, leading to it being used in spots where a fully qualified ref is required. For example, with an init.defaultBranch value of "master", public-inbox-index for a v2 repository would lead to an all.git repository where HEAD's content is "ref: master" instead of "ref: refs/heads/master". Prepend "refs/heads/" to the incoming init.defaultBranch value. Fixes: 7c2f36de2fb49dd7 (import: respect init.defaultBranch)
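The fix described above amounts to a one-line prefix. A minimal Python sketch of the idea follows; the function names are hypothetical and the real change lives in the Perl sub PublicInbox::Import::default_branch.

```python
def qualify_branch(name: str) -> str:
    """Turn a branch name, as stored in init.defaultBranch, into a full ref.

    Mirrors git-init, which prepends "refs/heads/" unconditionally.
    """
    return "refs/heads/" + name

def head_content(default_branch: str) -> str:
    """Content of a symbolic HEAD pointing at the default branch."""
    return "ref: " + qualify_branch(default_branch)
```

With an init.defaultBranch value of "master", `head_content("master")` gives "ref: refs/heads/master" rather than the broken "ref: master".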
2021-04-07 lei_store: use getpwuid and hostname for ident
It's nicer in case a user transfers lei/store across machines and wants a way to track when/where they imported something.
2021-04-05 lei q: fix auth IMAP --output with remote mboxrd
IMAP authentication info is only shared amongst lei2mail workers, so we must ensure all IMAP writes go through lei2mail workers even if we don't have to access the mail through git. This allows us to decouple the latency of the remote mboxrd from the latency of the IMAP --output at the expense of extra IPC overhead within our own processes.
2021-04-05 lei_to_mail: improve comments and reduce LoC
We don't need to waste LoC on corner cases, single-use internal subs, or restoring SIG{__WARN__} when a process exits. All that extra code contributes to memory use and startup time, especially for users who can't use FD passing.
2021-04-05 lei: maildir: move shard support to MdirReader
We'll eventually want lei_input users like "lei import" and "lei tag" to support parallel reads.
2021-04-05 lei_tag: fix comments w.r.t support levels
RFC 8621 registers $flagged, $answered, $seen, $draft which map to IMAP, Maildir, and mbox Status/X-Status flags. $forwarded is noted in JMAP, but only Maildir and the "Lemonade" IMAP profile (RFC 5550) support it.
2021-04-05 lei_to_mail: trim down imports
We don't need to import so many things. None of the Errno constants are in common paths, so they are unlikely to benefit from constant folding.
2021-04-05 lei_search: ignore Resent-Message-ID for indexing
It currently conflicts with the way OverIdx and SearchIdx index messages, ultimately leading to violating a NOT NULL constraint on id2num.id in over.sqlite3. We may allow searching Resent-* fields separately, though I'm not sure how useful it'll be.
2021-04-03 lei/store: (more) synchronous non-fatal error output
Since every command that writes to lei/store calls ->done to commit its output, we can rely on that to return a pathname for a readable file with errors in it. Errors can still get crossed up if multiple lei commands are writing to the store at once, but this reduces the delay in seeing them and ensures they won't be seen when somebody is attempting to use shell completion.
2021-04-03 lei: improve handling of Message-ID-less draft messages
We need a stable fallback time for digest2mid in the presence of messages without Received/Date headers. Furthermore, we must avoid using uninitialized smsg->{mid} when parsing References for draft replies.
2021-04-03 lei tag: note message mismatches on failure
Just exiting with a failure code and no error message is confusing :x
2021-04-03 test_common: lei_ok: improve diagnostics
$? is useful, as is labeling lei_err since I'm easily-confused :x
2021-04-03 lei_store: update alternates on new epoch
We'll just let the ExtSearchIdx code handle this uncommon case by doing a full commit.
2021-04-03 lei: allow progress to non-TTY after MUA spawn
Sometimes I want to save debug info to a file or pipe even when spawning an MUA.
2021-04-03 lei q: don't show remote progress if MUA is running
Remote results can safely use the same mset progress reporting as local results, despite not knowing the size of the result set. We're assuming terminal MUAs, for now.
2021-04-03 net_reader: fix read-only "lei convert" auth failures
"convert" is actually a bit more complicated than "lei import" since it may need auth for either input or output.
2021-04-03 lei_auth: rename {net_merge} to {net_merge_continue}
No reason for the hash key to differ from the subroutine name, here.
2021-04-03 lei tag: fix tagging of IMAP inputs
We need net_merge_all and to lock the number of worker jobs. Parallel inputs are not supported yet (are they needed? I don't expect this to be used for multiple files very often...).
2021-04-03 lei q: ensure wq workers shutdown on IMAP auth failures
Leaving workers running after auth failures is bad and messy; clean up our process management to have consistent worker teardowns. Improve error reporting, too, instead of letting Mail::IMAPClient->exists fail due to undef.
2021-04-03 URInntps: add URI 5.08 release note
I wanted to say 2031, but that's probably too aggressive a removal timeline.
2021-04-02 lei: fix git-credential handling
I completely forgot about git-credential prompting when making lei background the client process for MUA. Now it backgrounds itself only for the MUA when no FDs are passed, since the MUA is the final command run. Otherwise, it relies on FD passing as before. Fixes: c790a75439f3a1db ("script/lei: background ourselves on MUA/pager exec")
2021-04-01 lei: maildir: handle "forwarded" keyword as "P"
mbox and IMAP seem to have no way of describing this keyword, but Maildir does with the "P" flag (for "passed").
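For readers unfamiliar with Maildir flags, the Python sketch below shows one way the single-letter flags in a Maildir filename's ":2," info suffix could be mapped to keywords, including "P" for "forwarded". The flag letters are the standard Maildir ones; the keyword spellings, the table name, and the function name are assumptions for illustration and are not taken from lei's Perl source.

```python
# Standard Maildir info flags mapped to assumed keyword names.
# "T" (trashed) is deliberately left out of this sketch.
MAILDIR_FLAG_TO_KW = {
    "D": "draft",
    "F": "flagged",
    "P": "forwarded",  # "passed": resent, forwarded, or bounced
    "R": "answered",
    "S": "seen",
}

def maildir_keywords(filename: str) -> list:
    """Return sorted keywords for the flags after the ":2," marker, if any."""
    _, sep, flags = filename.rpartition(":2,")
    if not sep:  # no info suffix, e.g. a file still sitting in new/
        return []
    return sorted(MAILDIR_FLAG_TO_KW[f] for f in flags if f in MAILDIR_FLAG_TO_KW)
```

For example, a name ending in ":2,PS" maps to the keywords "forwarded" and "seen".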
2021-04-01 lei_store: quiet down per-message related warnings
It's needless noise when doing augment and output preparation and shows up way too late and out-of-band with lei-daemon.
2021-04-01 lei_store: quiet down git user info being unset
lei_store contents aren't intended to become public, so there's no point in nagging users for their email address for git committer information like git does.
2021-04-01 lei_store: set_xvmd: don't add if no vmd at all
There's no point in adding vmd information for an external message if it was never stored and there's no vmd at all. We also don't need to check _docids_for for similar messages, either, since we always check lse->kw_changed, first.
2021-04-01 lei q: reduce lei/store work for kw changes to stored mail
We can tweak lse->kw_changed to return docids and reduce IPC traffic and reduce work the lei/store worker needs to do.
2021-04-01 lei_query: remove unnecessary V2Writable require
AFAIK that was only used for nproc detection, and nproc is handled by PublicInbox::IPC, nowadays.
2021-04-01 lei sucks: sub-command to aid bug reporting
It's a bit of an Easter egg, though it's not possible to hide those in Free Software... Anyways, it doesn't cost us an entry in %CMD of LEI.pm and anybody frustrated enough with lei just might type "lei sucks" on the command-line :>
2021-03-31 lei_input: reduce IPC traffic with multiple inputs
No point in sending a command for every input when a single one will do. We'll also trigger LeiStore->done sooner in the worker rather than later.
2021-03-31 lei blob: "--mail" disables solver, use --include/only
Assume a user specifying --mail doesn't want to spend cycles reconstructing a blob from a code repo. Also, don't require users to use add-external or a previous -I or --only to ready an external for use with ale.git.
2021-03-31 doc: add lei-mail-formats(5) manpage
While plenty of online documentation exists, it's good to have a locally-available summary for users to look at offline. Fix a URL in Watch.pm while we're at it, too.
2021-03-31 lei: fix IMAP auth failure handling
We must use the $ops hashref returned by lei->workers_start, since it's modified to include extra handlers for auth failures and whatnot. Fixes: 954581b8e575966a ("lei: simplify PktOp callers")
2021-03-30 lei tag: rename from "lei mark"
I've decided "tag" is a better verb since it seems to be a more widely-used term for associating metadata with data. Not only is it analogous to the "notmuch tag" command, but it also makes sense when compared to tooling for manipulating metadata for non-mail data (e.g. audio metadata tags). There's even a Wikipedia entry for it: https://en.wikipedia.org/wiki/Tag_(metadata) whereas "mark" is used in the description, but has no entry of its own with regards to metadata.
2021-03-30 lei q: avoid redundant default setting for sort with l2m
No point in munging user-supplied $lei->{opt} when %mset_opt exists. We'll be depending on docid being in descending order for saved search support.
2021-03-30 lei_to_mail: update some comments and style
Note that update_kw_maybe is critical in preventing accidental data loss with default "lei q --output" behavior. Also avoid treating (proposed) MH support as lock-free, since it appears to lack specifications for locking and to be even worse than mbox* in that regard...
2021-03-30 git: local_nick: handle trailing or redundant '/' in git_dir
Some cgit configs use trailing slashes in pathnames which we preserve internally. Before this change, trailing slashes in cgit config files were causing ViewVCS (SolverGit) output to show up as "???" for coderepos without cgitUrl configured.
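To make the failure mode concrete, here is a hedged Python sketch of deriving a short display name from a git directory path while tolerating trailing or doubled slashes. It only borrows the name `local_nick` and the "???" fallback from the commit message; the actual Perl logic may differ in its details.

```python
import re

def local_nick(git_dir: str) -> str:
    """Return the last path component of git_dir, or "???" if none is found."""
    # Collapse runs of '/' and drop trailing ones before splitting
    path = re.sub(r"/+", "/", git_dir).rstrip("/")
    name = path.rsplit("/", 1)[-1]
    return name or "???"
```

Without the normalization step, a path such as "/srv/git/foo.git/" splits into an empty last component, which is how the "???" placeholder can end up being shown.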
2021-03-29 lei_input: support compressed mboxes
Since "lei q" and "lei convert" already support writing these compressed inboxes, it makes sense that all mbox readers support them, as well. Using compression is one reliable way to know an mboxrd or mboxo hasn't been unexpectedly truncated.
2021-03-29 lei blob: cleanup solver tmpdir on failure
$lei->fail sends SIGTERM which prevents the File::Temp::Dir in $solver->{tmp} from being cleaned up, so use $lei->child_error instead.
2021-03-29 lei_input: treat ".eml" and ".patch" suffix as "eml"
".eml" is a suffix supported by (/usr/local)/etc/mime.types on Debian and FreeBSD systems using the "mime-support" package. ".patch" is what "git format-patch" generates by default since git v1.5.0 in 2007.
2021-03-29 lei: use IO::Uncompress::Gunzip MultiStream
This is compatible with default gunzip(1) behavior and future-proofs us against potential changes in PublicInbox::WWW to save memory on public-inbox-httpd instances.
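"MultiStream" here refers to reading several gzip members concatenated into one file as a single stream, which is what gunzip(1) does by default. The Python sketch below demonstrates the same behavior with the standard `gzip` module, whose `decompress` function handles multi-member data. It is an analogy only, not the Perl IO::Uncompress::Gunzip call used by lei, and the helper name and sample messages are made up.

```python
import gzip

def gunzip_multistream(blob: bytes) -> bytes:
    """Decompress every concatenated gzip member in blob, like gunzip(1)."""
    return gzip.decompress(blob)

# Two independently compressed chunks, concatenated as a server might emit them
first = gzip.compress(b"From a@example Thu Jan  1 00:00:00 1970\n\nfirst\n\n")
second = gzip.compress(b"From b@example Thu Jan  1 00:00:00 1970\n\nsecond\n\n")
combined = gunzip_multistream(first + second)
```

A reader that stops after the first member would silently drop the second message, so handling every member matters if a server ever emits responses as multiple gzip members.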
2021-03-29 lei_input: avoid special case sub for --stdin
We can consistently open /dev/stdin correctly nowadays, so drop the input_stdin and just use the normal ->path_to_fd code path.
2021-03-28 treewide: shorten temporary filename
File::Temp only requires four 'X' characters (unlike mkstemp(3), which requires six). So only give it 4 to avoid an 80-column violation and maybe save metadata space on FSes.
2021-03-28 lei: drop coderepo placeholders, submodule TODO
"lei blob" supports --git-dir and -C, and checks if the current directory has a git directory associated with it. It will likely support submodules in the future. I'm inclined to believe declaring coderepos in a command-line tool is needless clutter and users will rarely want to search for blobs across different projects when on the command-line.
2021-03-28 lei blob: add remote external support
Introduce a new LeiRemote wrapper to provide an internal API which SolverGit expects. This lets us use HTTP/HTTPS endpoints to reconstruct blobs off patches as we would with local endpoints, just more slowly...