Date | Commit message (Collapse) |
|
This will have an over.sqlite3 for content-based deduplication.
It may exhibit ibxish methods, so serving a read-only (or even
R/W) IMAP instance or displaying HTML isn't outside the realm
of possibility.
|
|
To support saved search, we need the query string available to
us before we set up LeiDedupe via (LeiOverview || LeiToMail).
|
|
LeiSavedSearch will use a LeiDedupe-like internal API,
so we won't have to make as many changes to callsites
between saved and unsaved searches.
|
|
We only need the combined mset query when we care about sort
order. When writing to --output destinations intended for MUA
consumption, sort order is irrelevant as MUAs are expected to
offer their own sorting, so run queries to each external in
parallel.
This prepares us for docid-sort-based saved search support.
It will also become faster than the combined mset query for
users with many externals due to current Xapian exhibiting poor
performance with many shards (the same reason -extindex exists).
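A minimal sketch of the idea, not the actual lei code: when the output is
unsorted anyway, each external can be queried independently and the hits
simply concatenated. The EXTERNALS table, query_external, and
query_all_parallel are hypothetical names invented for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical externals: name -> list of (message-id, subject) pairs.
EXTERNALS = {
    "meta": [("a@example", "lei saved search"), ("b@example", "imap auth")],
    "git": [("c@example", "saved search idea")],
}


def query_external(name, term):
    """Search a single external; stands in for a per-external query."""
    return [mid for mid, subject in EXTERNALS[name] if term in subject]


def query_all_parallel(term):
    """Run one query per external concurrently and concatenate the hits.

    There is no combined merge or sort step; ordering is left to the
    consumer (the MUA, in the scenario described above).
    """
    results = []
    with ThreadPoolExecutor(max_workers=len(EXTERNALS)) as pool:
        for hits in pool.map(lambda n: query_external(n, term), EXTERNALS):
            results.extend(hits)
    return results
```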
|
|
This seemed to be causing occasional "make check-run" failures
with errors bleeding into other tests.
|
|
As they are likely Message-IDs. If an email address ends up in
a URL, then it's likely public, so there's even less reason to
obfuscate that particular address.
[km: add xt/perf-obfuscate.t]
[ew: modernize perf test (5.10.1), use diag instead of print]
This version of the patch avoids the massive slowdown noted by Kyle in
<https://public-inbox.org/meta/87wnt9or6t.fsf@kyleam.com/>.
Performance remains roughly the same, if not slightly faster
(which may be due to me testing this on a busy server). Results
from xt/perf-obfuscate.t against 6078 messages on a local mirror
of <https://public-inbox.org/meta/>:
before: 6.67 usr + 0.04 sys = 6.71 CPU
after: 6.64 usr + 0.04 sys = 6.68 CPU
Reported-by: Kyle Meyer <kyle@kyleam.com>
Helped-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87a6q8p5qa.fsf@kyleam.com/
|
|
init.defaultBranch expects a branch name, not a fully qualified ref.
git-init prepends "refs/heads/" automatically and unconditionally.
PublicInbox::Import::default_branch, however, incorrectly passes on
the init.defaultBranch value as is, leading to it being used in spots
where a fully qualified ref is required. For example, with an
init.defaultBranch value of "master", public-inbox-index for a v2
repository would lead to an all.git repository where HEAD's content is
"ref: master" instead of "ref: refs/heads/master".
Prepend "refs/heads/" to the incoming init.defaultBranch value.
Fixes: 7c2f36de2fb49dd7 (import: respect init.defaultBranch)
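The fix amounts to the following, sketched here in Python rather than
the project's Perl. qualify_branch and head_content are hypothetical
helper names used only for illustration.

```python
def qualify_branch(name):
    """Turn an init.defaultBranch value into a fully qualified ref.

    The prefix is added unconditionally, matching what the message
    above says git-init does.
    """
    return "refs/heads/" + name


def head_content(default_branch):
    """Build the symbolic-ref line stored in a repository's HEAD."""
    return "ref: " + qualify_branch(default_branch)
```

With "master" as the configured value, head_content("master") yields
"ref: refs/heads/master" rather than the broken "ref: master".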
|
|
It's nicer in case a user transfers lei/store across machines
and wants a way to track when/where they imported something.
|
|
IMAP authentication info is only shared amongst lei2mail workers,
so we must ensure all IMAP writes go through lei2mail workers
even if we don't have to access the mail through git.
This allows us to decouple the latency of the remote mboxrd from
the latency of the IMAP --output at the expense of extra IPC
overhead within our own processes.
|
|
We don't need to waste LoC on corner cases, single-use internal
subs, or restoring SIG{__WARN__} when a process exits. All that
extra code contributes to memory use and startup time, especially
for users who can't use FD passing.
|
|
We'll eventually want lei_input users like "lei import" and
"lei tag" to support parallel reads.
|
|
RFC 8621 registers $flagged, $answered, $seen, $draft which
map to IMAP, Maildir, and mbox Status/X-Status flags.
$forwarded is noted in JMAP, but only Maildir and the
"Lemonade" IMAP profile (RFC 5550) support it.
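As an illustration only, the Maildir side of that mapping could look
like the sketch below. The flag letters come from the Maildir
convention (S, R, F, D, P); the table and function names are
hypothetical and are not taken from lei.

```python
# JMAP keyword -> Maildir flag letter (table name is hypothetical).
KW2MAILDIR = {
    "$seen": "S",
    "$answered": "R",   # Maildir calls this "replied"
    "$flagged": "F",
    "$draft": "D",
    "$forwarded": "P",  # Maildir calls this "passed"
}


def maildir_info(keywords):
    """Build the "2,<flags>" info suffix of a Maildir filename.

    Unknown keywords are skipped, and the flag letters are emitted in
    ASCII order, as the Maildir convention expects.
    """
    flags = sorted(KW2MAILDIR[k] for k in keywords if k in KW2MAILDIR)
    return "2," + "".join(flags)
```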
|
|
We don't need to import so many things. None of the Errno
constants are in common paths, so they're unlikely to benefit from
constant folding.
|
|
It currently conflicts with the way OverIdx and SearchIdx
index messages, ultimately leading to violating a NOT NULL
constraint on id2num.id in over.sqlite3.
We may allow searching Resent-* fields separately, though I'm
not sure how useful it'll be.
|
|
Since every command that writes to lei/store calls ->done
to commit its output, we can rely on that to return a
pathname for a readable file with errors in it.
Errors can still get crossed up if multiple lei commands
are writing to the store at once, but this reduces the delay
in seeing them and ensures they won't be seen when somebody
is attempting to use shell completion.
|
|
We need a stable fallback time for digest2mid in the presence
of messages without Received/Date headers. Furthermore, we
must avoid using uninitialized smsg->{mid} when parsing
References for draft replies.
|
|
Just exiting with a failure code and no error message is
confusing :x
|
|
$? is useful, as is labeling lei_err since I'm easily-confused :x
|
|
We'll just let the ExtSearchIdx code handle this uncommon case
by doing a full commit.
|
|
Sometimes I want to save debug info to a file or pipe even when
spawning an MUA.
|
|
Remote results can safely use the same mset progress reporting
as local results, despite not knowing the size of the result
set. We're assuming terminal MUAs, for now.
|
|
"convert" is actually a bit more complicated than "lei import"
since it may need auth for either input or output.
|
|
No reason for the hash key to differ from the subroutine
name, here.
|
|
We need net_merge_all and to lock the number of worker jobs.
Parallel inputs are not supported, yet (is it needed?, I don't
expect this to be used for multiple files very often...).
|
|
Leaving workers running after auth failures is bad and messy,
so clean up our process management to have consistent worker
teardowns. Improve error reporting, too, instead of letting
Mail::IMAPClient->exists fail due to undef.
|
|
I wanted to say 2031, but that's probably too aggressive a
removal timeline.
|
|
I completely forgot about git-credential prompting when
making lei background the client process for MUA.
Now it backgrounds itself only for the MUA when no FDs are
passed, since the MUA is the final command run. Otherwise, it
relies on FD passing as before.
Fixes: c790a75439f3a1db ("script/lei: background ourselves on MUA/pager exec")
|
|
mbox and IMAP seem to have no way of describing this keyword,
but Maildir does with the "P" flag (for "passed").
|
|
It's needless noise when doing augment and output preparation
and shows up way too late and out-of-band with lei-daemon.
|
|
lei_store contents aren't intended to become public, so there's
no point in nagging users for their email address for git
committer information like git does.
|
|
There's no point in adding vmd information for an external
message if it was never stored and there's no vmd at all.
We also don't need to check _docids_for for similar messages,
either, since we always check lse->kw_changed, first.
|
|
We can tweak lse->kw_changed to return docids and reduce IPC
traffic and reduce work the lei/store worker needs to do.
|
|
AFAIK that was only used for nproc detection, and nproc
is handled by PublicInbox::IPC, nowadays.
|
|
It's a bit of an Easter egg, though it's not possible to hide those
in Free Software... Anyways, it doesn't cost us an entry in %CMD
of LEI.pm and anybody frustrated enough with lei just might type
"lei sucks" on the command-line :>
|
|
No point in sending a command for every input when a
single one will do. We'll also trigger LeiStore->done
sooner in the worker rather than later.
|
|
Assume a user specifying --mail doesn't want to spend cycles
reconstructing a blob from a code repo. Also, don't require
users to use add-external or a previous -I or --only to ready an
external for use with ale.git.
|
|
While plenty of online documentation exists, it's good to have
a locally-available summary for users to look at offline.
Fix a URL in Watch.pm while we're at it, too.
|
|
We must use the $ops hashref returned by lei->workers_start,
since it's modified to include extra handlers for auth failures
and whatnot.
Fixes: 954581b8e575966a ("lei: simplify PktOp callers")
|
|
I've decided "tag" is a better verb since it seems to be a more
widely-used term for associating metadata with data.
Not only is it analogous to the "notmuch tag" command, but
also makes sense when compared to tooling for manipulating
metadata for non-mail data (e.g. audio metadata tags).
There's even a Wikipedia entry for it:
https://en.wikipedia.org/wiki/Tag_(metadata)
whereas "mark" is used in the description, but has no
entry of its own with regards to metadata.
|
|
No point in munging user-supplied $lei->{opt} when %mset_opt
exists. We'll be depending on docid being in descending order
for saved search support.
|
|
Note that update_kw_maybe is critical in preventing accidental
data loss with default "lei q --output" behavior.
Also avoid treating (proposed) MH support as lock-free, since
it appears to lack specifications for locking and to be even worse
than mbox* in that regard...
|
|
Some cgit configs use trailing slashes in pathnames
which we preserve internally.
Before this change, trailing slashes in cgit config files
were causing ViewVCS (SolverGit) output to show up as "???"
for coderepos without cgitUrl configured.
|
|
Since "lei q" and "lei convert" already support writing these
compressed inboxes, it makes sense that all mbox readers support
them, as well.
Using compression is one reliable way to know an mboxrd or mboxo
hasn't been unexpectedly truncated.
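A small sketch of why compression helps here, using Python's standard
gzip module rather than anything from lei: a gzip stream carries an
end-of-stream marker and trailer, so a cut-off file fails to
decompress, whereas a truncated plain-text mbox still parses. The
is_intact helper and the sample data are invented for this example.

```python
import gzip
import zlib


def is_intact(blob):
    """Return True if blob decompresses as a complete gzip stream."""
    try:
        gzip.decompress(blob)
        return True
    except (EOFError, OSError, zlib.error):
        # A truncated stream has no valid end-of-stream marker/trailer.
        return False


# Made-up mboxrd-like sample data.
mbox = b"From mboxrd@z Thu Jan  1 00:00:00 1970\nSubject: hi\n\nbody\n" * 50
whole = gzip.compress(mbox)
cut = whole[:-10]  # simulate an interrupted download or copy
```

Here is_intact(whole) is true and is_intact(cut) is false.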
|
|
$lei->fail sends SIGTERM which prevents the File::Temp::Dir in
$solver->{tmp} from being cleaned up, so use $lei->child_error
instead.
|
|
".eml" is a suffix supported by (/usr/local)/etc/mime.types
on Debian and FreeBSD systems using the "mime-support" package.
".patch" is what "git format-patch" generates by default since
git v1.5.0 in 2007.
|
|
This is compatible with default gunzip(1) behavior and
future-proofs us against potential changes in PublicInbox::WWW
to save memory on public-inbox-httpd instances.
|
|
We can consistently open /dev/stdin correctly nowadays, so
drop the input_stdin and just use the normal ->path_to_fd
code path.
|
|
File::Temp only requires four 'X' characters (unlike mkstemp(3),
which requires six). So only give it 4 to avoid an
80-column violation and maybe save metadata space on FSes.
|
|
"lei blob" supports --git-dir and -C, and checks if the
current directory has a git directory associated with it.
It will likely support submodules in the future.
I'm inclined to believe declaring coderepos in a command-line
tool is needless clutter and users will rarely want to search
for blobs across different projects when on the command-line.
|
|
Introduce a new LeiRemote wrapper to provide an internal API
which SolverGit expects. This lets us use HTTP/HTTPS endpoints
to reconstruct blobs off patches as we would with local
endpoints, just more slowly...
|