Date | Commit message (Collapse) |
|
Maildir and IMAP can both handle `forwarded'. Ensure we don't
lose `forwarded' when reading from stores which do not support
it, but ensure we can set it when reading from IMAP and Maildir
stores.
|
|
We have "lei import" and better test infrastructure for lei,
now, so we can more easily test SIGPIPE without relying on
an already-configured instance.
|
|
As they are likely Message-IDs. If an email address ends up in
a URL, then it's likely public, so there's even less reason to
obfuscate that particular address.
[km: add xt/perf-obfuscate.t]
[ew: modernize perf test (5.10.1), use diag instead of print]
This version of the patch avoids the massive slowdown noted by Kyle in
<https://public-inbox.org/meta/87wnt9or6t.fsf@kyleam.com/>.
Performance remains roughly the same, if not slightly faster
(which may be due to me testing this on a busy server). Results
from xt/perf-obfuscate.t against 6078 messages on a local mirror
of <https://public-inbox.org/meta/>:
before: 6.67 usr + 0.04 sys = 6.71 CPU
after: 6.64 usr + 0.04 sys = 6.68 CPU
Reported-by: Kyle Meyer <kyle@kyleam.com>
Helped-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87a6q8p5qa.fsf@kyleam.com/
|
|
Because failures are often overlooked, unfortunately.
|
|
We must use the $ops hashref returned by lei->workers_start,
since it's modified to include extra handlers for auth failures
and whatnot.
Fixes: 954581b8e575966a ("lei: simplify PktOp callers")
|
|
This makes it easier to manage test dependencies on systems
where optional stuff isn't installed. This fixes some lei tests
which didn't check for Plack before starting -httpd, and ensures
Parse::RecDescent is available for -imapd in case
Mail::IMAPClient stops using it.
|
|
detect_nproc is in the IPC module, now; and we can safely
disable fsync when creating test data.
And "modernize" up to 5.10.1 while we're at it.
The use of fsync was causing this to run for hours instead
of minutes since I forgot to use eatmydata.
|
|
Some poorly-configured MUAs will send application/octet-stream
even for text-only attachments. We can't expect all MUAs
to be configured with proper MIME types, and there is plenty of
historical mail that falls into this unfortunate category.
v2: simplify the check and ensure returned text is Perl "utf8"
|
|
This matches existing Maildir behavior, as trash and draft
messages have little reason to be exposed publicly.
|
|
They're unnecessary visual noise, and angle brackets don't
always work as intended when going through Xapian's query
parser.
Since we already use "m:" and "refs:" instead of the actual
header names, it should be obvious we're at liberty to
abbreviate such things.
Link: https://public-inbox.org/meta/20210304184348.GA19350@dcvr/
|
|
So far, searching by size has never been publicly documented,
and IMHO, of questionable utility. In any case, "z:" is what
mairix(1) uses, so it may be familiar to existing mairix users
(I've never used this prefix myself).
So far, this prefix is only used internally in tests and in
auto-translated queries from IMAP; thus this incompatible change
is unlikely to affect anyone.
|
|
IMAP is similar to Maildir and we can now preserve keyword
updates done on IMAP folders.
|
|
eml ("message/rfc822" MIME type) is supported by "lei import",
so it probably makes sense to support it via convert, at least
for tests. And IMAP is supported in "lei q -o $MFOLDER",
so this only required renaming {nrd} => {net} and initializing
outputs before augment preparation (creating the IMAP folder).
|
|
This flexibility should save us some code down-the-line.
|
|
Augment (and dedupe) aren't parallel, yet, so they're more sensitive to
high-latency networks.
|
|
This interpolation is used by the upstream URI package
and we rely on it elsewhere for HTTP(S) URIs, so save
ourselves some surprises down the line.
|
|
Requiring TEST_IMAP_WRITE_URL to be set to a writable IMAP
server URL isn't ideal, but it works for now until we have time
to set up a mock dovecot/cyrus/etc... instance for testing.
|
|
All of our current IMAP code relies on Mail::IMAPClient
at the moment, so ensure we skip those tests on systems
without that module.
|
|
We need to ensure authentication failures and error codes get
propagated to the parent process(es) properly.
v2: update MANIFEST
v3: LeiAuth.pm ->_lei_cfg bit moved to a previous commit
|
|
IPv4 gets plenty of real-world coverage, and apparently there are
Debian buildd hosts which lack IPv4(*). So ensure everything
can work on IPv6 and not cause problems for odd setups.
(*) https://bugs.debian.org/979432
|
|
We will have a ->wq_do that doesn't pass FDs for I/O.
|
|
Avoid on-stack shortcuts which may prevent destructors from
firing since we're not inside the event loop. We'll also tidy
up the unlink mechanism in LeiOverview while we're at it.
|
|
Sometimes it can be confusing for "lei q" to finish writing to a
Maildir|mbox and not know if it did anything. So show some
per-external progress and stats.
These can be disabled via the new --quiet/-q switch.
We differ slightly from mairix(1) here, as we use stderr
instead of stdout for reporting totals (and we support
parallel queries from various sources).
|
|
This allows us to avoid repeated open() and close() syscalls
and speeds up the new xt/stress-sharedkv.t maintainer test
by roughly 7%.
|
|
Mainly around fork() calls, but some nearby places as well.
|
|
We need to properly propagate SIGPIPE to the top-level
lei-daemon process and avoid relying on auto-close,
since auto-close triggers Perl warnings when explicit
close() does not.
|
|
The new test ensures consistency between oneshot and
client/daemon users. Cancelling an in-progress result now also
stops xsearch workers to avoid wasted CPU and I/O.
Note the lei->atfork_child_wq usage changes; they work around
a bug in Perl 5: http://nntp.perl.org/group/perl.perl5.porters/258784
<CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com>
This switches the internal protocol to SOCK_SEQPACKET AF_UNIX
sockets. Message boundaries are preserved, so messages from the
daemon to the client (to run the pager and to kill/exit the client
script) can no longer be merged together.
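The message-boundary property that motivates SOCK_SEQPACKET can be
seen in a minimal Python sketch. This is not lei's Perl code: the
message strings and the helper name are made up for illustration,
and it assumes a platform (such as Linux) where AF_UNIX supports
SOCK_SEQPACKET.

```python
import socket

def first_recv_after_two_sends(sock_type):
    """Send two back-to-back messages over an AF_UNIX socketpair and
    return whatever the first recv() on the other end yields."""
    a, b = socket.socketpair(socket.AF_UNIX, sock_type)
    try:
        a.send(b"exec pager")  # hypothetical daemon -> client message
        a.send(b"exit 0")      # hypothetical daemon -> client message
        return b.recv(4096)
    finally:
        a.close()
        b.close()

# SOCK_SEQPACKET: one recv() returns exactly one message.
seqpacket_first = first_recv_after_two_sends(socket.SOCK_SEQPACKET)

# SOCK_STREAM: recv() may return both messages merged into one buffer,
# so the receiver would need its own framing to split them.
stream_first = first_recv_after_two_sends(socket.SOCK_STREAM)
```

With a stream socket the receiver cannot rely on one recv() per
message; with SOCK_SEQPACKET the kernel keeps each message separate.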
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak
|
|
I've been using something like this to mock out thousands
of inboxes for testing.
|
|
{ibx} is shorter and is the most prevalent abbreviation
in indexing and IMAP code, and the `$ibx' local variable
is already prevalent throughout.
In general, the codebase favors removal of vowels in variable
and field names to denote non-references (because references are
"lighter" than non-references).
So update WWW and Filter users to use the same code since
it reduces confusion and may allow easier code sharing.
|
|
Fixes: 6550226296e9db79 ("xt: remove eml_check_roundtrip")
|
|
Unlike Email::MIME, PublicInbox::Eml::as_string should be able
to round trip from the Perl object to a raw scalar and back
without changes.
|
|
"*foo" is ambiguous in that it may refer to a bareword file handle;
so we'll use it where we can without triggering warnings.
PublicInbox::TestCommon::run_script_exit required dropping the
prototype, however. We'll also future-proof by dropping "use
warnings" in Cgit.pm and use the less-ambiguous "//=" in Inbox.pm
while we're in the area.
|
|
We'll be making changes to solver to make it even fairer
to slow clients on slow storage. Ensure we test with
public-inbox-httpd-specific codepaths, since the generic
PSGI code paths are rare in production use.
|
|
strict.pm helped me find a typo in an upcoming recent change, so
ensure we use it since it does more good than harm. We'll also
take the opportunity here to declare v5.10.1 compatibility level
to future-proof against Perl incompatibilities.
|
|
{over_ro} being a part of the Search object is a historical
oddity which will go away, soon. Let's start removing its use in
tests and rarely-used helper scripts.
|
|
mbsync was not retrieving anything since it was looking for
"inbox" when we need to return "INBOX" as a special case
for IMAP.
Fixes: 8af34015e9aa94e5 (imap: LIST shows "INBOX" in all caps)
|
|
Test::More dups standard FDs and may create FDs for other
purposes. run_mode => 0 lets us rely on FD_CLOEXEC to ensure
-imapd has enough FDs to accept all incoming connections at
the cost of higher (one-off) startup time.
|
|
We want to be able to parallelize and stress test more
endpoints and toggle `--compressed' and possibly other
options in curl.
|
|
This lets the -httpd worker process make better use of time
instead of waiting for git-cat-file to respond. With 4 jobs in
the new test case against a clone of
<https://public-inbox.org/meta/>, a speedup of 10-12% is shown.
Even a single job shows a 2-5% improvement on an SSD.
|
|
Since the removal of pseudo-hash support in Perl 5.10, the
"fields" module no longer provides the space or speed benefits
it did in 5.8. It also does not allow for compile-time checks,
only run-time checks.
To me, the extra developer overhead in maintaining "use fields"
args has become a hassle. None of our non-DS-related code uses
fields.pm, nor do any of our current dependencies. In fact,
Danga::Socket (which DS was originally forked from) and its
subclasses are the only fields.pm users I've ever encountered in
the wild. Removing fields may make our code more approachable
to other Perl hackers.
So stop using fields.pm and locked hashes, but continue to
document what fields do for non-trivial classes.
|
|
For properly parsing IMAP search requests, it's easier to use a
recursive descent parser generator to deal with subqueries and
the "OR" statement.
Parse::RecDescent was chosen since it's mature, well-known,
widely available and already used by our optional dependencies:
Inline::C and Mail::IMAPClient. While it's possible to build
Xapian queries without using the Xapian string query parser;
this iteration of the IMAP parser still builds a string which is
passed to Xapian's query parser for ease-of-diagnostics.
Since this is a recursive descent parser dealing with untrusted
inputs, subqueries have a nesting limit of 10. I expect that is
more than adequate for real-world use.
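As a rough illustration of the approach (not the Parse::RecDescent
grammar used here, and not the real IMAP SEARCH grammar), a
hand-written recursive descent parser with a nesting limit might
look like the Python sketch below. The toy grammar, the function
names and the output string format are all assumptions; only the
limit of 10 and the idea of emitting a query string come from the
text above.

```python
MAX_DEPTH = 10  # nesting limit from the commit message

class ParseError(ValueError):
    pass

def tokenize(s):
    # Split on whitespace, treating parentheses as separate tokens.
    return s.replace("(", " ( ").replace(")", " ) ").split()

def parse_seq(tokens, depth):
    """Parse terms until ')' or end of input; adjacent terms are ANDed."""
    terms = []
    while tokens and tokens[0] != ")":
        terms.append(parse_term(tokens, depth))
    if not terms:
        raise ParseError("empty (sub)query")
    return " ".join(terms)

def parse_term(tokens, depth):
    """Parse one term: a word, a parenthesized subquery, or prefix OR."""
    if depth > MAX_DEPTH:
        raise ParseError("subquery nesting too deep")
    if not tokens:
        raise ParseError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse_seq(tokens, depth + 1)
        if not tokens or tokens.pop(0) != ")":
            raise ParseError("missing ')'")
        return "(" + inner + ")"
    if tok == ")":
        raise ParseError("unexpected ')'")
    if tok == "OR":  # prefix form: OR <term> <term>
        left = parse_term(tokens, depth + 1)
        right = parse_term(tokens, depth + 1)
        return "(" + left + " OR " + right + ")"
    return tok

def to_query(s):
    """Translate the toy search syntax into a single query string."""
    tokens = tokenize(s)
    query = parse_seq(tokens, 0)
    if tokens:
        raise ParseError("unexpected ')'")
    return query
```

Because both parenthesized subqueries and OR operands increase the
depth counter, untrusted input cannot drive the recursion deeper
than the fixed limit.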
|
|
IMAP requires either the Email::Address::XS or Mail::Address
package (part of perl-MailTools RPM or libmailtools-perl deb);
and Email::Address::XS is not officially packaged for some older
distros, most notably CentOS 7.x.
|
|
Since we limit our mailbox slices to 50K and can guarantee a
contiguous UID space for those mailboxes, we can store a mapping
of "UID offsets" (not full UIDs) to Message Sequence Numbers as
an array of 16-bit unsigned integers in a 100K scalar.
For UID-only FETCH responses, we can momentarily unpack the
compact 100K representation to a ~1.6M Perl array of IV/UV
elements for a slight speedup.
Furthermore, we can (ab)use hash key deduplication in Perl5 to
deduplicate this 100K scalar across all clients with the same
mailbox slice open.
Technically we can increase our slice size to 64K w/o increasing
our storage overhead, but I suspect humans are more accustomed
to slices easily divisible by 10.
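The storage arithmetic can be sketched in Python (the server itself
is Perl; this is only an illustration). It assumes the packed scalar
is indexed by UID offset and holds the MSN in each slot, with 0
meaning "no message at this offset". That convention and all names
below are assumptions, not taken from the code.

```python
import struct

SLICE_SIZE = 50_000  # messages per mailbox slice, from the text above

def pack_uo2m(present_offsets):
    """Build the packed UID-offset-to-MSN map for one slice.

    Slot i holds the MSN of UID offset i, or 0 if that UID is absent
    (assumed convention). Each slot is a 16-bit unsigned integer.
    """
    present = set(present_offsets)
    msns = []
    msn = 0
    for off in range(SLICE_SIZE):
        if off in present:
            msn += 1
            msns.append(msn)
        else:
            msns.append(0)
    return struct.pack(f"<{SLICE_SIZE}H", *msns)

def msn_for_offset(packed, offset):
    """Read one slot without unpacking the whole map."""
    return struct.unpack_from("<H", packed, offset * 2)[0]

# Hypothetical slice where UID offsets 0, 1 and 3 exist and 2 is missing.
packed = pack_uo2m([0, 1, 3])
```

Here len(packed) is 50,000 slots of 2 bytes, or 100,000 bytes. A
16-bit slot can hold values up to 65,535, which is why the slice
size could grow to 64K without changing the per-slot cost.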
|
|
This will make it easier to show parameters used for testing
and potential tweaks to be made.
|
|
Having two large numbers separated by a dash can make visual
comparisons difficult when numbers are in the 3,000,000 range
for LKML. So avoid the $UID_END value, since it can be
calculated from $UID_MIN. And we can avoid large values of
$UID_MIN, too, by instead storing the block index and just
multiplying it by 50000 (and adding 1) on the server side.
Of course, LKML still goes up to 72, at the moment.
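The server-side arithmetic described above can be sketched as
follows. This is an illustrative Python sketch, not the actual Perl
code: the function names are hypothetical, and it assumes $UID_END
is simply the last UID of the same 50K slice.

```python
SLICE_SIZE = 50_000  # UIDs per slice, from the text above

def slice_bounds(block_index):
    """Return (uid_min, uid_end) for a zero-based block index."""
    uid_min = block_index * SLICE_SIZE + 1
    uid_end = uid_min + SLICE_SIZE - 1  # assumed: end of the same slice
    return uid_min, uid_end

def block_index_for_uid(uid):
    """Return the zero-based block index containing a given UID."""
    return (uid - 1) // SLICE_SIZE
```

For example, slice_bounds(0) gives (1, 50000) and slice_bounds(72)
gives (3600001, 3650000), so a small index in the mailbox name
stands in for two seven-digit numbers.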
|
|
Finish up the IMAP-only portion of iterative config reloading,
which allows us to create all sub-ranges of an inbox up front.
The InboxIdler still uses ->each_inbox which will struggle with
100K inboxes.
Having messages in the top-level newsgroup name of an inbox will
still waste bandwidth for clients which want to do full syncs
once there's a rollover to a new 50K range. So instead, make
every inbox accessible exclusively via 50K slices in the form of
"$NEWSGROUP.$UID_MIN-$UID_END".
This introduces the DummyInbox, which makes $NEWSGROUP
and every parent component a selectable, empty inbox.
This aids navigation with mutt and possibly other MUAs.
Finally, the xt/perf-imap-list maintainer test is broken, now,
so remove it. The grep perlfunc is already proven effective,
and we'll have separate tests for mocking out ~100k inboxes.
|
|
It's useful to know how fast SIGHUP can be handled, too.
|
|
imapd-validate is a beefed up version of our nntpd-validate test
which hammers the server with parallel connections over regular
IMAP, IMAPS, IMAP+STARTTLS; and COMPRESS=DEFLATE variants of
each of those. It uses $START_UID:$END_UID fetch ranges to
reduce requests and slurp many responses at once to saturate
"git cat-file --batch" processes.
mbsync(1) also uses pipelining extensively (but IMHO
unnecessarily), so it was able to shake out some bugs in
the async git code.
Finally, we remove xt/cmp-imapd-compress.t since it's
redundant now that we have PublicInbox::IMAPClient to work
around bugs in Mail::IMAPClient.
|
|
Include a test for Mail::IMAPTalk, here, since Mail::IMAPClient
stalls with compression enabled:
https://rt.cpan.org/Ticket/Display.html?id=132720
|