Date | Commit message |
|
We should be able to treat v2 outputs just like any other mail
format, with the exception that content dedupe is always
enforced by the v2 format.
This allows users hosting v2 public-inboxes to catch up broken
synchronization from alternate archives such as the mbox
archives hosted by https://lists.gnu.org/
Link: https://public-inbox.org/meta/20231114-hypersonic-papaya-starling-e1cfc8@nitro/
|
|
Avoid mixing autodie use in different scopes since it's likely
to cause problems like it did in Gcf2. While none of these
changes fix known problems with test cases, they're likely
worthwhile anyway to prevent future surprises.
For Process::IO, we'll add some additional tests in t/io.t
to ensure we don't get unintended exceptions for try_cat.
|
|
Perl 5.16.3 on CentOS seems more verbose in one of the EIO
tests. Relax the regexp so we can account for extra errors
reported by Perl.
|
|
Start lowercasing newsgroup names automatically since uppercase
names are incompatible with IMAP and POP3 and also cause
problems with both -extindex and -cindex.
We'll also warn on eidx_key and newsgroup conflicts to avoid
sometimes subtle breakage when using -extindex and -cindex.
|
|
We'll require an error stream for dump_ibx and dump_roots
commands; they're too important to ignore. Instead of writing
code to provide diagnostics for errors, rely on abort(3) and the
-ggdb3 compiler flag to generate nice core dumps for gdb since
all commands sent to xap_helper are from internal users.
We'll even abort on most usage errors since they could be
bugs in split2argv or our use of getopt(3).
We'll also just exit on ENOMEM errors since the easiest way to
recover from them is to start a new process, which closes all
open Xapian DB handles.
|
|
None of our current code relies on it, and I can't imagine it's
something we'd need in the future, actually... This keeps the
door open for relying more on Spawn in TestCommon.
|
|
No need to suffer through an extra dose of slow Perl load times
when we can drive the build in the big parent Perl process and
get the executable path name to pass to spawn directly.
|
|
-mda now honors `--help' properly and invocations missing
ORIGINAL_RECIPIENT now fail with EX_NOUSER.
Helped-by: Leah Neukirchen <leah@vuxu.org>
Link: https://public-inbox.org/meta/87msvlguqu.fsf@vuxu.org/
|
|
List-Unsubscribe headers with unique identifiers (such as those
generated by our examples/unsubscribe.milter) should not
end up in public archives. Add a new config knob to strip
List-Unsubscribe headers if they have the
`List-Unsubscribe-Post: List-Unsubscribe=One-Click'
header.
Unfortunately, this breaks DKIM signatures if the signature
covers either of these List-Unsubscribe* headers. However,
breaking DKIM is the lesser evil compared to any archive reader
being able to stop archival by an independent archivist.
As much as I would like this to be the default, it probably
affects few users at the moment since very few mailing lists
use unique identifiers in List-Unsubscribe (but that number
has grown, recently).
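
The stripping rule described above can be sketched roughly as follows.
This is only an illustration, not the code from this commit:
`strip_one_click' is a hypothetical name, and it assumes the header
lines have already been unfolded into one string per header.

```perl
use strict;
use warnings;

# Hypothetical helper (not public-inbox's actual code): drop every
# List-Unsubscribe* header line when the One-Click marker is present.
# Takes unfolded header lines and returns the lines to keep.
sub strip_one_click {
	my @hdr = @_;
	# scalar-context grep counts the matching One-Click markers
	my $one_click = grep {
		/\AList-Unsubscribe-Post:\s*List-Unsubscribe=One-Click\s*\z/i
	} @hdr;
	return @hdr unless $one_click;
	grep { !/\AList-Unsubscribe(?:-Post)?:/i } @hdr;
}
```

Headers without the One-Click marker pass through untouched, so
ordinary List-Unsubscribe links stay in the archive.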
|
|
Systems with Yama can restrict ptrace(2) (the underlying syscall
used by strace(1)) and make it difficult to test error handling
via error injection. Just skip the tests on such systems since
it's probably not worth the effort to start using prctl(2) to
enable the test on such systems.
|
|
This seems like an easy (but WWW-specific) way to get recently
created and recently active topics as suggested by Konstantin.
To do this with Xapian will require new columns and
reindexing; and I'm not sure if the current lei handling of
search results by dumping results to a format readable by common
MUAs would work well with this. A new TUI may be required...
Suggested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20231107-skilled-cobra-of-swiftness-a6ff26@meerkat/
|
|
This matches the behavior we have for multi-message mbox files
since we rely on ->close to detect errors on bad mboxes. This
ensures we'll notice errors reading single messages from stdin.
We'll also start relying more on strace error injection to test
error handling.
|
|
In the rare case sendmsg(2) isn't able to send the full amount
(due to buffers >=2GB on Linux), use print + (autodie)close
to send the remainder and retry on EINTR. `substr' should be
able to avoid a large malloc via offsets and CoW on modern Perl.
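
A rough sketch of finishing a short send while retrying on EINTR.
This is not the code from this commit: it swaps in `syswrite' with an
offset instead of print + substr, and `write_rest' is a hypothetical
name. It assumes a blocking handle and that $sent bytes were already
written by the earlier call.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

# Hypothetical helper: $sent bytes of $$bufref already went out via a
# short send; push the remainder through the same blocking handle.
sub write_rest {
	my ($fh, $bufref, $sent) = @_;
	my $len = length($$bufref);
	while ($sent < $len) {
		# syswrite takes an offset, so the tail is not copied first
		my $w = syswrite($fh, $$bufref, $len - $sent, $sent);
		if (defined $w) {
			$sent += $w;
		} elsif ($! != EINTR) {
			die "syswrite: $!";
		} # else: interrupted by a signal, so retry
	}
	$sent; # total bytes written, equal to length($$bufref)
}
```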
|
|
When dealing with large search results, we need to deal with
EPIPE not just from the pager, but also EPIPE or ECONNRESET
between lei_xsearch and lei2mail processes.
Without this fix, lei_xsearch processes could linger and get
stuck writing to dead lei2mail processes if a user aborts the
pager early during a large result set.
To ensure lei_xsearch processes don't linger around after
lei2mail workers all die, we must close $l2m->{-wq_s2} before
spawning lei_xsearch processes, since $l2m->{-wq_s2} is only
used in lei2mail workers.
For `git cat-file' processes, we also need to trigger
PublicInbox::Git->close to handle unpredictable destructor
ordering to avoid using uninitialized IO refs. This combines
with the `git_to_mail' change to deal with process cleanup
handling from premature shutdowns.
To test all this, we can't just rely on a single message being
large, but also need to rely on the result set being large
enough to saturate the lei_xsearch -> lei2mail socket so we
rely on GIANT_INBOX_DIR once again.
|
|
write_file is a new API which makes setting up config files more
pleasant, while autodie and scalarref redirects (in tests) have
been available for a while, now. So do what we can to reduce
the code burden we have.
|
|
The IO package seems like a better home for I/O subs than the
Git package. We lose the 60 second read timeout for `git
cat-file --batch-*' processes since it's probably not necessary
given how reliable the code has proven and things would fall
over hard in other ways if the storage device were completely
hosed.
|
|
This is a pretty convenient way to create files for diff
generation in both WWW and lei. The test suite should also be
able to take advantage of it.
|
|
This fixes two major problems with the use of tie for filehandles:
* no way to do fcntl, stat, etc. calls directly on the tied handle,
forcing callers to use the `tied' perlop to access the underlying
IO::Handle
* needing separate classes to handle blocking and non-blocking I/O
As a result, Git->cleanup_if_unlinked, InputPipe->consume,
and Qspawn->_yield_start have fewer bizarre bits and we
can call `$io->blocking(0)' directly instead of
`(tied *$io)->{fh}->blocking(0)'
Having a PublicInbox::IO class will also allow us to support
custom read buffering which allows inspecting the current state.
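
To illustrate the difference from tied handles, here is a minimal
sketch of an IO::Handle subclass. `My::ProcIO', `attach' and `pid' are
hypothetical names and this is not the actual PublicInbox::IO code; it
only shows that a blessed glob works directly with methods and perlops
while still carrying per-handle state.

```perl
use strict;
use warnings;
use IO::Handle;

package My::ProcIO; # hypothetical name, not the real PublicInbox::IO
our @ISA = qw(IO::Handle);

# Rebless an already-open handle and stash per-handle state in the
# hash slot of its glob (the same idiom IO::Socket uses).
sub attach {
	my ($class, $fh, $pid) = @_;
	my $io = bless $fh, $class;
	${*$io}{pid} = $pid;
	$io;
}

sub pid { ${*{$_[0]}}{pid} }

package main;

pipe(my $r, my $w) or die "pipe: $!";
my $io = My::ProcIO->attach($r, $$);
$io->blocking(0);   # direct method call, no `tied' indirection
my @st = stat($io); # perlops see a real file handle
```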
|
|
This will open the door for us to drop `tie' usage from
ProcessIO completely in favor of OO method dispatch. While
OO method dispatches (e.g. `$fh->close') are slower than normal
subroutine calls, it hardly matters in this case since process
teardown is a fairly rare operation and we continue to use
`close($fh)' for Maildir writes.
|
|
Eric Wong <e@80x24.org> wrote:
> --- a/lib/PublicInbox/DS.pm
> +++ b/lib/PublicInbox/DS.pm
> @@ -341,8 +341,8 @@ sub greet {
> my $ev = EPOLLIN;
> my $wbuf;
> if ($sock->can('accept_SSL') && !$sock->accept_SSL) {
> - return CORE::close($sock) if $! != EAGAIN;
> - $ev = PublicInbox::TLS::epollbit() or return CORE::close($sock);
> + return $sock->close if $! != EAGAIN;
> + $ev = PublicInbox::TLS::epollbit() or return $sock->close;
> $wbuf = [ \&accept_tls_step, $self->can('do_greet')];
> }
> new($self, $sock, $ev | EPOLLONESHOT);
Noticed this on deploy:
-----8<-----
Subject: [PATCH] ds: don't try ->close after ->accept_SSL failure
->accept_SSL failures leave the socket ref as a GLOB (not
IO::Handle) and unable to respond to the ->close method.
Calling close in any form isn't actually necessary at all,
so just let refcounting destroy the socket.
|
|
The epoll implementation is the only one which respects the
limit (kevent would, but IO::KQueue does not). In any case,
I'm not a fan of the maxevents=1000 historical default since
it leads to fairness problems with shared non-blocking listeners
across multiple daemon workers.
|
|
I hit this via select while running -cindex with some other
experimental patches. I can't reproduce the problem, though,
but this ensures we have a chance to diagnose it if it happens
again instead of looping on select(2) => EBADF.
|
|
This saves us some code, and is a small step towards getting
ProcessIO working with stat, fcntl and other perlops that don't
work with tied handles.
|
|
Most coderepos don't have extensions.objectFormat set,
so it's senseless to emit warnings on failures.
Fixes: 709fcf00c4d5 (cindex: use run_await to read extensions.objectFormat)
|
|
Now that psgi_yield is used everywhere, the more complex
psgi_return and its helper bits can be removed. We'll also fix
some outdated comments now that everything on psgi_return has
switched to psgi_yield. GetlineResponse replaces GetlineBody
and does a better job of isolating generic PSGI-only code.
|
|
This ensures reused processes get a clean start and
avoids surprises as we develop more code around the
DS event loop.
|
|
This is similar to `backtick` but supports all our existing spawn
functionality (chdir, env, rlimit, redirects, etc.). It also
supports SCALAR ref redirects like run_script in our test suite
for std{in,out,err}.
We can probably use :utf8 by default for these redirects, even.
|
|
It's slightly better organized this way, especially since
`publicinboxLimiter' has its own user-facing config section
and knobs. I may use it in LeiMirror and CodeSearchIdx for
process management.
|
|
We need to gracefully continue when a user tries to associate
with --all but has basic (or completely unindexed) inboxes.
|
|
More tests to come, so cut down on the noise in the test code.
|
|
Oops :x
|
|
It's actually valid Perl syntax, but still confusing to look at.
Fixes: add90b9504f4 ("support -C (chdir) for most non-daemon commands")
|
|
I'm not sure why, but this test just failed for some odd reason
from `make check-run' on my Debian bullseye workstation.
|
|
We use this in various places to minimize or maximize pipe
size on Linux. So keep it all in one place.
|
|
It's not worth the code and memory to have a setter method we
never use outside of tests.
|
|
We can share more code amongst stdin slurper (not streaming)
commands. This also fixes uninitialized variable warnings when
feeding an empty stdin to these commands.
|
|
v2 never suffered from this bug, apparently, but -learn didn't
seem able to handle indexlevel=basic (nor respect `medium')
for v1 inboxes. I only noticed this bug because I converted
some ancient v1 inboxes to `basic' to save space.
|
|
Delayed commits allow users to trade off immediate safety for
throughput and reduced storage wear when running multiple
discrete commands.
This feature is currently useful for providing a way to make
t/lei-store-fail.t reliable and for ensuring `lei blob' can
retrieve messages which have not yet been committed.
In the future, it'll also be useful for the FUSE layer to batch
git activity.
|
|
I really don't understand why this fails, sometimes; but it does.
|
|
Occasionally, t/nntp.t spews undefined variable warnings under
`make check-run'. While the test doesn't fail, it's annoying
to see them and it could be a source of deeper problems.
|
|
The `binmode' perlop can only take two scalars, so passing
`@_' blindly won't work since prototypes are checked. This
means we can get IO::Uncompress::Gunzip working properly
with ProcessIO and use it for curl.
We'll also just autodie (instead of warn) on FS errors when
dealing with curl stderr; since the process will likely be
in bigger trouble soon, anyways.
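
A minimal sketch of forwarding an optional layer to `binmode' without
passing `@_' through. `my_binmode' is a hypothetical name and not the
method changed by this commit.

```perl
use strict;
use warnings;

# Hypothetical wrapper: hand binmode either one or two scalars
# explicitly rather than passing @_ through blindly.
sub my_binmode {
	my ($fh, $layer) = @_;
	defined($layer) ? binmode($fh, $layer) : binmode($fh);
}
```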
|
|
Specifying {cb_args} in the options hash felt awkward to me.
Instead, just use the Perl stack like we do with awaitpid()
and pass the list down directly.
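
The calling convention can be sketched as below. `run_then' is a
hypothetical stand-in, not the function changed here: anything after
the callback is passed to the callback unchanged, instead of being
packed into a {cb_args} entry of an options hash.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a function taking a completion callback.
# Extra arguments after $cb are handed to the callback as-is.
sub run_then {
	my ($cmd, $cb, @args) = @_;
	my $out = join(' ', @$cmd); # placeholder for the real work
	$cb->($out, @args);
}

my @seen;
run_then([qw(echo hi)], sub { push @seen, [ @_ ] }, 'ctx', 42);
```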
|
|
Since we deal with pipes (of either direction) and bidirectional
stream sockets for this class, it's better to remove the `Pipe'
from the name and replace it with `IO' to communicate that it
works for any form of IO::Handle-like object tied to a process.
|
|
None of the lei internals works properly without forking and
sockets. The fallback code increases the potential to accidentally
call subs in the wrong process during the teardown phase.
We'll still support ipc_do w/o forking for now since
forking doesn't benefit small indexing runs from -mda and
such.
|
|
It's safer against deadlocks and we still get proper error
reporting by passing stderr across in addition to the lei
socket.
|
|
require_bsd and require_mods(':fcntl_lock') are now
supported in TestCommon since they're easier to maintain
than a big list of regexps.
getsockopt for SO_ACCEPTFILTER seems to always succeed,
even if the retrieved struct is all zeroes.
|
|
Hopefully this makes it easier to diagnose portability
problems on new OSes we use.
|
|
This ensures script/lei $send_cmd usage is EINTR-safe (since
I prefer to avoid loading PublicInbox::IPC for startup time).
Overall, it saves us some code, too.
|
|
This gets rid of a few bare bless statements and helps
ensure we properly load Lock.pm before using it.
|
|
|