Date | Commit message (Collapse) |
|
Write barriers can take a long time to finish, especially when
commands are issues in parallel. So handle it asynchronously
without blocking lei-daemon by making EOFpipe a little more
flexible by supporting arguments to the callback function.
This is another step towards improving parallel use of lei.
|
|
barrier (synchronous checkpoint) is better than ->done with
parallel lei commands being issued (via '&' or different
terminals), since repeatedly stopping and restarting processes
doesn't play nicely with expensive tasks like `lei reindex'.
This introduces a slight regression in maintaining more
processes (and thus resource use) when lei is idle, but that'll
be fixed in the next commit.
|
|
Noticed while tracking down fast-import crash bug report.
Link: https://public-inbox.org/meta/CAL_JsqK7P4gjLPyvzxNEcYmxT4j6Ah5f3Pz1RqDHxmysTg3aEg@mail.gmail.com/
|
|
LeiToMail can't sort v2 output, but sorting MH input (and
NNTP spool + mlmmj archives) numerically makes sense.
|
|
I can't reproduce this in t/lei-sigpipe.t with GIANT_INBOX_DIR.
In real-world usage, having a large `lei q -f text ...' output
piped to a pager and killing the pager prematurely could
trigger:
non-fatal error from PublicInbox::LeiToMail $?=256
messages in my terminal. This is because $self->{lei} was
becoming undefined in the process cleanup process of
git_to_mail. So flip the cleanup logic around and
unconditionally check for Git::cleanup state to bail out
early.
With this change, the `non-fatal error ...' message no longer
appears when I stop reading results early.
|
|
Users may accidentally or unknowingly write `mbox' and not know
we support 4 incompatible mbox variants.
|
|
The MH format is widely-supported and used by various MUAs such
as mutt and sylpheed, and a MH-like format is used by mlmmj for
archives, as well. Locking implementations for writes are
inconsistent, so this commit doesn't support writes, yet.
inotify|EVFILT_VNODE watches aren't supported, yet, but that'll
have to come since MH allows packing unused integers and
renaming files.
|
|
I noticed this bug while developing another feature and tests
were getting SIGHUP (since SIGHUP == 1 on most systems).
|
|
We only care about error checking when stdout is an mbox output
pointed to a pathname. This is noticeable with `lei up' with
multiple non-mbox* destinations. We'll also ensure throwing
exceptions to trigger lei->x_it from lei->do_env results in the
epoll/kqueue watch being discarded, otherwise commands may never
terminate (leading to stuck tests)
|
|
We need to consistently check the exit code of pigz|gzip|xz|bzip2
when writing to compressed mboxes (or bad storage).
|
|
We've always forced LeiToMail to only have one process for v2
outputs anyways since v2 has its own sharding and IPC. Thus we
can use the single LeiToMail process directly to avoid extra IPC
overhead.
|
|
We should be able to treat v2 outputs just like any other mail
format, with the exception that content dedupe is always
enforced by the v2 format.
This allows users hosting v2 public-inboxes to catch up broken
synchronization from alternate archives such as the mbox
archives hosted by https://lists.gnu.org/
Link: https://public-inbox.org/meta/20231114-hypersonic-papaya-starling-e1cfc8@nitro/
|
|
read_all can be expanded to support FIFOs/pipes/sockets where
read-until-EOF behavior is desired. We can also rely on
wantarray to support splitting on EOL markers, but it's
hard-coded to support only `$/ eq "\n"' since (AFAIK)
it's the only way we use the wantarray form `readline'.
|
|
List-Unsubscribe headers with unique identifiers (such as those
generated by our examples/unsubscribe.milter) should not
end up in public archives. Add a new config knob to strip
List-Unsubscribe headers if they have the
`List-Unsubscribe-Post: List-Unsubscribe=One-Click'
header.
Unfortunately, this breaks DKIM signatures if the signature
covers either of these List-Unsubscribe* headers. However,
breaking DKIM is the lesser evil compared to any archive reader
being able to stop archival by an independent archivist.
As much as I would like this to be the default, it probably
affects few users at the moment since very few mailing lists
use unique identifiers in List-Unsubscribe (but that number
has grown, recently).
|
|
When dealing with large search results, we need to deal with
EPIPE not just from the pager, but also EPIPE or ECONNRESET
between lei_xsearch and lei2mail processes.
Without this fix, lei_xsearch processes could linger and get
stuck writing to dead lei2mail processes if a user aborts the
pager early during a large result set.
To ensure lei_xsearch processes don't linger around after
lei2mail workers all die, we must close $l2m->{-wq_s2} before
spawning lei_xsearch processes, since $l2m->{-wq_s2} is only
used in lei2mail workers.
For `git cat-file' processes, we also need to trigger
PublicInbox::Git->close to handle unpredictable destructor
ordering to avoid using uninitialized IO refs. This combines
with the `git_to_mail' change to deal with process cleanup
handling from premature shutdowns.
To test all this, we can't just rely on a single message being
large, but also need to rely on the result set being large
enough to saturate the lei_xsearch -> lei2mail socket so we
rely on GIANT_INBOX_DIR once again.
|
|
We must use eof() combined with close() to detect errors
in situations involving the readline()' op since `readline'
(and most buffered I/O libraries) have weak error detection
support.
This fixes error detection for files opened for read/write
access. The next commit will fix error detection for read-only
files.
|
|
This fixes two major problems with the use of tie for filehandles:
* no way to do fcntl, stat, etc. calls directly on the tied handle,
forcing callers to use the `tied' perlop to access the underlying
IO::Handle
* needing separate classes to handle blocking and non-blocking I/O
As a result, Git->cleanup_if_unlinked, InputPipe->consume,
and Qspawn->_yield_start have fewer bizzare bits and we
can call `$io->blocking(0)' directly instead of
`(tied *$io)->{fh}->blocking(0)'
Having a PublicInbox::IO class will also allow us to support
custom read buffering which allows inspecting the current state.
|
|
Aside from our prior import bugs (fixed in a0c07cba0e5d8b6a
(mda: drop leading "From " lines again, 2016-06-26)), we'll
always have to be dealing with mutt piping messages to us and
`git format-patch' output. So just share the regexp so we
can use it everywhere.
In may be desirable to allow importing messages with a leading
"From " line for FUSE, even.
Additionally, some instances of this regexp needlessly added
optional `\r?' (CR) checks ahead of the `\n' (LF) element; but
they're pointless anyways since [^\n]* is enough to exclude all
non-LF bytes.
|
|
Specifying {cb_args} in the options hash felt awkward to me.
Instead, just use the Perl stack like we do with awaitpid()
and pass the list down directly.
|
|
Since we deal with pipes (of either direction) and bidirectional
stream sockets for this class, it's better to remove the `Pipe'
from the name and replace it with `IO' to communicate that it
works for any form of IO::Handle-like object tied to a process.
|
|
We already have an ->incr callback we can enhance to support
multiple counters with a single request. Furthermore, we can
just flatten the object graph by storing counters directly in
the $lei object itself to reduce hash lookups.
|
|
nr_write may be undef if nothing was written due to dedupe, but
with seen messages.
|
|
The benefit of 1MB potential pipe buffer size (on Linux) doesn't
seem noticeable when reading from git (unlike when writing to v2
shards), so Unix stream sockets seem fine, here.
This allows us to simplify our process management by using the
same socket FD for reads and writes and enables us to use our
ProcessPipe class for reaping (as we can do with Gcf2Client).
Gcf2Client no longer relies on PublicInbox::DS for write
buffering, and instead just waits for requests to complete
once the number of inflight requests hits the MAX_INFLIGHT
threshold as we do with PublicInbox::Git.
We reuse the existing MAX_INFLIGHT limit (18) that was
determined by the minimum allowed PIPE_BUF (512). (AFAIK) Unix
stream sockets have no analogy to PIPE_BUF, but all *BSDs and
Linux I've checked have default SO_RCVBUF and SO_SNDBUF values
larger than the previously-required PIPE_BUF size of 512 bytes.
|
|
Using and memoizing the usability of `--rsyncable' is unsafe
since pigz (or GNU gzip) can be uninstalled and leave a user
with a non-rsync-aware gzip implementation in the long-running
daemon. So we stop passing --rsyncable by default to pigz/gzip
and no longer attempt to check for it (since it was a TOCTTOU
error, anyways).
Specifying --rsyncable explicitly didn't work, either, and
ended up passing `1' to the gzip/pigz argv :x
Finally, we now test --rsyncable on the CLI by adding support
for it in `lei convert' and testing it in t/lei-convert.t
|
|
This makes interesting parts of our code easier to read IMHO.
We can take advantage of `local' while avoiding `fileno' calls
since it's called in spawn() anyways to reduce LoC even further.
|
|
Our awaitpid API now exists and ProcessPipe uses it, so it's
immune to cyclic references. Thus there's no need to create
a duplicate of the lei object to prevent leaks.
|
|
Allow the exit code to be the first argument intead of the last
to match our ->child_error, as well as the BSD err(3) API.
We'll also avoid shifting user-passed exit codes so $? can be
passed as-is without losing signal information.
|
|
File::Path already accounts for the existence of directories,
handles races from redundant mkdir(2), and croaks on
unrecoverable errors. So there's no point in doing any
of that on our end.
Furthermore, avoiding the overhead of loading File::Path doesn't
seem worth it to save 20-60ms given the overhead of loading
our other code. Instead, try to reduce optree overhead on
our code, instead, since File::Path gets used in a bunch of
places.
We'll also favor the newer make_path for multi-directory
invocations to avoid bloating our own optree to create an
arrayref, but mkpath is one fewer subroutine call within
File::Path itself, right now.
|
|
This brings t/lei-index.t back down from ~8 to ~3s. I didn't
notice this before was because the LeiNoteEvent timer was firing
every 5s and clearing circular refs and parallel testing meant
the delay got hidden.
Fixes: 4a2a95bbc78f99c8 (ipc+lei: switch to awaitpid, 2023-01-17)
|
|
This avoids awkwardly stuffing an arrayref into callbacks
which expect multiple arguments. IPC->awaitpid_init now
allows pre-registering callbacks before spawning workers.
|
|
awaitpid is the new API which will eventually replace dwaitpid.
It enables early registration of callback handlers. Eventually
(once dwaitpid is gone) it'll be able to use fewer waitpid
calls.
The avoidance of waitpid(-1) in our earlier days was driven by
the belief that threads may eventually become relevant for Perl 5,
but that's extremely unlikely at this stage. I will still
introduce optional threads via C, but they definitely won't be
spawning/reaping processes.
Argument order to callbacks is swapped (PID first) to allow
flattened multiple arguments more natrually. The previous API
(allowing only a single argument, as influenced by
pthread_create(3)) was more tedious as it involved packing
multiple arguments into yet another array.
|
|
We need to rely on lei->fail to propagate errors in lei workers
to the script/lei client, otherwise tests and other scripts can
stumble forward with incomplete/incorrect/broken outputs.
This helps me focus on occasional t/lei-up.t failures I see on
CentOS 7.x where OverIdx->adj_counter fails on "lei up --all"...
|
|
Excessive IMAP connections can overload IMAP servers and cause
clients to be disconnected without diagnostic messages.
Use $lei->fail on these exceptions to propagate errors to the
CLI ASAP to avoid further errors down the line.
This ought to make problems more apparent for users using IMAP
destinations.
Reported-by: Ricardo Ribalda <ribalda@chromium.org>
Link: https://public-inbox.org/meta/CANiDSCsDfutAUMBLPZbxdyka+_jnhv+4YNYdL9QPRoC=wNUGCQ@mail.gmail.com/
|
|
This may help track down deduplication or other bugs in lei
which lead to occasionally missing messages.
Link: https://public-inbox.org/meta/CAL_JsqJH8xx_2NyZffNsRXbGXiv3kjmCETvKXt3Yfb0uToLm9Q@mail.gmail.com/
|
|
->DESTROY ordering via "exit()" calls is tricky, and dedupe
checks were causing problems.
AFAIK, this only affects users who manually enable WAL on
lei/store/ei*/over.sqlite3. Fortunately, there is no data
corruption as a result even though "read-only" WAL requires
write permissions.
|
|
"text" and "reply" outputs are intended for the pager, so
parallelizing them is a waste of resources.
v2 has shards, of course, so parallelizing writes to it
is also a waste since the deduplication work is a bit
more complex.
|
|
Mail synchronization in lei_to_mail only works for IMAP and
Maildir; so don't waste time preparing mbox* writers for it.
|
|
No need to go through the lei/store process when we write
mail_sync.sqlite3. This ought to reduce ENOBUFS errors (and the
sleep workaround) on RAM-starved systems.
|
|
One syscall is better than two for atomicity in Maildirs. This
means there's no window where another process can see both the
old and new file at the same time (link && unlink), nor a window
where we might inadvertantly clobber an existing file if we were
to do `stat && rename'.
|
|
We don't need to flood the terminal with "W: $oid is (!= blob)\n"
messages when somebody nukes a git cat-file process from under
us.
|
|
Simplify our APIs and force dwaitpid() to work in async mode for
all lei workers. This avoids having lingering zombies for
parallel searches if one worker finishes soon before another.
The old distinction between "old" and "new" workers was
needlessly complex, error-prone, and embarrasingly bad.
We also never handled v2:// writers properly before on
Ctrl-C/Ctrl-Z (SIGINT/SIGTSTP), so add them to @WQ_KEYS
to ensure they get handled by $lei when appropropriate.
|
|
By relying more on pgroups for remaining remaining processes,
this lets us pause all curl+tail subprocesses with a single
kill(2) to avoid cluttering stderr.
We won't bother pausing the pigz/gzip/bzip2/xz compressor
process not cat-file processes, though, since those don't write
to the terminal and they idle soon after the workers react to
SIGSTOP.
AutoReap is hoisted out from TestCommon.pm. CLONE_SKIP
is gone since we won't be using Perl threads any time
soon (they're discouraged by the maintainers of Perl).
|
|
Just in case it fails when there's many parallel invocations.
|
|
Since switching to SOCK_SEQUENTIAL, we no longer have to use
fixed-width records to guarantee atomic reads. Thus we can
maintain more human-readable/searchable PktOp opcodes.
Furthermore, we can infer the subroutine name in many cases
to avoid repeating ourselves by specifying a command-name
twice (e.g. $ops->{CMD} => [ \&CMD, $obj ]; can now simply be
written as: $ops->{CMD} => [ $obj ] if CMD is a method of
$obj.
|
|
I'm not sure what caused it, but $err was undef and caused print
to fail, leading to an event loop error. Guard the timer with
an eval and assume warn() can't trigger an event loop failure.
|
|
Overwriting existing destinations safe (but slow) by default,
so show a progress message noting what we're doing while
a user waits.
|
|
This has several advantages:
* no need to use ipc.lock to protect a pipe for non-atomic writes
* ability to pass FDs. In another commit, this will let us
simplify lei->sto_done_request and pass newly-created
sockets to lei/store directly.
disadvantages:
- an extra pipe is required for rare messages over several
hundred KB, this is probably a non-issue, though
The performance delta is unknown, but I expect shards
(which remain pipes) to be the primary bottleneck IPC-wise
for lei/store.
|
|
Since we can't use maxuid for remote externals, automatically
maintaining the last time we got results and appending a dt:
range to the query will prevent HTTP(S) responses from getting
too big.
We could be using "rt:", but no stable release of public-inbox
supports it, yet, so we'll use dt:, instead.
By default, there's a two day fudge factor to account for MTA
downtime and delays; which is hopefully enough. The fudge
factor may be changed per-invocation with the
--remote-fudge-factor=INTERVAL option
Since different externals can have different message transport
routes, "lastresult" entries are stored on a per-external basis.
|
|
Since 44917fdd24a8bec1 ("lei_mail_sync: do not use transactions"),
relying on lei/store to serialize access was a pointless endeavor.
Rely on flock(2) to serialize multiple writers since (in my
experience) it's the easiest way to deal with parallel writers
when using SQLite. This allows us to simplify existing callers
while speeding up 'lei refresh-mail-sync --all=local' by 5% or
so.
|
|
When composing replies in "git format-patch" cover letters,
I'd been relying on "lei q -f text ...", but that still requires
several steps to make it suitable for composing a reply:
* s/^/> / to quote the body
* drop existing In-Reply-To+References
* s/^Message-ID:/In-Reply-To:/;
* add an attribute line
...
"lei q -f reply" takes care of most of that and users will
only have to trim "From " lines, unnecessary results and
over-quoted text (and trimming is likely less error-prone
than doing all the steps above manually).
This should also be a good replacement for
"git format-patch --in-reply-to=...", since copying long
Message-IDs can be error-prone (and this lets you include
quoted text in replies).
|