Date | Commit message |
|
Schedule a timer to stop shard workers and the git-cat-file
process after a `barrier' command. This allows us to save some
memory again when the lei-daemon is idle but preserves the fork
overhead reduction when issuing many commands in parallel or in
quick succession.
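
A minimal sketch of the idle-timer idea, in Python rather than the Perl
used by lei. The class name, the 5 second delay, and the injected clock
are all hypothetical and only illustrate how each barrier re-arms the
timer so that a burst of commands keeps the workers alive:

```python
class IdleStopper:
    """Stop workers once no barrier has been seen for `delay` seconds."""

    def __init__(self, stop_workers, delay=5.0):
        self.stop_workers = stop_workers  # callback that stops the workers
        self.delay = delay                # hypothetical idle timeout
        self.deadline = None              # None means no timer is armed

    def on_barrier(self, now):
        # Every barrier (re)arms the timer, pushing the deadline back.
        self.deadline = now + self.delay

    def tick(self, now):
        # Called from the event loop; fires at most once per arming.
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.stop_workers()
            return True
        return False
```

Passing the current time in explicitly keeps the sketch deterministic; a
real event loop would supply a monotonic clock.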
|
|
barrier (synchronous checkpoint) is better than ->done with
parallel lei commands being issued (via '&' or different
terminals), since repeatedly stopping and restarting processes
doesn't play nicely with expensive tasks like `lei reindex'.
This introduces a slight regression in maintaining more
processes (and thus resource use) when lei is idle, but that'll
be fixed in the next commit.
|
|
Since data going to git is the most important, always ensure
data is written to git before attempting to write anything to
SQLite or Xapian.
|
|
getpid() isn't cached by glibc nowadays and system calls are
more expensive due to CPU vulnerability mitigations. To
ensure we switch to the new semantics properly, introduce
a new `on_destroy' function to simplify callers.
Furthermore, OnDestroy correctness is often tied to the
process which creates it, so make the new API default to
being guarded against running in subprocesses.
For cases which require running in all children, a new
PublicInbox::OnDestroy::all call is provided.
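
The guard described above can be sketched as follows. This is a Python
illustration under stated assumptions, not the Perl implementation: the
class, `on_destroy`, and `on_destroy_all` here are hypothetical stand-ins,
and the callback timing relies on CPython reference counting.

```python
import os


class OnDestroy:
    """Run a callback when the object is destroyed."""

    def __init__(self, cb, *args, all_processes=False):
        self.cb, self.args = cb, args
        # Remember the creating process unless every process should fire.
        self.pid = None if all_processes else os.getpid()

    def cancel(self):
        self.cb = None

    def __del__(self):
        if self.cb is None:
            return
        # Default: stay silent in forked children of the creator.
        if self.pid is not None and self.pid != os.getpid():
            return
        self.cb(*self.args)


def on_destroy(cb, *args):
    # Guarded: fires only in the process which created the object.
    return OnDestroy(cb, *args)


def on_destroy_all(cb, *args):
    # Unguarded: fires in every process holding a copy.
    return OnDestroy(cb, *args, all_processes=True)
```

The PID is read once at creation and once at destruction, so callers never
need to pass or compare it themselves.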
|
|
Most xap_terms callers do not benefit from the hashref
return value, and we can delay hashmap use until
List::Util::uniqstr if needed.
|
|
Delayed commits allow users to trade off immediate safety for
throughput and reduced storage wear when running multiple
discrete commands.
This feature is currently useful for providing a way to make
t/lei-store-fail.t reliable and for ensuring `lei blob' can
retrieve messages which have not yet been committed.
In the future, it'll also be useful for the FUSE layer to batch
git activity.
|
|
This can probably be made asynchronous in the future via
PublicInbox::InputPipe, but it's good enough for testing.
|
|
It's safer against deadlocks and we still get proper error
reporting by passing stderr across in addition to the lei
socket.
|
|
lei/store can get stuck if lei-daemon is blocked, and lei-daemon
can get stuck when a client's stderr is redirected to a pager
that isn't consumed.
So start relying on Time::HiRes::alarm to generate SIGALRM to
break out of the `print' perlop. Unfortunately, this isn't easy
since Perl auto-restarts all writes, so we dup(2) the
destination FD and close the copy in the SIGALRM handler to
force `print' to return.
Most programs (MUAs, editors, etc.) aren't equipped to deal with
non-blocking STDERR, so we can't make the stderr file description
non-blocking.
Another way to solve this problem would be to have script/lei
send a non-blocking pipe to lei-daemon in the {2} slot and
make script/lei splice messages from the pipe to stderr.
Unfortunately, that requires more work, forces more
complexity into script/lei, and slows down normal cases where
stderr doesn't get blocked.
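
The dup(2)-and-close trick can be sketched in Python, which also retries
interrupted writes automatically (PEP 475) and so has the same problem as
Perl here. The function name and timeout handling are hypothetical; this
is a Unix-only illustration and must run in the main thread:

```python
import errno
import os
import signal


def write_with_timeout(fd, data, timeout):
    """Write to a blocking FD, giving up after `timeout` seconds.

    Returns the byte count, or None if the alarm fired first.
    """
    copy = os.dup(fd)  # write through a copy so fd itself stays open
    state = {"closed": False}

    def on_alarm(signum, frame):
        # Closing the copy makes the automatic retry fail with EBADF.
        if not state["closed"]:
            os.close(copy)
            state["closed"] = True

    old = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        return os.write(copy, data)
    except OSError as e:
        if e.errno == errno.EBADF and state["closed"]:
            return None  # timed out; the original fd is untouched
        raise
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel a pending alarm
        signal.signal(signal.SIGALRM, old)
        if not state["closed"]:
            os.close(copy)
```

Because only the duplicate is closed, the destination stays blocking and
usable for later writes once the reader catches up.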
|
|
While we're at it, ensure we clear the Perl internal EOF
marker before attempting to read the appended-to file
handle since newer Perl may leave the internal EOF marker set.
|
|
When import hits blobs it's already seen, we'll add labels
regardless in order to match the behavior of other inexact
matches. This is useful when importing exact copies of
messages which exist in multiple mailboxes.
I noticed this when I had a message imported from my normal IMAP
`INBOX', but also copied it to a different folder for future
reference.
|
|
While ->wq_workers_start is idempotent, the pipe creation for
PublicInbox::LeiStoreErr was not and required several extra
syscalls and FD allocations. Check the correct field required
for SOCK_SEQPACKET workers rather than pipe-based workers.
Fixes: cbc2890cb89b81cb ("lei/store: use SOCK_SEQPACKET rather than pipe")
|
|
This brings t/lei-index.t back down from ~8 to ~3s. I didn't
notice this before because the LeiNoteEvent timer was firing
every 5s and clearing circular refs, and parallel testing meant
the delay got hidden.
Fixes: 4a2a95bbc78f99c8 (ipc+lei: switch to awaitpid, 2023-01-17)
|
|
This avoids awkwardly stuffing an arrayref into callbacks
which expect multiple arguments. IPC->awaitpid_init now
allows pre-registering callbacks before spawning workers.
|
|
I may be the only lei user who has redundantly-indexed messages
needing this, though...
|
|
We need to call eidx_init in each git->cat_async callback
since another requestor may've stopped the shard processes.
|
|
|
|
There's no need to initialize eidx if we already have an open
handle for mail_sync.sqlite3
|
|
No need to go through the lei/store process when we write
mail_sync.sqlite3. This ought to reduce ENOBUFS errors (and the
sleep workaround) on RAM-starved systems.
|
|
One syscall is better than two for atomicity in Maildirs. This
means there's no window where another process can see both the
old and new file at the same time (link && unlink), nor a window
where we might inadvertently clobber an existing file if we were
to do `stat && rename'.
|
|
The lei/store process should only exit from EOF on the
socket, so make sure we note any unintended signals
|
|
Simplify our APIs and force dwaitpid() to work in async mode for
all lei workers. This avoids having lingering zombies for
parallel searches if one worker finishes soon before another.
The old distinction between "old" and "new" workers was
needlessly complex, error-prone, and embarrassingly bad.
We also never handled v2:// writers properly before on
Ctrl-C/Ctrl-Z (SIGINT/SIGTSTP), so add them to @WQ_KEYS
to ensure they get handled by $lei when appropriate.
|
|
When importing several sources in parallel via http(s) mboxrd,
we need to be able to get keywords of uncommitted documents
directly from shard workers. Otherwise, Xapian DocNotFound
errors happen because the read-only LeiSearch won't see
documents from uncommitted transactions. Keep in mind that it's
possible the keywords can be changed on-the-fly even for
uncommitted documents because of inotify watches from LeiNoteEvent.
|
|
|
|
This is slightly more meaningful since the file is already
in ~/.local/share/lei/store, so "lei_store" was redundant
(and the "XXXX" are random characters replaced by File::Temp)
|
|
Some code paths may use maximum size checks, so ensure
any checks are waited on, too.
|
|
In retrospect, I don't think it's needed; and trying to wire up
a user interface for lei to manage process counts doesn't seem
worthwhile. It could be resurrected for public-facing daemon
use in the future, but that's what version control systems are for.
This also lets us automatically avoid setting up broadcast
sockets
Followup-to: 7b7939d47b336fb7 ("lei: lock worker counts")
|
|
With the switch from pipes to sockets for lei-daemon =>
lei/store IPC, we can send the script/lei client socket to the
lei/store process and rely on reference counting in both Perl
and the kernel to persist the script/lei.
|
|
This has several advantages:
* no need to use ipc.lock to protect a pipe for non-atomic writes
* ability to pass FDs. In another commit, this will let us
  simplify lei->sto_done_request and pass newly-created
  sockets to lei/store directly.
disadvantages:
- an extra pipe is required for rare messages over several
  hundred KB; this is probably a non-issue, though
The performance delta is unknown, but I expect shards
(which remain pipes) to be the primary bottleneck IPC-wise
for lei/store.
|
|
Since 44917fdd24a8bec1 ("lei_mail_sync: do not use transactions"),
relying on lei/store to serialize access was a pointless endeavor.
Rely on flock(2) to serialize multiple writers since (in my
experience) it's the easiest way to deal with parallel writers
when using SQLite. This allows us to simplify existing callers
while speeding up 'lei refresh-mail-sync --all=local' by 5% or
so.
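
A rough Python sketch of serializing SQLite writers with flock(2). The
`.flock` lock-file suffix, the helper name, and the table are assumptions
for illustration and do not describe the actual lei file layout:

```python
import fcntl
import os
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_writer(db_path):
    """Hold an exclusive flock for the lifetime of one writer."""
    with open(db_path + ".flock", "a") as lock_fh:  # hypothetical lock file
        fcntl.flock(lock_fh, fcntl.LOCK_EX)  # blocks while others write
        dbh = sqlite3.connect(db_path)
        try:
            yield dbh
            dbh.commit()
        finally:
            dbh.close()
            fcntl.flock(lock_fh, fcntl.LOCK_UN)


# Usage: two writers in a row; parallel processes would queue on the lock.
db = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")
with locked_writer(db) as dbh:
    dbh.execute("CREATE TABLE IF NOT EXISTS t (v)")
    dbh.execute("INSERT INTO t VALUES (1)")
with locked_writer(db) as dbh:
    dbh.execute("INSERT INTO t VALUES (2)")
    row_count = dbh.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Taking the lock before opening the database means writers wait in the
kernel instead of retrying on SQLite busy errors.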
|
|
Merely pruning mail synchronization information was
insufficient for Maildir: renames are common in Maildir
and we need to detect them after-the-fact when lei-daemon
isn't running.
Running this command could make "lei index" far more
useful...
v2: close R/O mail_sync.sqlite3 dbh before fork
Keeping the DB file handle open across fork can cause bad things
to happen even if we don't use it since sqlite3 itself still knows
about it (but doesn't know Perl code doesn't know about it).
|
|
We'll be using binary SHA-1 and SHA-256 in-memory since that's
what mail_sync.sqlite3 stores.
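
For illustration, the size difference between binary and hexadecimal OIDs
can be seen with a small Python sketch. The helper name is hypothetical;
the hashing follows git's standard `blob <size>\0` object header:

```python
import hashlib


def git_blob_oid(data, algo="sha1"):
    """Return the binary git blob OID for `data`."""
    h = hashlib.new(algo)
    h.update(b"blob %d\0" % len(data))  # git object header
    h.update(data)
    return h.digest()


sha1_oid = git_blob_oid(b"hello\n")              # 20 bytes in memory
sha256_oid = git_blob_oid(b"hello\n", "sha256")  # 32 bytes in memory
hex_oid = sha1_oid.hex()                         # 40 characters as text
```

The binary form is half the size of the hex form, and `bytes.fromhex`
converts back when a textual OID arrives from elsewhere.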
|
|
IMHO, this greatly improves code sharing and organization
between v2, extindex, and lei/store. Common git-related
logic for these is lightly-refactored and easier to reason
about.
The impetus for this big change was to ensure inboxes
created+managed by public-inbox-{clone,fetch} could have
alternates and configs setup properly without depending on
SQLite (via V2Writable). This change does that while
making old code shorter and better factored.
|
|
ENOENT can be too common due to timing and concurrent access
from MUAs and "lei export-kw", and other mail synchronization
tools (e.g. mbsync and offlineimap).
|
|
This works with existing inotify/EVFILT_VNODE functionality to
propagate changes made from one Maildir to another Maildir.
I chose the lei/store worker process to handle this since
propagating changes back into lei-daemon on a massive scale
could lead to dead-locking while both processes are attempting
to write to each other. Eliminating IPC overhead is a nice
side effect, but could hurt performance if Maildirs are slow.
The code for "lei export-kw" is significantly revamped to match
the new code used in the "lei/store" daemon. It should be more
correct w.r.t. corner-cases and stale entries, but perhaps
better tests need to be written.
squashed:
t/lei-auto-watch: increase delay for FreeBSD kevent
My FreeBSD VM seems to need longer for this test than inotify
under Linux, likely because the kevent support code needs to be
more complicated.
|
|
This will be needed as we track changes in real-time, especially
for "lei index" since there's no storage involved.
|
|
For lei-index to work in parallel with MUA access and upcoming
inotify-based updates, mail_sync.sqlite3 needs to always be
up-to-date to read-only worker processes (ahead of everything
else). So rely on the default auto-commit behavior and hope
SQLite WAL can reduce some of the overheads involved with
writes.
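
The visibility property relied on here can be sketched with Python's
sqlite3 module. The table name and file path are made up for the example;
`isolation_level=None` is how this module selects auto-commit:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")

# Writer in auto-commit mode: each statement is its own transaction.
writer = sqlite3.connect(path, isolation_level=None)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE seen (oid BLOB PRIMARY KEY)")

# A separate read-only style connection, as a worker process would hold.
reader = sqlite3.connect(path)

writer.execute("INSERT INTO seen VALUES (?)", (b"\x01" * 20,))

# No explicit commit was issued, yet the reader already sees the row.
visible = reader.execute("SELECT COUNT(*) FROM seen").fetchone()[0]
```

With WAL, readers do not block the writer, which is where the hoped-for
reduction in write overhead would come from.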
|
|
Some of these errors were inadvertently lost due to delayed
error reporting in the past.
|
|
We must set autoflush to ensure timely notification of clients;
and lei-daemon must not block when waiting on reads in case of
spurious wakeups.
Finally, if no clients are connected to lei-daemon, write to
syslog to ensure the error is visible.
|
|
Another step towards moving more of our internals to use binary
OIDs to avoid needless conversions before hitting disk.
|
|
This allows client sockets to wait for "done" commits to
lei/store while the daemon reacts asynchronously. The goal
of this change is to keep the script/lei client alive until
lei/store commits changes to the filesystem, but without
blocking the lei-daemon event loop. It depends on Perl
refcounting to close the socket.
This change also highlighted our over-use of "done" requests to
lei/store processes, which is now corrected so we only issue it
on collective socket EOF rather than upon reaping every single
worker.
This also fixes "lei forget-mail-sync" when it is the initial
command.
This took several iterations and much debugging to arrive at the
current implementation:
1. The initial iteration of this change utilized socket passing
   from lei-daemon to lei/store, which necessitated switching
   from faster pipes to slower Unix sockets.
2. The second iteration switched to registering notification sockets
   independently of "done" requests, but that could lead to early
   wakeups when "done" was requested by other workers. This
   appeared to work most of the time, but suffered races under
   high load which were difficult to track down.
Finally, this iteration passes the stringified socket GLOB ref
to lei/store which is echoed back to lei-daemon upon completion
of that particular "done" request.
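
The final approach can be sketched roughly as below. This is a Python
illustration with hypothetical names; it uses `id()` where the Perl code
stringifies a GLOB ref, and it closes the socket explicitly where Perl
relies on refcounting:

```python
class DoneTracker:
    """Daemon-side bookkeeping for outstanding "done" requests."""

    def __init__(self):
        self.pending = {}  # token string -> client socket kept alive

    def request_done(self, sock, send):
        # A unique string per live socket stands in for the GLOB ref.
        token = "sock-%x" % id(sock)
        self.pending[token] = sock
        send(token)  # travels to the store process with the request
        return token

    def on_echo(self, token):
        # The store echoes the token once that exact request is committed.
        sock = self.pending.pop(token, None)
        if sock is not None:
            sock.close()  # lets the waiting client exit
        return sock is not None
```

Keying on the token ties each wakeup to its own request, which avoids the
early wakeups seen in the second iteration.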
|
|
This allows MUA-made flag changes to Maildirs to be instantly
read and acknowledged for future search results.
In the future, it may be used to speed up --augment and
--import-before (the default) with "lei q".
|
|
As implied in commit 6ff03ba2be9247f1
("lei export-kw: do not write directly to mail_sync.sqlite3"),
modifying mail_sync.sqlite3 directly can lead to conflicts
and making everything go through lei/store is easier.
|
|
PublicInbox::Import never imports @UNWANTED_HEADERS, so ensure
our mock blob OIDs do the same. This ought to prevent
duplicates if the PSGI mboxrd download starts setting
"X-Status: F" like "lei q -tt .."
|
|
This ought to avoid /Document \d+ not found/ errors from Xapian
when seeing a message for the first time by not attempting to
read keywords for totally unseen messages.
|
|
Simplify oid2docid and filter out undefined docids in ->add_eml,
instead. This avoids SQLite "datatype mismatch" errors in
OverIdx->add_over
Fixes: d1052f03ea85d4af ("lei/store: cull redundant docids based on blob OID")
|
|
I'm not sure how this happened (only once for me in March), but
it should not happen... In any case, we'll operate on the
lowest numbered docid and cull redundant index entries when
lei/store is open for read-write.
This also fixes the normal lei/store removal path to clean up
the xref3 table (since it's not done automatically for
public-facing -eidx due to the multi-list nature of it).
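
As a rough illustration of the selection rule only (not of the Xapian or
SQLite cleanup itself), a hypothetical Python helper might pick the docid
to keep like this, treating `None` as an undefined docid:

```python
def cull_redundant(docids):
    """Split docids for one blob OID into (keep, redundant)."""
    live = sorted(d for d in docids if d is not None)  # drop undefined
    if not live:
        return None, []
    return live[0], live[1:]  # keep the lowest, cull the rest


keep, redundant = cull_redundant([7, None, 3, 9])
```

Always keeping the lowest docid makes the choice stable across runs.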
|
|
Since users can't set IMAP flags in read-only IMAP folders,
we won't clobber local flags when importing from IMAP. This
also enables the local_blob fallback used for lei-index to
be used for index deduplication.
|
|
Sharing lms->{dbh} with eidx shards appears to be the cause of
the "Issuing rollback() due to DESTROY without explicit
disconnect() of DBD::SQLite::db handle" messages I've been
seeing from "lei up".
|
|
We mainly rely on ->done with lei/store, but moving to
->checkpoint probably makes sense. Note: over, msgmap, and
mail_sync all have slightly different transaction behavior;
perhaps they can be unified in the future.
|