Date | Commit message (Collapse) |
|
While inspect is intended for debugging, the Unix epoch in
seconds requires extra steps for human consumption; just
steal what we used for "lei q -f json" output.
|
|
This will make our code more flexible in case it gets used in
non-lei things.
|
|
As documented, File::Spec->canonpath does not canonicalize
"/../". While we want to do our best to preserve symlinks in
pathnames, leaving "/../" can mislead our inotify|kqueue usage.
|
|
This allows "lei-managed pseudo mailing lists" as described
by Konstantin.
Alternates use is optional and can be enables via --shared.
This doesn't manage or edit ~/.public-inbox/config; presumably
there'll need to be some tweaking of search parameters before
finalizing and making the inbox publicly accessible via HTTP/NNTP.
Link: https://public-inbox.org/meta/20210426164454.5zd5kgugfhfwfkpo@nitro.local/T/
|
|
lei already uses PktOp and SOCK_SEQPACKET throughout; whereas
EOFpipe had one single use in lei. Since PktOp is a strict
superset of EOFpipe functionality, we may be able to get rid of
EOFpipe entirely.
However, lei is considered a portability canary and I'm not sure
if the stable public-inbox-* code can drop EOFpipe just yet.
|
|
IMAP authentication info is only shared amongst lei2mail workers,
so we must ensure all IMAP writes go through lei2mail workers
even if we don't have to access the mail through git.
This allows us to decouple the latency of the remote mboxrd from
the latency of the IMAP --output at the expense of extra IPC
overhead within our own processes.
|
|
No point in munging user-supplied $lei->{opt} when %mset_opt
exists. We'll be depending on docid being in descending order
for saved search support.
|
|
File::Temp only requires four 'X' characters (unlike mkstemp(3),
which requires six). So only so only give it 4 to avoid an
80-column violation and maybe save metadata space on FSes.
|
|
Since lei-daemon won't have the same FDs as the client, we
need to special-case thse mappings and won't be able to open
arbitrary, non-standard FDs.
We also won't attempt to support /proc/self/fd/[0-2] since
that's a Linux-ism. /dev/fd/[0-2] and /dev/std{in,out,err}
are portable to FreeBSD, at least. mawk(1) also supports
/dev/std{out,err}, as does gawk(1) (which supports everything
we can support, and arbitrary /dev/fd/$FD).
|
|
JSON outputs won't write to lei/store at all, so there's
no point in forking the store worker if it's not already
running.
LeiSearch object ($lse) is also fork-safe until it opens a
persistent FD for Xapian/SQLite so we can unconditionally
carry it across fork.
|
|
"lei q" now displays labels in JSON output, "lei mark"
can add or remove labels for any messages.
"lei ls-label" is supported, too.
Unfortunately, "lei q" won't hande "kw:" or "L:" for
external messages, they must be imported, first.
|
|
Stop showing `docid' since it's not useful with shards.
`bytes' and `lines' are probably noise, but maybe could be
visible in some "fuller" view.
v2: t/lei_xsearch: fix warnings from {docid} removal
|
|
Don't waste precious terminal space when there are only a small
number of possible keywords supported/reserved for JMAP. In the
future, we may implement more sophisticated wrapping for labels,
but it we'll cross tha bridge when we come to it.
|
|
"lei q" now preserves changes per-message keywords across
invocations when it's --output (Maildir or mbox) is reused
(with or without --augment).
In the future, these changes will be monitored via inotify,
EVFILT_VNODE or IMAP IDLE, too.
Unfortunately, this currently prevents "lei import" from ever
importing a message that's in an external. That will be fixed
in a future change.
|
|
This will be used for keyword (and label) storage for externals.
We'll be using this to ensure we don't redundantly auto-import
messages into lei/store if they're already in a local external
(they can still be imported explicitly via "lei import").
|
|
Nothing like the -Wunused C compiler flag in perl, AFAIK...
|
|
They're unnecessary visual noise, and angle brackets don't
always work as intended when going through Xapian's query
parser.
Since we already use "m:" and "refs:" instead of the actual
header names, it should be obvious we're at liberty to
abbreviate such things
Link: https://public-inbox.org/meta/20210304184348.GA19350@dcvr/
|
|
Augment (and dedupe) aren't parallel, yet, so its more sensitive to
high-latency networks.
|
|
This will make testing IMAP support for other commands easier, as
it doesn't write to lei/store at all. Like the pager and MUA,
"git credential" is always spawned by script/lei (and not
lei-daemon) so it has a controlling terminal for password
prompts.
v2: fix missing requires, correct test ordering
v3: ensure config exists for IMAP auth
|
|
Using dashed keywords confuses the option parser without
"=" signs (and bash completion doesn't yet work with "=").
So use ":" instead of "-" as the prefix for internal ops,
since ":" is just as unlikely to be the first character of
an executable file in a user's $PATH.
|
|
For --mua users writing to lock-free -o MFOLDER destinations;
we'll keep -WINCH and send an ASCII terminal bell when results
are complete. This is intended to let early MUA spawners know
when lei2mail is done writing results.
We'll also support running arbitrary commands. It may be used
to run play(1) (from SoX), handle pipelines+redirects
(e.g. "/bin/sh -c 'echo search done | wall'") or other commands.
|
|
For early MUA spawners using lock-free outputs, we we need to
on the startq pipe to silence progress reporting. For
--augment users, we can start the MUA even earlier by
creating Maildirs in the pre-augment phase.
To improve progress reporting for non-MUA (or late-MUA)
spawners, we'll no longer blindly append "--compressed" to the
curl(1) command when POST-ing for the gzipped mboxrd.
Furthermore, we'll overload stringify ('""') in LeiCurl to
ensure the empty -d '' string shows up properly.
v2: fix startq waiting with --threads
mset_progress is never shown with early MUA spawning,
The plan is to still show progress when augmenting and
deduping. This fixes all local search cases.
A leftover debug bit is dropped, too
|
|
We will have a ->wq_do that doesn't pass FDs for I/O.
|
|
This was actually causing xt/lei-sigpipe.t failures,
presumably due to reused/recycled workers with many
externals.
|
|
Another step towards simplifying lei internals.
None of our current uses of ->wq_do involve FD passing, and the
plan is only rely on FD passing between lei-daemon and lei(1).
Internally, it ought to be possible for lei-daemon internal bits
to be ordered properly to not need FD passing.
|
|
No need to be starting a pager if we're writing to a regular file.
|
|
While FD passing is critical for script/lei <=> lei-daemon,
lei-daemon doesn't need to use it internally if FDs are
created in the proper order before forking.
|
|
This will be useful on shared machines when a user doesn't want
search queries visible to other users looking at the ps(1)
output or similar.
|
|
IO::Uncompress::Gunzip seems to be losing $? when closing
PublicInbox::ProcessPipe. To workaround this, do a synchronous
waitpid ourselves to force proper $? reporting update tests to
use the new --only feature for testing invalid URLs.
This improves internal code consistency by having {pkt_op}
parse the same ASCII-only protocol script/lei understands.
We no longer pass {sock} to worker processes at all,
further reducing FD pressure on per-user limits.
|
|
We don't need to be sending errors directly to the client, but
instead go through lei-daemon or the top-level one-shot process.
|
|
lei2mail doesn't need stdin anymore, so we can use the [0] slot
for the $not_done keepalive purposes.
|
|
We won't be reporting progress when output is going to stdout
since it can clutter up the terminal unless stderr != stdout,
which probably isn't worth checking.
We'll also use a more agnostic mset_progress which may
make it easier to support worker-less invocations.
|
|
We may reuse these objects in the non-worker code paths.
|
|
Avoid on-stack shortcuts which may prevent destructors from
firing since we're not inside the event loop. We'll also tidy
up the unlink mechanism in LeiOverview while we're at it.
|
|
It doesn't save us any code, and the action-at-a-distance
element was making it confusing to track down actual problems.
Another potential problem was keeping references alive too long.
So do like we would a C100K server and check every write
while still ensuring lei(1) exit with a proper SIGPIPE
iff needed.
|
|
This fixes "--dedupe none" with Maildir where we don't
create the object at all.
|
|
Keeping track of non-standard FDs gets tricky, so make it easier
by relying on st_dev/st_ino mapping in the transmitted objects.
We'll keep using numbers for the standard FDs since we need to
be able to easily redirect them in the producer (main daemon)
process for (gzip|bzip2|xz) if writing to a compressed mbox.
|
|
Declaring "my $buf" inside that `if' branch was useless, so
we've been declaring it per-callback when {l2m} isn't in use.
|
|
The default deduplication command-line arguments would be
non-sensical for such an option and probably confusing. It
doesn't seem worth the code to support OID-only output when it's
easy enough to use one of the JSON formats to extract the same
info.
We also don't have OIDs if using remotes, and the
to-be-implemented memoization will be optional.
|
|
This was accidentally clobbered completely in
("lei q: fix JSON overview with remote externals").
There are now more tests to prevent future regressions.
|
|
We can't (and don't need to) repeatedly get the $each_smsg
callback for each URI since that clobbers {ovv_buf} before
it can be output.
I initially thought this was a dedupe-related bug and
moved the dedupe code into the $each_smsg callback to
minimize differences. Nevertheless it's a nice code
reduction.
I also thought it was related to incomplete smsg info,
so {references} is now filled in correctly for dedupe.
|
|
Via curl(1), since that lets us easily use tor on a
per-connection basis via LD_PRELOAD (torsocks) or proxy.
We'll eventually support more curl options which can allow
users to get past firewalls and deal with other odd network
configurations.
|
|
From_ lines are shown when mbox* variants are output to stdout,
making {oid} and {pct} information visible without risking being
propagated to other importer processes if they were in
lei-specific X-* headers.
Maildirs already had OIDs in the filename, now they gain Xapian
{pct} in case anybody cares.
|
|
This isn't tested for now, so maybe it works.
|
|
The old name was too long compared to the rest of the field
names. With the Xapian method being named ->get_percent,
"pct" is a well known abbreviation for "percent" and already
used internally by our wrapper.
..And cleanup some excess whitespace while we're in the area.
|
|
We may attempt to write an mbox to any terminal, block, or
character device, not just regular files and FIFOs/pipes.
The only thing that is known to not work is a directory.
Sockets may be possible with some OSes (e.g. Plan 9) or
filesystems. This fixes t/lei.t on FreeBSD 11.x
|
|
We'll need it for IMAP support, at least. Proper mbox family
detection will be expensive, so deal with it later.
|
|
Because user errors happen...
|
|
We'll invalidate the {1} (stdout) field on SIGPIPE,
so don't trigger a Perl warning by writing to it.
|
|
With 4 dedicated workers, this seems to provide a 100-120%
speedup on a 4 core machine when writing thousands of search
results to a Maildir or mbox. This also sets us up for
high-latency IMAP destinations in the future.
This opens the door to more speedup opportunities such
as optimizing dedupe locking and other ways to reduce
contention.
This change is fairly complex and convoluted, unfortunately.
Further work may allow us to simplify it and even improve
performance.
|