Date | Commit message |
|
Partial (v2) clones should be a useful addition for users wanting
to conserve storage while having fast access to recent messages.
Continuing work started in 876e74283ff3 (fetch: ignore
non-writable epoch dirs, 2021-09-17), this creates bare,
read-only epoch git repos. These git repos have the remotes
pre-configured, but do not fetch any objects.
The goal is to allow users to set the writable bit on a
previously-skipped epoch and start fetching it.
Shell completion support may not be necessary given how short
the epoch ranges are, here.
Cc: Luis Chamberlain <mcgrof@kernel.org>
Link: https://public-inbox.org/meta/20210917002204.GA13112@dcvr/T/#u
|
|
It's probably least confusing for user-facing messages to
display times in the user's configured timezone. I considered
appending "UTC" to the message and sticking with gmtime(), too,
but this output isn't intended to be web-cache friendly, nor do
I expect users from across multiple timezones to view the same
output.
|
|
It helps to be consistent and reduce the learning curve, here.
|
|
It's possible for the rename() sequence to cause read-only
daemons using ->xdb_shards_flat to load an incomplete set of
contiguous shards and get invalid docids for search results.
With this change, we favor the case where search is momentarily
unavailable rather than giving wrong results during the small
window where Xapcmd->commit_changes runs.
|
|
"Correct" meaning the permissions match that of the parent
xap15 or ei15 directory.
|
|
public-inbox-init sets umask for git <2.1.0, so our fork+exec
replacement needs to restore the original umask of the "parent".
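A minimal sketch of the save-and-restore pattern this relies on, written in
Python rather than the project's Perl. The helper names run_with_umask and
current_umask are hypothetical and are not taken from the public-inbox code:

```python
import os


def current_umask():
    """Read the process umask without leaving it changed."""
    mask = os.umask(0)  # os.umask() sets a new mask and returns the old one
    os.umask(mask)      # put the old mask straight back
    return mask


def run_with_umask(mask, fn):
    """Call fn() under a temporary umask, then restore the caller's umask."""
    old = os.umask(mask)
    try:
        return fn()
    finally:
        # Restore even if fn() raises, so later children of this
        # process inherit the original "parent" umask.
        os.umask(old)
```

Anything spawned after run_with_umask() returns inherits the original mask
again, which is the property the fork+exec replacement needs.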
|
|
Neither Inboxes nor ExtSearch objects were retrying correctly
when there are live git processes, but the inboxes were getting
rescanned for search or other reasons. Ensure the scan retries
eventually if there are live processes.
We also need to update the cleanup task to detect Xapian shard
count changes, since Xapian ->reopen is enough to detect any
other Xapian changes. Otherwise, we just issue an inexpensive
->reopen call and let Xapian check whether there's anything
worth reopening.
This also lets us eliminate the Devel::Peek dependency.
|
|
Check for unlinked mmap-ed files via /proc/$PID/maps every 60s
or so.
ExtSearch (extindex) is compatible-enough with Inbox objects to
be wired into the old per-inbox code, but the startup cost is
projected to be much higher down the line when there's >30K
inboxes, so we scan /proc/$PID/maps for deleted files before
unlinking. With old Inbox objects, it was (and is) simpler to
just kill processes w/o checking due to the low startup cost
(and non-portability of checking).
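As a rough illustration of the /proc/$PID/maps check described above, here is
a Python sketch (not the Perl used by the project). It assumes the Linux
procfs layout, where the kernel appends " (deleted)" to the pathname of an
unlinked mapping; both function names are hypothetical:

```python
def deleted_mappings(maps_text):
    """Return pathnames of mappings whose backing file was unlinked."""
    suffix = " (deleted)"
    found = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) == 6 and parts[5].endswith(suffix):
            found.append(parts[5][: -len(suffix)])
    return found


def pid_has_deleted_mappings(pid):
    """True if the process still maps at least one unlinked file."""
    try:
        with open(f"/proc/{pid}/maps") as fh:
            return bool(deleted_mappings(fh.read()))
    except OSError:
        # No procfs on this platform, or the process has already exited.
        return False
```

A periodic timer (every 60s or so, per the message above) could call
pid_has_deleted_mappings() and only restart the process when it returns true.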
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210921144754.gulkneuulzo27qbw@meerkat.local/
|
|
Redundant code is noise and therefore confusing :<
|
|
We shouldn't dispatch all outputs right away since they
can be expensive CPU-wise. Instead, rely on DESTROY to
trigger further redispatches.
This also fixes a circular reference bug for the single-output
case that could lead to a leftover script/lei after MUA exit.
I'm not sure how --jobs/-j should work when the actual xsearch
and lei2mail have their own parallelism ("--jobs=$X,$M"), but
it's better than having thousands of subtasks running.
Fixes: b34a267efff7b831 ("lei up: fix --mua with single output")
|
|
A few dozen bytes saved here can add up when we have thousands
of inboxes. It also makes Data::Dumper debug output a bit cleaner.
|
|
The bit about reap_compress is no longer true since
LeiXSearch->query_done triggers it, instead. I only noticed
this while working on "lei up".
|
|
It looks dumb, but I'm not about to take a runtime penalty to
use signalfd|EVFILT_SIGNAL, here, either.
|
|
This fixes the occasional t/lei-sigpipe.t infinite loop
under "make check-run".
Link: http://nntp.perl.org/group/perl.perl5.porters/258784
<CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com>
Followup-to: b552bb9150775fe4 ("daemon+watch: fix localization of %SIG for non-signalfd users")
|
|
It's needless noise and misleads users reading "ps" into
thinking there's more workers when there's only one.
|
|
There's a chance some sensitive information (e.g. folder names)
can end up in errors.log, though $XDG_RUNTIME_DIR or
/tmp/lei-$UID/ will have 0700 permissions, anyways.
|
|
Sometimes it's useful to pause an expensive query or
refresh-mail-sync to do something else. While lei-daemon and
lei/store can't be paused since they're shared across clients,
per-invocation WQ workers can be paused safely using the
unblockable SIGSTOP.
While we're at it, drop the ETOOMANYREFS hint since it
hasn't been a problem since we drastically reduced FD passing
early in development.
|
|
We could redirect, too, but just use -q since we don't care
for the output with run_mode => 0.
|
|
Avoid slurping gigantic (e.g. 100000) result sets into a single
response if a giant limit is specified, and instead use 10000
as a window for the mset with a given offset. We'll also warn
and hint towards the --limit= switch when the estimated
result set is larger than the default limit.
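The windowing described above could be sketched as follows in Python. The
10000 window size comes from the message; the function name mset_windows and
its parameters are hypothetical and do not mirror the real Perl code:

```python
def mset_windows(limit, window=10000, offset=0):
    """Yield (offset, size) pairs covering `limit` results in bounded windows."""
    if window <= 0:
        # A zero-sized window would never make progress.
        raise ValueError("window must be positive")
    remaining = limit
    while remaining > 0:
        size = min(window, remaining)
        yield offset, size
        offset += size
        remaining -= size
```

Each pair would be passed to one search call, so no single response has to
hold more than `window` results.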
|
|
I wanted to try --dedupe=none for something, but it failed
since I forgot --no-save :x So hint users towards --no-save
if necessary.
|
|
It's needless noise in syslogs for daemons and unnecessarily
alarming to users on the command-line.
|
|
Overwriting existing destinations is safe (but slow) by default,
so show a progress message noting what we're doing while
a user waits.
|
|
"lei export-kw" no longer completes for anonymous sources.
More commands use "lei refresh-mail-sync" as a basis for their
completion work, as well.
";AUTH=ANONYMOUS@" is stripped from completions since it was
preventing bash completion from working on AUTH=ANONYMOUS IMAP
URLs. I'm not sure if there's a better way, but all of our code
works fine without specifying AUTH=ANONYMOUS as a command-line
arg.
Finally, we fall back to using more candidates if none can
be found, allowing multiple URLs to be completed.
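For illustration only, the ";AUTH=ANONYMOUS@" stripping mentioned above might
look like this in Python. The function name is hypothetical, and removing the
whole ";AUTH=ANONYMOUS@" component is an assumption about the exact rewrite
rather than a copy of the completion code:

```python
def strip_anonymous_auth(url):
    """Drop the ';AUTH=ANONYMOUS@' userinfo from an IMAP URL candidate."""
    # Assumption: only the first occurrence (the userinfo part) matters.
    return url.replace(";AUTH=ANONYMOUS@", "", 1)
```

URLs without that component pass through unchanged, so the same helper can be
applied to every completion candidate.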
|
|
NNTP URLs are probably more prevalent in public message archives
than IMAP URLs.
|
|
Lots of stuff out there that becomes a pain to set up
configuration for and test...
|
|
If lcat-ing multiple argument types (blobs vs folders),
maintain the original order of the arguments instead of
dumping all blobs before folder contents.
|
|
We can set opt->{quiet} for (internal) 'note-event' command
to quiet ->qerr, since we use ->qerr everywhere else. And
we'll just die() instead of setting a ->{fail} message, since
eval + die are more in line with the rest of our Perl code.
|
|
NNTP servers, IMAP servers, and various MUAs may recycle
"unique" identifiers due to software bugs or careless BOFHs.
Warn about them, but always be prepared to account for them.
|
|
No reason not to support them, since there's more
public-inbox-nntpd instances than -imapd instances,
currently.
|
|
Xapian and SQLite access can be slow when a DB is large and/or
on high-latency storage.
|
|
We need to waitpid synchronously on pkg-config to use $?.
When loading Gcf2 inside the event loop, implicit dwaitpid
done by PublicInbox::ProcessPipe would not call waitpid in
time to zero $?. This was causing one of my -httpd to
occasionally fall back to git(1) instead of using Gcf2.
This was noted in:
Link: https://public-inbox.org/meta/20210914085322.25517-1-e@80x24.org/
|
|
NNTP article numbers are stored separately from folder names
in mail_sync.sqlite3.
Recovering from this is optional; the worst case is wasting
bandwidth refetching some messages. To (optionally) recover
from this, use:
lei forget-mail-sync $URL_WITH_ARTNUMS
Some articles will be refetched on the next import, but
duplicate data won't be indexed in Xapian.
|
|
It's still a work-in-progress, but the basic debug knob
comes in handy for new users; as does proxy support.
|
|
A batch size of zero is nonsensical and causes infinite loops.
|
|
As with "lei edit-search", "lei config --edit" may
spawn an interactive editor which works best from
the terminal running script/lei.
So implement LeiConfig as a superclass of LeiEditSearch
so the two commands can share the same verification
hooks and retry logic.
|
|
At least not by default, to match existing NNTP behavior.
Tor .onions are already encrypted, and there's no point
in encrypting traffic on localhost outside of testing.
|
|
This brings -watch up to feature parity with lei with
SOCKS support.
|
|
I'm not sure what caused it, but I've noticed two missing
messages that failed from "lei up" on an https:// external;
and I've also seen some duplicates in the past (which I
think I fixed...).
|
|
While NNTP ranges were already working, fetching a single message
was broken. We'll also simplify the code a bit and ensure
incremental synchronization is ignored when ranges are
specified.
|
|
As with other commands, we enable pretty JSON by default if
stdout is a terminal or if --pretty is specified. While the
->pretty JSON output has excessive vertical whitespace, too many
lines is preferable to having everything on one line.
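A small Python sketch of the default described above: pretty output when the
destination is a terminal or when explicitly requested, compact output
otherwise. The function name dump_json and its parameters are hypothetical:

```python
import json
import sys


def dump_json(obj, pretty=None, out=sys.stdout):
    """Write obj as JSON; pretty-print if asked to, or if `out` is a tty."""
    if pretty is None:
        pretty = out.isatty()
    if pretty:
        text = json.dumps(obj, indent=2, sort_keys=True)
    else:
        # Compact form: a single line with no optional whitespace.
        text = json.dumps(obj, separators=(",", ":"), sort_keys=True)
    out.write(text + "\n")
```

Passing pretty=True plays the role of a --pretty switch; leaving it as None
defers to the terminal check.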
|
|
The meanings of "hwm" and "lwm" may not be obvious abbreviations
for (high|low) water mark descriptions used by RFC 3977.
"high" and "low" should be obvious to anyone.
|
|
"All" my CPUs is only 4, but it's probably ridiculous for
somebody with a 16-core system to have 16 processes for
accessing SQLite DBs.
We do the same thing in Pmdir for parallel Maildir access
(and V2Writable).
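One way to express such a cap, as a hedged Python sketch. The limit of 4 used
here is an illustrative assumption (the message does not state the actual
cap), and worker_count is a hypothetical name:

```python
import os


def worker_count(max_workers=4, ncpu=None):
    """Pick a worker count: one per CPU, but never more than max_workers."""
    if ncpu is None:
        # os.cpu_count() may return None when the count is unknown.
        ncpu = os.cpu_count() or 1
    return max(1, min(ncpu, max_workers))
```

With these defaults a 16-core machine gets 4 workers and a 2-core machine
gets 2.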
|
|
In retrospect, I don't think it's needed; and trying to wire up
a user interface for lei to manage process counts doesn't seem
worthwhile. It could be resurrected for public-facing daemon
use in the future, but that's what version control systems are for.
This also lets us automatically avoid setting up broadcast
sockets.
Followup-to: 7b7939d47b336fb7 ("lei: lock worker counts")
|
|
We're not using Data::Dumper for JSON output.
|
|
With the switch from pipes to sockets for lei-daemon =>
lei/store IPC, we can send the script/lei client socket to the
lei/store process and rely on reference counting in both Perl
and the kernel to persist the script/lei.
|
|
This has several advantages:
* no need to use ipc.lock to protect a pipe for non-atomic writes
* ability to pass FDs. In another commit, this will let us
simplify lei->sto_done_request and pass newly-created
sockets to lei/store directly.
disadvantages:
- an extra pipe is required for rare messages over several
hundred KB, this is probably a non-issue, though
The performance delta is unknown, but I expect shards
(which remain pipes) to be the primary bottleneck IPC-wise
for lei/store.
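To illustrate the FD-passing advantage listed above, here is a self-contained
Python sketch (Linux, Python 3.9+ for socket.send_fds/recv_fds). It is not the
project's Perl IPC code; pass_fd_demo is a hypothetical name and the payloads
are made up:

```python
import os
import socket


def pass_fd_demo():
    """Send a pipe's write end over a SOCK_SEQPACKET pair and use it."""
    # SOCK_SEQPACKET preserves message boundaries, unlike a plain pipe.
    parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    r, w = os.pipe()
    try:
        # The descriptor travels as ancillary data next to the message.
        socket.send_fds(parent, [b"take this pipe"], [w])
        msg, fds, _flags, _addr = socket.recv_fds(child, 4096, 1)
        # The receiver holds its own duplicate of the write end.
        os.write(fds[0], b"written via passed fd")
        os.close(fds[0])
        return msg, os.read(r, 4096)
    finally:
        parent.close()
        child.close()
        os.close(r)
        os.close(w)
```

In a real daemon the two socket ends would live in different processes; both
are kept in one process here only so the sketch runs on its own.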
|
|
Since some lei worker classes only use a single worker,
there's no sense in having broadcast for those cases.
|
|
This brings the wq_* SOCK_SEQPACKET API functionality
on par with the ipc_do (pipe-based) API.
|
|
git 2.33+ contains important optimizations for the
thousands-of-inboxes case. And combine the Inline::C stuff
with libgit2, since our use of libgit2 requires Inline::C.
|
|
We can't assume -imapd will be ready by the time we try to
connect to it after restart when using "-l $ADDR". So recreate
the (closed-for-testing) listen socket in the parent and hand it
off to -imapd as we do normally.
|