Date | Commit message |
|
"lei index" support for IMAP and NNTP is incomplete, so there's
no point in requiring them.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210927124056.kj5okiefvs4ztk27@meerkat.local/
|
|
The "-w" perlop always succeeds as root, so we need to check
st_mode for writability bits to detect directories we shouldn't
write to.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210927124056.kj5okiefvs4ztk27@meerkat.local/
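
A rough Python analogue of the st_mode check described above, offered as
a sketch and not as the project's Perl code. As with the "-w" perlop,
os.access(path, os.W_OK) typically reports success for root regardless of
the mode bits. The helper name has_write_bits is hypothetical, and testing
all three write bits (owner, group, other) is an assumption; the commit
message does not say which bits are checked.

```python
import os
import stat

def has_write_bits(path):
    """Return True if any owner/group/other write bit is set on path.

    Unlike os.access(path, os.W_OK), this inspects st_mode directly,
    so the answer does not depend on the caller's privileges.
    """
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
```

A caller would skip any directory for which has_write_bits() is false,
even when running as root.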
|
|
Apparently, sendmsg can fail in less common ways when
network buffers are gigantic. Add some diagnostics for
future failures, as well.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210927124056.kj5okiefvs4ztk27@meerkat.local/
|
|
More switches which can be useful for users who pipe from text
editors. --drq can be helpful while writing patch review email
replies, and perhaps --dequote-only, too.
|
|
lei rediff is expected to see partial patch fragments and such,
so silence warnings when something isn't exactly a valid email
message.
|
|
The $sigchld handler was reporting the last test (successful or
not) for a given PID in case a worker dies prematurely.
Instead, redisplay all failed tests in $run_log to ensure the
report only shows failed tests, and not the last started (and
possibly successful) one.
|
|
We still need Email::MIME to test against old revisions.
We'll also depend on the revision just prior to the
manifest.js.gz introduction to avoid loading Danga::Socket,
since it was getting loaded even with `plackup'.
Finally, we'll disable Inline::C usage with old Spawn.pm
since our old code included alloca.h, which is not
portable to FreeBSD.
|
|
There may still be pre-manifest.js.gz versions of
PublicInbox::WWW running and serving v2 inboxes.
While -clone and "add-external --mirror" were working, -fetch
was failing due to 301 redirect to $INBOX_URL/manifest.js.gz/
and not the expected 404. Update the code to deal with a JSON
decode error (from the 301) and ensure v2 epochs detection is
correct (and not using a shadowed variable).
|
|
This makes it easier for users to enable fetching on a
previously read-only epoch. Prior to this change, users were
required to delete manifest.js.gz in addition to adding the
writable bit. Now, they just have to "chmod +w $EPOCH_DIR".
|
|
There may still be pre-manifest.js.gz versions of PublicInbox::WWW
running and serving v2 inboxes.
Since $INBOX_URL/manifest.js.gz was not understood, it was
assumed to be a Message-ID and 301-ed to
"$INBOX_URL/manifest.js.gz/" with a trailing slash, so our 404
checks were invalid. Update our fallbacks to deal with 301
by catching JSON decoding errors to trigger HTML scraping.
For HTML parsing, be sure to not be fooled by potential
user-generated content and only scan the part after the last
<hr>.
We also need to avoid propagating $? from curl unnecessarily
when we can continue safely.
Finally, update v2mirror.t with tests to use PublicInbox::WWW
from our "v1.1.0-pre1" tag to ensure these code paths get tested.
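
The fallback described above could look roughly like the following
Python sketch (the real code is Perl). The function names and the
"/git/N.git" link pattern are hypothetical, chosen only for illustration;
the actual HTML emitted by old PublicInbox::WWW versions may differ.

```python
import gzip
import json
import re
import zlib

def parse_manifest(body):
    """Return the decoded manifest, or None if body is not gzipped JSON.

    A 301 response body (HTML) fails to decode, which is the signal
    to fall back to HTML scraping.
    """
    try:
        return json.loads(gzip.decompress(body))
    except (OSError, EOFError, ValueError, zlib.error):
        return None

def scrape_epochs(html):
    """Collect epoch numbers from links found after the last <hr> only.

    Anything before the last <hr> may be user-generated content, so
    it is ignored. If there is no <hr>, the whole page is scanned.
    """
    tail = html.rpartition("<hr>")[2]
    # hypothetical link pattern, assumed for this sketch
    return sorted({int(n) for n in re.findall(r"/git/(\d+)\.git", tail)})
```

Treating a decode failure as "no manifest" keeps the 404 and 301 cases on
the same code path.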
|
|
Partial (v2) clones should be a useful addition for users wanting
to conserve storage while having fast access to recent messages.
Continuing work started in 876e74283ff3 (fetch: ignore
non-writable epoch dirs, 2021-09-17), this creates bare,
read-only epoch git repos. These git repos have the remotes
pre-configured, but do not fetch any objects.
The goal is to allow users to set the writable bit on a
previously-skipped epoch and start fetching it.
Shell completion support may not be necessary given how short
the epoch ranges are, here.
Cc: Luis Chamberlain <mcgrof@kernel.org>
Link: https://public-inbox.org/meta/20210917002204.GA13112@dcvr/T/#u
|
|
"Correct" meaning the permissions match that of the parent
xap15 or ei15 directory.
|
|
Neither Inboxes nor ExtSearch objects were retrying correctly
when there are live git processes, but the inboxes were getting
rescanned for search or other reasons. Ensure the scan retries
eventually if there are live processes.
We also need to update the cleanup task to detect Xapian shard
count changes, since Xapian ->reopen is enough to detect any
other Xapian changes. Otherwise, we just issue an inexpensive
->reopen call and let Xapian check whether there's anything
worth reopening.
This also lets us eliminate the Devel::Peek dependency.
|
|
This fixes the occasional t/lei-sigpipe.t infinite loop
under "make check-run".
Link: http://nntp.perl.org/group/perl.perl5.porters/258784
<CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com>
Followup-to: b552bb9150775fe4 ("daemon+watch: fix localization of %SIG for non-signalfd users")
|
|
We could redirect, too, but just use -q since we don't care
for the output with run_mode => 0.
|
|
I wanted to try --dedupe=none for something, but it failed
since I forgot --no-save :x So hint users towards --no-save
if necessary.
|
|
Overwriting existing destinations is safe (but slow) by default,
so show a progress message noting what we're doing while
a user waits.
|
|
NNTP URLs are probably more prevalent in public message archives
than IMAP URLs.
|
|
No reason not to support them, since there are more
public-inbox-nntpd instances than -imapd instances,
currently.
|
|
NNTP article numbers are stored separately from folder names
in mail_sync.sqlite3.
Recovering from this is optional, worst case is wasting
bandwidth refetching some messages. To (optionally) recover
from this, use:
lei forget-mail-sync $URL_WITH_ARTNUMS
Some articles will be refetched on the next import, but
duplicate data won't be indexed in Xapian.
|
|
As with "lei edit-search", "lei config --edit" may
spawn an interactive editor which works best from
the terminal running script/lei.
So implement LeiConfig as a superclass of LeiEditSearch
so the two commands can share the same verification
hooks and retry logic.
|
|
At least not by default, to match existing NNTP behavior.
Tor .onions are already encrypted, and there's no point
in encrypting traffic on localhost outside of testing.
|
|
While NNTP ranges were already working, fetching a single message
was broken. We'll also simplify the code a bit and ensure
incremental synchronization is ignored when ranges are
specified.
|
|
In retrospect, I don't think it's needed; and trying to wire up
a user interface for lei to manage process counts doesn't seem
worthwhile. It could be resurrected for public-facing daemon
use in the future, but that's what version control systems are for.
This also lets us automatically avoid setting up broadcast
sockets.
Followup-to: 7b7939d47b336fb7 ("lei: lock worker counts")
|
|
This brings the wq_* SOCK_SEQPACKET API functionality
on par with the ipc_do (pipe-based) API.
|
|
We can't assume -imapd will be ready by the time we try to
connect to it after restart when using "-l $ADDR". So recreate
the (closed-for-testing) listen socket in the parent and hand it
off to -imapd as we do normally.
|
|
I configured this for public-inbox.org, but wasn't 100% sure it
worked. This test ensures it stays working :>
|
|
Since we can't use maxuid for remote externals, automatically
maintaining the last time we got results and appending a dt:
range to the query will prevent HTTP(S) responses from getting
too big.
We could be using "rt:", but no stable release of public-inbox
supports it, yet, so we'll use dt:, instead.
By default, there's a two day fudge factor to account for MTA
downtime and delays; which is hopefully enough. The fudge
factor may be changed per-invocation with the
--remote-fudge-factor=INTERVAL option.
Since different externals can have different message transport
routes, "lastresult" entries are stored on a per-external basis.
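
As an illustration of the dt: range with a fudge factor, here is a
minimal Python sketch. It assumes "lastresult" is stored as seconds since
the Epoch and that the range is simply ANDed onto the user's query; both
are assumptions for this example, and the function names are
hypothetical.

```python
from datetime import datetime, timedelta, timezone

def dt_range_since(lastresult, fudge=timedelta(days=2)):
    """Build an open-ended dt: term starting `fudge` before lastresult."""
    start = datetime.fromtimestamp(lastresult, tz=timezone.utc) - fudge
    return "dt:" + start.strftime("%Y%m%d%H%M%S") + ".."

def limit_query(query, lastresult):
    """Append the dt: range; leave the query alone on the first run."""
    if lastresult is None:
        return query
    return f"({query}) AND {dt_range_since(lastresult)}"
```

With a lastresult of 2021-09-27 00:00:00 UTC and the default two-day
fudge, the appended term starts at 20210925000000.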
|
|
This will eventually be useful for maintaining partial mirrors.
In keeping with the original public-inbox-fetch philosophy,
there are no additional config files to manage:
the user merely needs to remove write permissions to an $N.git
directory to prevent it from being updated.
Re-enabling updates just requires restoring write permission.
|
|
While git respects a user's local timezone and returns
seconds-since-the-Epoch, we were unnecessarily and incorrectly
calling gmtime+strftime on its result. So skip
gmtime+strftime when the strftime format is "%s" and just feed
the output time from git directly to Xapian.
This is mainly for lei, which will likely run in a variety of
timezones. While we're at it, add a recommendation to use
TZ=UTC in public-inbox-httpd, in case there are (misguided :P)
sysadmins who set a non-UTC TZ.
|
|
Like with Maildir, IMAP folders can be deleted entirely.
Ensure they can be eliminated, but don't be fooled into
removing them if they're temporarily unreachable.
|
|
There's no point in keeping mail_sync.sqlite3 entries around
if the folder is gone. We do keep saved-search configs around,
however, since somebody may decide to blow away a search and
start over.
|
|
Merely pruning mail synchronization information was
insufficient for Maildir: renames are common in Maildir
and we need to detect them after-the-fact when lei-daemon
isn't running.
Running this command could make "lei index" far more
useful...
v2: close R/O mail_sync.sqlite3 dbh before fork
Keeping the DB file handle open across fork can cause bad things
to happen even if we don't use it since sqlite3 itself still knows
about it (but doesn't know Perl code doesn't know about it).
|
|
We no longer waste a precious hash slot for a per-Inbox
{nntpserver} if it's only configured globally for all inboxes.
|
|
The full pathname for "curl -o ..." was too noisy and confusing.
Reduce confusion by adding the ".tmp" suffix and relying on
"-C". We'll also avoid displaying "-C" in run_reap() and
rely on "--git-dir=" with "git fetch" to display progress for
users.
|
|
Since the beginning of time, I've been dropping Makefiles
in $INBOX_DIR (and above hierarchies) to organize groups
of commands.
make(1) is widely available in various flavors and a familiar
tool for our target audience. It is easy to run in the right
directory, typically has built-in shell completion, and doesn't
silently ignore errors by default like Bourne shell.
|
|
As noted in the new manpage entry, this is useful for avoiding
public-inbox-index invocations when there's nothing to update.
We use 127 to match "grok-pull", and also because it doesn't
conflict with any of the current curl(1) exit codes.
|
|
IMHO, this greatly improves code sharing and organization
between v2, extindex, and lei/store. Common git-related
logic for these is lightly-refactored and easier to reason
about.
The impetus for this big change was to ensure inboxes
created+managed by public-inbox-{clone,fetch} could have
alternates and configs setup properly without depending on
SQLite (via V2Writable). This change does that while
making old code shorter and better factored.
|
|
Again, we were failing to account for '/' use in mailbox names :x
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210914210547.akdp4cqmwaheayp5@meerkat.local/
|
|
Untested at the moment(*), but we were inadvertently truncating
mailbox names with '/' due to our work-in-progress handling of
the "/;UID=$NUM" parameter.
(*) strangely, my dovecot instance doesn't allow '/' by default,
so the change to xt/net_writer-imap.t is untested.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210914175025.eq7s2shkc323itaf@meerkat.local/
|
|
While persisting lei-daemon across different test cases isn't
the default anymore, we can notice problems more quickly if
the daemon PID changes since the daemon gets auto-restarted
after failures.
|
|
Oops :x
Fixes: b584a53f053a7629 ("lei up: support --all for IMAP folders")
|
|
PIPE_BUF accounts for Linux being 4096 (and presumably other
OSes differing), while _POSIX_PIPE_BUF is the minimum 512
value.
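
To see the difference on a given system, the limit can be queried at
runtime. This is a Python sketch for POSIX systems only, not the
project's Perl code; the helper name pipe_buf is hypothetical.

```python
import os
import select

POSIX_PIPE_BUF = 512  # minimum value POSIX guarantees (_POSIX_PIPE_BUF)

def pipe_buf():
    """Return the atomic-write limit for a freshly created pipe."""
    r, w = os.pipe()
    try:
        return os.fpathconf(w, "PC_PIPE_BUF")
    except (OSError, ValueError):
        # fall back to the constant Python exposes, then the POSIX minimum
        return getattr(select, "PIPE_BUF", POSIX_PIPE_BUF)
    finally:
        os.close(r)
        os.close(w)
```

Writes no larger than this value are atomic on a pipe, so sizing buffers
by the real limit instead of the 512-byte minimum allows larger atomic
writes where the OS supports them.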
|
|
t/v2mirror.t and t/lei-mirror.t are now skipped when curl
is missing (instead of failing in appropriate places).
A bunch of which() checks are updated to use require_cmd
to avoid explicitly loading Spawn.
|
|
Timestamp comparisons only have 1 second granularity, which
isn't nearly enough for our test cases, and probably not for
real world use for "git send-email" bursts and fast SMTP
servers.
We'll continue to check modification times inside the manifest,
though, in case an extremely rare SHA-1 collision is found...
|
|
It was also totally broken by the change to use manifest.js.gz
for v1 :x
Fixes: ffb7fbda6869db4b ("fetch: use manifest.js.gz for v1")
|
|
The v1 code path was totally half-baked after the change
to use manifest.js.gz :x
Fixes: ffb7fbda6869db4b ("fetch: use manifest.js.gz for v1")
|
|
And try to improve the message about Inline::C while we're at
it, since Socket::Msghdr isn't widely-packaged, yet.
|
|
This ensures tests are skipped properly if SQLite or Xapian
are missing and don't bail out.
|
|
"Unnamed repository" for v1 inboxes was misleading, and having a
non-existent description for v2 was equally annoying, so set a
short description based on the primary address.
We remove descriptions when setting up new test inboxes to
preserve the behavior of the t/lei-mirror.t test case.
|