Date | Commit message |
|
It's inlined into the main function, which we'll shorten
slightly with the defined-or (`//') operator. Also noticed
and fixed a mismatched HTML tag.
|
|
We switched to Parse::RecDescent during development and left
some dead code behind.
|
|
Only inbox accesses the read-only {over}, now, instead of going
through ->search. This simplifies our object graph and avoids
potentially redundant FDs and DB handles pointing to the same
over.sqlite3 file.
|
|
Nearly all of the search uses in the production code rely on
a Xapian mset iterator being returned (instead of an array
of $smsg objects). So default to returning the mset and move
the burden of smsg array conversion into the test cases.
|
|
The special case (if any) belongs at a higher level,
and this is another step towards removing {over_ro}-dependence
in our Search object.
|
|
We'll use {oidx} as the common field name for the read-write
OverIdx, here, to disambiguate it from the read-only {over}
field. This hopefully makes it clearer which code paths are
read-only and which are read-write.
|
|
It'll likely be used in the future for JMAP, detached indices,
and maybe other things.
|
|
Just some golfing to reduce scrolling and hopefully improve
readability.
|
|
For consistency with other commands, though the
protocol-specific options should refer users to
the manpage.
|
|
And while we're at it, note edit is *destructive* to encourage
reading the fine manual.
|
|
"inboxes 1 inboxes not supported by ..." was nonsensical.
Now it'll show "-V1 inbox not supported by ...", instead.
|
|
ParentPipe was a subset of EOFpipe, except EOFpipe correctly
accounts for theoretical(*) spurious wakeups on the pipe.
(*) AFAIK, spurious wakeups are/were more likely on TCP sockets
due to checksum failures, something that's not a problem on
local pipes. We're also not sharing pipes like we do with
listen sockets on accept(2), so there's no chance of another
process grabbing bytes (unless we have bugs in our code).
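
A minimal sketch of the spurious-wakeup handling described above,
written in Python rather than the project's Perl. The helper name
`saw_eof' is hypothetical and this is not the EOFpipe code itself;
it only shows why a readiness notification alone should not be
treated as EOF:

```python
import os

def saw_eof(rfd):
    """Return True only on a real EOF, False on anything else."""
    try:
        data = os.read(rfd, 1)
    except BlockingIOError:
        # Readiness was reported but nothing is readable:
        # a spurious wakeup, so keep waiting.
        return False
    # An empty read means every writer has closed the pipe.
    return data == b""

r, w = os.pipe()
os.set_blocking(r, False)
before_close = saw_eof(r)  # writer still open, no data pending
os.close(w)
after_close = saw_eof(r)   # all writers closed
os.close(r)
```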
|
|
It doesn't seem necessary, since we won't call dwaitpid()
until we see an EOF.
|
|
It's a bit inefficient to use a pipe, here. However, using
dwaitpid() on a process that's not expected to exit soon is
also inefficient as it causes excessive wakeups as most of
our inbox-writing code expects synchronous waitpid().
This only affects -watch instances configured for NNTP and IMAP
clients.
|
|
We should not enqueue reap_pids() to run more than once per
EventLoop iteration. We'll start reformatting reap_pids
to tabs, too, since we're no longer Danga::Socket.
We should also be able to remove timer usage for reaping
down-the-line once we stop abusing dwaitpid() in -watch.
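
The once-per-iteration rule can be sketched with a simple guard
flag. This is a Python illustration under assumed names (`Loop',
`enqueue_reap', `run_once' are all hypothetical), not the
project's EventLoop code:

```python
class Loop:
    def __init__(self):
        self.next_tick = []      # callbacks for the next iteration
        self.reap_armed = False  # True while reap_pids is queued
        self.reap_runs = 0

    def enqueue_reap(self):
        if self.reap_armed:      # already queued for this iteration
            return
        self.reap_armed = True
        self.next_tick.append(self.reap_pids)

    def reap_pids(self):
        self.reap_armed = False  # allow re-arming for a later iteration
        self.reap_runs += 1

    def run_once(self):
        queue, self.next_tick = self.next_tick, []
        for cb in queue:
            cb()

loop = Loop()
for _ in range(3):               # e.g. three children exit at once
    loop.enqueue_reap()
loop.run_once()
```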
|
|
Get rid of an unused variable, prefix a warning and try to
better document control flow around various callbacks.
|
|
In case there are non-Linux or BSD users w/o IO::KQueue, we
shouldn't let signal handlers fire in the child processes.
The child processes always assumed signals were blocked by
the parent, so no changes were necessary, there.
|
|
This should further mitigate lock contention problems
when -watch is configured to watch on a Maildir for spam
while performing a large NNTP import.
There is now a small risk that a message won't get removed if
it's in the current (uncommitted) fast-import batch, but that's
unlikely given the batch size is now only 10 messages.
If that small window is hit, flipping the \Seen flag
(e.g. marking it unread, and then read again) will trigger
another removal attempt via IMAP or Maildir.
|
|
This is no longer limited to Maildirs now that IMAP and NNTP
support exists, so give it a shorter name.
|
|
Declare 5.10.1 to avoid potential compatibility problems with
Perl 7/8 down the line. We'll rely on the command-line to set
or drop warnings during development, at least.
|
|
We don't want to monopolize locks because processes can easily
block each other if using `watchspam' on a Maildir while a big
NNTP or IMAP import is happening.
This can also happen if somebody configured a single inbox to
watch from several sources to merge several mailboxes into one
(e.g. both an IMAP and Maildir are watched).
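
As a rough illustration of not monopolizing a lock, the sketch
below releases the lock after every small batch so that another
writer can get in between batches. It is Python, not the
project's Perl; `import_in_batches', the `batch_size' default and
the use of threading.Lock (instead of a cross-process lock) are
all assumptions made for the example:

```python
import threading

def import_in_batches(messages, lock, store, batch_size=10):
    """Hold `lock` for at most `batch_size` messages at a time."""
    batches = 0
    for i in range(0, len(messages), batch_size):
        with lock:  # released at the end of every batch
            store.extend(messages[i:i + batch_size])
        batches += 1
    return batches

store = []
batches = import_in_batches(list(range(25)), threading.Lock(), store)
```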
|
|
Quiet down logs from -imapd when clients are blindly
sending some unsupported flag conditions (e.g. "DRAFT",
"DELETED") specified in RFC 3501.
|
|
By making it a no-op if last_uid is not defined. This isn't a
hot code path, so the extra method dispatch isn't an issue.
It'll save some indentation/wrapping in future commits.
|
|
Data needs to hit inboxes, first. Otherwise it's possible to
skip messages in case git-fast-import is killed before it sees
"done\n". Now, -watch will just waste a little bandwidth in
re-downloading a seen message if it's interrupted immediately
before updating IMAPTracker.
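
The ordering can be sketched as follows, in Python and with
hypothetical names (`sync_one', a dict standing in for
IMAPTracker). The point is only that progress is recorded after
the data is stored, so an interruption leads to a re-download
rather than a skipped message:

```python
def sync_one(uid, fetch, commit, tracker):
    msg = fetch(uid)
    commit(msg)                # step 1: data hits the inbox first
    tracker["last_uid"] = uid  # step 2: only now mark the UID as seen

inbox, tracker = [], {"last_uid": 0}

def failing_commit(msg):
    raise RuntimeError("interrupted before the commit finished")

try:
    sync_one(1, lambda uid: f"msg-{uid}", failing_commit, tracker)
except RuntimeError:
    pass                       # tracker is untouched, so UID 1 is retried
after_failure = tracker["last_uid"]

sync_one(1, lambda uid: f"msg-{uid}", inbox.append, tracker)
```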
|
|
Being an easily confused person, I find "next" and "prev"
ambiguous as to whether messages on the next or previous page
will be newer or older than the current page. Clarify that for
the threaded /$INBOX/ view and search results.
For search results sorted by relevance, we'll use "[>= $SCORE]"
or "[<= $SCORE]" to indicate directionality.
This also fixes $INBOX/new.html for unindexed v1 inboxes.
|
|
Sometimes it's useful to quickly get to threads and messages
which are contemporaries of the current thread/message being
focused on. This hopefully improves navigation by:
a) making the top line (where $INBOX_DIR/description is shown)
a link to the latest topics in search results and
per-thread/per-message views.
b) providing a link to contemporaries ("~YYYY-MM-DD") around
the thread overview skeleton area for per-thread and
per-message views.
|
|
This matches the behavior of Maildir `watchspam' handling in not
removing unseen messages. NNTP can't match this behavior, since
NNTP servers don't store flags, clients do.
|
|
There's no need for this to be a separate sub since there's
only a single caller. This saves a few kilobytes at least
in short-lived processes.
|
|
It's no problem for most users to enable WAL, here, since
there's only a single process doing both reading and writing
(unlike the read-only daemons). However, WAL doesn't work on
network filesystems, so it can't be enabled by default.
|
|
For consistency in output, any URL/path-context-dependent
prefixes should have the same prefix as the actual warning which
triggered it.
|
|
I'm seeing "read: Connection timed out" in my syslog from
-httpd. The fail() calls in PublicInbox::Git seem to be the
only code path of ours which could trigger it...
ETIMEDOUT shouldn't happen on pipes, only sockets; and all of
our socket operations are non-blocking. So this could be
cgit-wwwhighlight-filter.lua, but that's connecting over
localhost, though on fairly loaded HW.
|
|
A `PI_XAPIAN' environment variable is now exposed for testing
purposes. We'll also deal with the removal of
`NumberValueRangeProcessor' and use `NumberRangeProcessor'
in its place, but continue favoring the old Search::Xapian
since that's all that's packaged for Debian 10.x stable.
|
|
We use the defined-or (`//', `//=') operators in 5.10,
so require 5.10.1 like the rest of our codebase. Update
an outdated comment while we're at it.
|
|
v5.10.1 lets us use the lighter parent.pm instead of base.pm,
and we'll rely on the shebang to enable warnings (or not).
While we're in the area, drop a no-longer-necessary import for
PublicInbox::Search, since OverIdx doesn't require search.
|
|
As noted in commit 87dca6d8d5988c5eb54019cca342450b0b7dd6b7
("www: rework query responses to avoid COUNT in SQLite"),
COUNT on many rows is expensive on big SQLite DBs.
We've already stopped using that code path long ago in WWW
while -imapd and -nntpd never used it. So we'll adjust our
remaining test cases to not need it, either.
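
One common way to paginate without COUNT is to fetch one row more
than the page size and use its presence as the "more results"
signal. The Python sketch below shows that general technique; it
is not claimed to be what the referenced commit does, and the
`msg' table and `fetch_page' helper are hypothetical:

```python
import sqlite3

def fetch_page(conn, limit, offset=0):
    """Return (rows, has_more) without issuing SELECT COUNT(*)."""
    rows = conn.execute(
        "SELECT num FROM msg ORDER BY num LIMIT ? OFFSET ?",
        (limit + 1, offset),  # one extra row tells us if more exist
    ).fetchall()
    return rows[:limit], len(rows) > limit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msg (num INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO msg (num) VALUES (?)",
                 [(i,) for i in range(1, 8)])
first, more_after_first = fetch_page(conn, 5)
last, more_after_last = fetch_page(conn, 5, offset=5)
```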
|
|
Since we got rid of over->connect, `disconnect' no longer pairs
with it. So name it after the `close(2)' syscall it ultimately
issues.
|
|
`->connect' is confused with the perlfunc for the `connect(2)'
syscall, and also `DBI->connect'. Since SQLite doesn't use
sockets, the word "connect" needlessly confuses me. Give
it a short name to match the field name we use for it, which
also matches the variable name used by the DBI(3pm) and
DBD::SQLite(3pm) manpages.
|
|
The SWIG binding won't auto-convert IV/UV to PV like the XS
Search::Xapian binding would, so work around that shortcoming
for now.
Fixes: a367ec1b15a2458 ("mbox: disable "&t" on existing Xapian until full reindex")
|
|
WAL actually seems to have ideal locking characteristics given
concurrency problems I'm experiencing with --reindex running
in parallel with expensive read-only SQLite queries:
<https://public-inbox.org/meta/20200825001204.GA840@dcvr/>
Unfortunately, we cannot blindly use WAL while preserving
compatibility with existing setups nor our guarantees that
read-only daemons are indeed "read-only".
However, respect a user's choice to set WAL on their
own if they're comfortable with giving -nntpd/-httpd/-imapd
processes write permission to the directory storing SQLite DBs.
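
Respecting the user's choice is possible because the WAL journal
mode is stored in the SQLite file and persists across
connections. The Python sketch below (file and table names are
made up, and it assumes a local filesystem) sets WAL once and
then reopens the DB without touching the journal mode:

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "example.sqlite3")

    # The admin opts in to WAL once.
    admin = sqlite3.connect(path)
    admin.execute("PRAGMA journal_mode = WAL")
    admin.execute("CREATE TABLE t (x)")
    admin.commit()
    admin.close()

    # A later connection only queries the mode, it never sets it.
    reader = sqlite3.connect(path)
    mode = reader.execute("PRAGMA journal_mode").fetchone()[0]
    reader.close()
```

Here `mode' reflects what the admin chose, which is why a reader
that never issues its own journal_mode change leaves that choice
alone.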
|
|
It's fewer queries and matches what we do in OverIdx.
|
|
This file gets truncated anyhow, so it won't fragment.
|
|
croak() can give more context on the failure, and setting
`PERL5OPT=-MCarp=verbose' can force a stacktrace.
|
|
There's no reason we'd want Xapian to defer flushing once we've
indexed everything belonging to a particular shard.
|
|
Expanding threads via over.sqlite3 for mbox.gz downloads without
Xapian effectively collapsing on the THREADID column leads to
repeated messages getting downloaded.
To avoid that situation, use a "has_threadid" Xapian metadata
flag that's only set on --reindex (and brand new Xapian DBs).
This allows admins to upgrade WWW or do --reindex in any order;
without worrying about users eating up bandwidth and CPU cycles.
|
|
Finally, the addition of THREADID for collapsing results
in Xapian lets us emulate the "mairix --threads" feature.
That is, instead of returning only the matching messages,
the entire thread is included in the downloaded mbox.gz.
This requires a "public-inbox-index --reindex" to be usable.
|
|
This is the `tid' column from over.sqlite3; and will be used for
IMAP and JMAP search (among other things).
|
|
We'll also rename the /^remote_/ prefix to "shard_", since
remote implies the process is on a different host. These
methods only pass messages to a child process on the same host
OR perform operations within the same process.
|
|
Merely assigning `undef' to a scalar does not free its
underlying buffer memory.
|
|
Unlike w3m and links, the lynx browser seems to require a `name'
attribute for `<input type=submit>' elements. Maybe some other
browsers do, too. The `name' attribute for submit elements
doesn't seem to cause any harm for w3m or links users, either;
despite not (AFAIK) being part of historical or current HTML
specs.
|
|
We can avoid importing mdocid() in several places by using
this method, simplifying callers.
|