Date | Commit message (Collapse) |
|
Running a full "public-inbox-index --reindex" in parallel
with "public-inbox-xcpdb" on the same inbox can still cause
problems, though.
|
|
Copying an entire Xapian DB is horribly slow whether it's done
via Perl or copydatabase(1). So displaying some progress
indication is good for user experience.
While we're at it, prefix xapian-compact output, too; since
parallel processes end up clobbering each other.
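
A minimal sketch of the prefixing idea, in Python rather than the project's Perl. The `prefix_lines` helper and the `label: ` format are assumptions made for illustration, not the actual output format:

```python
def prefix_lines(label, text):
    """Prepend 'label: ' to every line so interleaved output from
    parallel jobs can still be attributed to its source."""
    return "".join(
        f"{label}: {line}" for line in text.splitlines(keepends=True)
    )


# Hypothetical output captured from two parallel jobs.
print(prefix_lines("part0", "compacting\ndone\n"), end="")
print(prefix_lines("part1", "compacting\ndone\n"), end="")
```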
|
|
Copying an entire Xapian DB takes a long time, so update our
reindexing code to support partial reindexing, snapshot the
pre-copydatabase git revisions, perform the lengthy copy,
and do a partial reindex when the copy + renames are done.
|
|
This is preparation to support partial reindexing.
|
|
To minimize the delay on active inboxes, it's actually ideal to
run xapian-compact at the end of the per-partition cpdb process;
since the new DB isn't accessible yet and so we don't have to
deal with lock contention with -mda or -watch processes. The
downside is the temporary file overhead required (3x instead of 2x).
|
|
By avoiding copydatabase(1) entirely, we can make further changes
to avoid locking the entire inbox for a long operation and
switch to fine-grained locking.
|
|
We will be reindexing after copydatabase
|
|
We move the old directory into the new directory, so avoid the
situation where a bug or error could cause the tempdir cleanup to run
and destroy both our old and new directories.
|
|
copydatabase(1) is exceptionally noisy and its output is
confusing when run in parallel. Support redirects at least, and
env while we're at it to give us future options.
We can also stuff a -jobs parameter into the options to limit
parallelism since it can be useful for low-priority upgrade
jobs.
|
|
Both of these index-affecting commands should work similarly
on the command-line.
public-inbox-index no longer complains about unconfigured
~/.public-inbox/config; but often I found myself being
annoyed by that, anyways...
|
|
Port public-inbox-compact(1) over to using it, and we will need
to wrap copydatabase(1) to ease glass migrations, too.
|
|
This is assuming nobody uses flint or earlier, anymore;
as flint predates the existence of this project.
|
|
In retrospect, introducing V1Writable was unnecessary and
InboxWritable->importer is in a better position to abstract
away differences between v1 and v2 writers.
So teach InboxWritable to initialize inboxes and get rid
of V1Writable.
|
|
Enabling deprecation warnings didn't seem to have any
noticeable effects with "perl -w -c", so whatever reason
Danga had for it is long irrelevant.
|
|
Unlike Danga::Socket, we do not support TCP_CORK, either
|
|
It was only relevant to Danga::Socket.
|
|
It's easy enough to wrap FDs in classes that can use
all of the functionality of the event loop, not just
the read-only interface AddOtherFds provided.
|
|
"make syntax" is clean, now
|
|
It's a non-standard package on CentOS-7, actually; and we
shouldn't bloat the PSGI server by loading a module which
isn't strictly needed.
|
|
git < 2.5.0 was missing --git-path support. This means any
users relying on some rare environment variables will need git
2.5.0+
|
|
* origin/xap-optional:
  admin: improve warnings and errors for missing modules
  searchidx: do not create empty Xapian partitions for basic
  lazy load Xapian and make it optional for v2
  www: use Inbox->over where appropriate
  nntp: use Inbox->over directly
  inbox: add ->over method to ease access
|
|
We only compress in OverIdx, now; since we no longer do overview
stuff in Xapian (and Xapian compresses document data, anyways).
|
|
Consolidate subject handling in the add function to make it easier to
read and understand.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
This allows searching for terms with "+" in them properly.
|
|
This was buggy and was causing non-diff text to have extra
leading spaces. The diff parsing code needs to be cleaned up,
so this will be fixed, later.
This reverts commit 1a67b91c1326efa372d1ec957e2494849d894f0b.
|
|
There probably needs to be an option to enable this
independently of indexlevel; but for now this is
the safest option.
And, as I discovered during the development of the
indexlevel option, Xapian does a pretty good job of
finding phrases without position data, anyways.
|
|
"git format-patch --interdiff" and similar can prefix diffs
with leading white space. Teach our diff parser to account
for it and set appropriate CSS classes for them.
|
|
It's only useful for a corner case in long-running daemons when
an admin decides to compact or vacuum a Xapian or SQLite DB.
As a result, other scripts should run slightly faster. For
instance, this saves about 80ms (2.710s => 2.630s) in t/mda.t
on my remote workstation.
While we're at it, make sure EvCleanup is properly require'd
in Daemon.pm and HTTP.pm and document our use of Devel::Peek.
|
|
We no longer need it since Inbox->recent hits the overview
DB instead of Xapian
|
|
Since we lazy-load Xapian now, some errors may become
more cryptic or buried. Try to improve that by making
Admin show better errors.
|
|
No point in leaving a mess of empty directories when Xapian
doesn't load.
|
|
More tests work without Search::Xapian, now.
Usability issues still need to be fixed
|
|
We don't need to rely on Xapian search functionality for the
majority of the WWW code, even. subject_normalized is moved to
SearchMsg, where it (probably) makes more sense, anyways.
|
|
None of the NNTP code actually relies on Xapian, anymore.
|
|
One small step towards making installing Xapian optional for v2
and providing more WWW and NNTP functionality without it.
|
|
We were reindexing the full history every invocation of -index
when Xapian was not used because we were incorrectly relying on
'last_commit' metadata stored in Xapian.
Rewrite the indexing logic to be less confusing while we're
at it, since we rely on `git merge-base --is-ancestor' nowadays.
Furthermore, we need to handle message removals from the
overview index correctly when Xapian is not in use.
Co-authored-by: Eric W. Biederman <ebiederm@xmission.com>
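
For reference, `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B, 1 when it is not, and with another status on error. A minimal sketch of interpreting that, in Python rather than the project's Perl; the helper names and the example path are hypothetical:

```python
def ancestor_cmd(git_dir, older, newer):
    """Build the argv for an ancestry check (helper name is hypothetical)."""
    return ["git", f"--git-dir={git_dir}",
            "merge-base", "--is-ancestor", older, newer]


def interpret_exit(code):
    """Map the exit status to a boolean; anything but 0 or 1 is an error."""
    if code == 0:
        return True    # older is an ancestor: index only the new range
    if code == 1:
        return False   # history diverged: a fuller reindex is needed
    raise RuntimeError(f"git merge-base failed with status {code}")


# Usage (requires git and a real repository, so it is not run here):
#   import subprocess
#   rc = subprocess.run(ancestor_cmd("/path/to/inbox.git", old, new)).returncode
#   incremental = interpret_exit(rc)
```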
|
|
Avoiding reliance on environment variables is a bit cleaner
for writing tests
|
|
Import initialization is a little strange for historical reasons, but we
also can't change it too much because it's technically a public
API which external code may rely on...
And we may need to support v1 repos indefinitely. This should
make it easier to write tests for both formats.
|
|
* origin/danga-bundle:
  DS: epoll: fix misordered EPOLL_CTL_DEL call
  DS: drop unused "_undef" sub
  syscall: drop readahead wrapper
  build: do not manify DS and Syscall pods
  DS: handle EINTR in IO::Poll path, too
  DS: workaround IO::Kqueue EINTR (mis-)handling
  DS: drop profiling support
  DS: remove unused fields and functions
  listener: use EPOLLEXCLUSIVE for listen sockets
  bundle Danga::Socket and Sys::Syscall
|
|
Any operations on an fd after POSIX::close() are invalid, so
epoll_ctl will fail. Worse, in a multi-threaded Perl, the
fd may be reused by another thread and EPOLL_CTL_DEL can hit the
wrong file description as a result.
cf. https://rt.cpan.org/Ticket/Display.html?id=129487
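
The required ordering (deregister first, close second) can be sketched with Python's standard `selectors` module. This is an analogy to the fix described above, not the DS code itself, and `close_watched` is a hypothetical helper:

```python
import selectors
import socket


def close_watched(sel, sock):
    """Remove sock from the event loop while its fd is still valid,
    and only then close it."""
    sel.unregister(sock)  # the fd still refers to our socket here
    sock.close()          # the fd number may be reused after this point


sel = selectors.DefaultSelector()
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)

close_watched(sel, a)
remaining = len(sel.get_map())  # number of sockets still being watched

b.close()
sel.close()
```

Reversing the two calls in `close_watched` would hand an already-closed descriptor to the selector, which is the class of bug the commit describes.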
|
|
No longer used since we removed the *_ip_string fields
|
|
No backwards compatibility to worry about for us; and fadvise
is superior anyways.
|
|
IO::Poll::_poll returns -1, which is "true" to Perl.
cf. https://rt.cpan.org/Ticket/Display.html?id=129484
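
The same pitfall exists in any language where -1 is truthy. A minimal Python sketch, assuming a hypothetical poll wrapper that returns -1 on error (for example when interrupted by a signal) and the number of ready descriptors otherwise:

```python
def ready_count_naive(ret):
    """Buggy: treats any truthy return as success, and -1 is truthy."""
    return ret if ret else 0


def ready_count_checked(ret):
    """Correct: only a positive return means descriptors are ready."""
    return ret if ret > 0 else 0


interrupted = -1  # what the hypothetical wrapper returns on EINTR
print(ready_count_naive(interrupted), ready_count_checked(interrupted))
```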
|
|
We'll ignore blank lines from clients, since that's what innd
seems to do.
|
|
It's unneeded since commit e358bd7a3833f8c5 (2016-07-02)
("inbox: base_url method takes PSGI env hashref instead")
So we only depend on URI::Escape from the "URI" CPAN distribution,
at the moment.
|
|
Noticed while testing on FreeBSD 11.2 amd64 with the optional
Inline::C extension using clang 6.0.0. The end result on
FreeBSD was that spawning processes failed badly and things were
immediately unusable with this enabled.
av_len is a misleading API, and I failed to read the API
comments in perl:/av.c which state:
> Note that, unlike what the name implies, it returns
> the highest index in the array, so to get the size of
> the array you need to use "av_len(av) + 1".
> This is unlike "sv_len", which returns what you would expect.
If this bug affected anybody, it would've only affected users
who both use the optional Inline::C module AND set the
PERL_INLINE_DIRECTORY environment variable.
That said, I've never seen any evidence of it on Debian
GNU/Linux + gcc on any x86 variant. That includes full 64-bit
systems, a full 32-bit system, a 64-bit system with 32-bit
userspace, across multiple gcc versions since 2016.
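
The off-by-one quoted from av.c can be illustrated outside of Perl's C API. In this Python sketch, `highest_index` is a hypothetical stand-in that mimics the documented av_len behavior (highest index, or -1 for an empty array); it is not the real function:

```python
def highest_index(items):
    """Mimic av_len semantics: the last valid index, -1 when empty."""
    return len(items) - 1


def count_wrong(items):
    """The mistake: using the highest index as if it were the size."""
    return highest_index(items)


def count_right(items):
    """The documented fix: size is highest index plus one."""
    return highest_index(items) + 1


argv = ["git", "cat-file", "--batch"]
print(count_wrong(argv), count_right(argv))
```

Sizing a buffer with the wrong count leaves it one element short, which can go unnoticed on some platforms and compilers while failing on others.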
|
|
This can help users track down the source of warnings
when presented with imperfect emails.
While we're at it, make the __WARN__ callback in t/v2writable.t
a no-op since we don't check for warnings, there.
|
|
* origin/wwwlisting:
  www: support listing of inboxes
  start depending on Perl 5.10.1+
|
|
IO::Kqueue seems unmaintained, so work around a long-standing
bug where it falls over on signals:
https://rt.cpan.org/Ticket/Display.html?id=116615
|
|
There are other ways to profile and we don't need to add runtime
branches to do this.
|