path: root/lib/PublicInbox
Date Commit message
2019-05-23 xcpdb: cleanup error handling and diagnosis
Running a full "public-inbox-index --reindex" in parallel with "public-inbox-xcpdb" on the same inbox can still cause problems, though.
2019-05-23 xcpdb: implement progress reporting
Copying an entire Xapian DB is horribly slow whether it's done via Perl or copydatabase(1). So displaying some progress indication is good for user experience. While we're at it, prefix xapian-compact output, too; since parallel processes end up clobbering each other.
2019-05-23 xcpdb: use fine-grained locking
Copying an entire Xapian DB takes a long time, so update our reindexing code to support partial reindexing, snapshot the pre-copydatabase git revisions, perform the lengthy copy, and do a partial reindex when the copy + renames are done.
2019-05-23 v2writable: hoist out log_range sub for readability
This is preparation to support partial reindexing.
2019-05-23 xapcmd: xcpdb supports compaction
To minimize the delay on active inboxes, it's actually ideal to run xapian-compact at the end of the per-partition cpdb process; since the new DB isn't accessible yet and so we don't have to deal with lock contention with -mda or -watch processes. The downside is temporary file overhead (3x instead of 2x) required.
2019-05-23 xcpdb: implement using Perl bindings
By avoiding copydatabase(1) entirely, we can make further changes to avoid locking the entire inbox for a long operation and switch to fine-grained locking.
2019-05-23 admin: move index_inbox over
We will be reindexing after copydatabase
2019-05-23 xapcmd: do not cleanup on errors
We move the old directory into the new directory, so avoid the situation where a bug or error could cause the tempdir cleanup to run and destroy both our old and new directories.
2019-05-23 xapcmd: support spawn options
copydatabase(1) is exceptionally noisy and its output is confusing when run in parallel. Support redirects at least, and env while we're at it to give us future options. We can also stuff a -jobs parameter into the options to limit parallelism since it can be useful for low-priority upgrade jobs.
2019-05-23 admin: hoist out resolve_inboxes for -compact and -index
Both of these index-affecting commands should work similarly on the command-line. public-inbox-index no longer complains about unconfigured ~/.public-inbox/config; but often I found myself being annoyed by that, anyways...
2019-05-23 xapcmd: new module for wrapping Xapian commands
Port public-inbox-compact(1) over to using it, and we will need to wrap copydatabase(1) to ease glass migrations, too.
2019-05-23 search: reenable phrase search on non-chert Xapian
This is assuming nobody uses flint or earlier, anymore; as flint predates the existence of this project.
2019-05-23 v1writable: retire in favor of InboxWritable
In retrospect, introducing V1Writable was unnecessary and InboxWritable->importer is in a better position to abstract away differences between v1 and v2 writers. So teach InboxWritable to initialize inboxes and get rid of V1Writable.
2019-05-22 DS: warn on deprecations
Enabling deprecation warnings didn't seem to have any noticeable effects with "perl -w -c", so whatever reason Danga had for it is long irrelevant.
2019-05-22 DS: remove IPPROTO_TCP import
Unlike Danga::Socket, we do not support TCP_CORK, either
2019-05-22 DS: drop $VERSION var
It was only relevant to Danga::Socket.
2019-05-22 DS: remove support OtherFds code
It's easy enough to wrap FDs in classes that can use all of the functionality of the event loop, not just the read-only interface AddOtherFds provided.
2019-05-22 DS: get rid of unused methods and aliases
"make syntax" is clean, now
2019-05-22 usercontent: stop relying on autodie
It's a non-standard package on CentOS-7, actually; and we shouldn't bloat the PSGI server by loading a module which isn't strictly needed.
2019-05-22 git: workaround old git-rev-parse(1) (--git-path)
git < 2.5.0 was missing --git-path support. This means any users relying on some rare environment variables will need git 2.5.0+
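The commit message does not say how the workaround is implemented, so the following is only a minimal sketch of one way a caller could detect whether `git rev-parse --git-path` is available. The helper names `parse_git_version` and `supports_git_path` are hypothetical and do not come from public-inbox; the only fact taken from the log is the 2.5.0 threshold.

```python
import re
from typing import Tuple


def parse_git_version(output: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch) from `git --version` style output.

    Hypothetical helper: a missing patch component is treated as 0.
    """
    m = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if m is None:
        raise ValueError(f"no version number found in {output!r}")
    return tuple(int(part) for part in m.groups(default="0"))


def supports_git_path(version: Tuple[int, ...]) -> bool:
    """Per the commit message, --git-path needs git 2.5.0 or newer."""
    return version >= (2, 5, 0)


if __name__ == "__main__":
    for line in ("git version 2.4.12", "git version 2.5.0"):
        print(line, "->", supports_git_path(parse_git_version(line)))
```

Comparing version tuples rather than strings avoids ordering mistakes such as "2.10" sorting before "2.5".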
2019-05-21 Merge remote-tracking branch 'origin/xap-optional' into master
* origin/xap-optional:
  admin: improve warnings and errors for missing modules
  searchidx: do not create empty Xapian partitions for basic
  lazy load Xapian and make it optional for v2
  www: use Inbox->over where appropriate
  nntp: use Inbox->over directly
  inbox: add ->over method to ease access
2019-05-21 searchidx: remove unused Compress::Zlib import
We only compress in OverIdx, now; since we no longer do overview stuff in Xapian (and Xapian compresses document data, anyways).
2019-05-17 PublicInbox::Import::add: Consolidate subject handling
Consolidate subject handling in the add function to make it easier to read and understand.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-05-16 www: unescape '+' => ' ' before general URI unescape
This allows searching for terms with "+" in them properly.
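Why the ordering matters can be shown with a short Python sketch. This is an illustration of the general technique, not the Perl code from the commit; the function names are made up for the example. If percent-decoding runs first, an escaped `%2B` turns into a literal `+` and is then wrongly converted to a space.

```python
from urllib.parse import unquote


def unescape_query(raw: str) -> str:
    """Decode a form-style query value: '+' means space, %XX is an escape."""
    # Step 1: convert literal '+' to ' ' while any %2B is still escaped.
    spaced = raw.replace("+", " ")
    # Step 2: general percent-decoding; %2B now becomes a real '+'.
    return unquote(spaced)


def unescape_query_wrong_order(raw: str) -> str:
    """Counter-example: decoding first makes %2B indistinguishable from '+'."""
    return unquote(raw).replace("+", " ")


if __name__ == "__main__":
    q = "c%2B%2B+templates"
    print(repr(unescape_query(q)))
    print(repr(unescape_query_wrong_order(q)))
```

Python's standard library bundles the same two steps, in the same order, as `urllib.parse.unquote_plus`.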
2019-05-16 Revert "view: perform highlighting for space-prefixed diffs"
This was buggy and was causing non-diff text to have extra leading spaces. The diff parsing code needs to be cleaned up, so this will be fixed, later. This reverts commit 1a67b91c1326efa372d1ec957e2494849d894f0b.
2019-05-16 search: disable phrase searching, for now
There probably needs to be an option to enable this independently of indexlevel; but for now this is the safest option. And, as I discovered during the development of the indexlevel option, Xapian does a pretty good job of finding phrases without position data, anyways.
2019-05-16 view: perform highlighting for space-prefixed diffs
"git format-patch --interdiff" and similar can prefix diffs with leading white space. Teach our diff parser to account for it and set appropriate CSS classes for them.
2019-05-15 remove hard Devel::Peek dependency and lazy load for daemons
It's only useful for a corner case in long-running daemons when an admin decides to compact or vacuum a Xapian or SQLite DB. As a result, other scripts should run slightly faster. For instance, this saves about 80ms (2.710s => 2.630s) in t/mda.t on my remote workstation. While we're at it, make sure EvCleanup is properly require'd in Daemon.pm and HTTP.pm and document our use of Devel::Peek.
2019-05-15 inbox: remove POSIX strftime import
We no longer need it since Inbox->recent hits the overview DB instead of Xapian
2019-05-15 admin: improve warnings and errors for missing modules
Since we lazy-load Xapian now, some errors may become more cryptic or buried. Try to improve that by making Admin show better errors.
2019-05-15 searchidx: do not create empty Xapian partitions for basic
No point in leaving a mess of empty directories when Xapian doesn't load.
2019-05-15 lazy load Xapian and make it optional for v2
More tests work without Search::Xapian, now. Usability issues still need to be fixed
2019-05-15 www: use Inbox->over where appropriate
We don't need to rely on Xapian search functionality for the majority of the WWW code, even. subject_normalized is moved to SearchMsg, where it (probably) makes more sense, anyways.
2019-05-15 nntp: use Inbox->over directly
None of the NNTP code actually relies on Xapian, anymore.
2019-05-15 inbox: add ->over method to ease access
One small step towards making installing Xapian optional for v2 and providing more WWW and NNTP functionality without it.
2019-05-14 searchidx: fix incremental index with indexlevel=basic on v1
We were reindexing the full history every invocation of -index when Xapian was not used because we were incorrectly relying on 'last_commit' metadata stored in Xapian. Rewrite the indexing logic to be less confusing while we're at it, since we rely on `git merge-base --is-ancestor' nowadays. Furthermore, we need to handle message removals from the overview index correctly when Xapian is not in use.
Co-authored-by: Eric W. Biederman <ebiederm@xmission.com>
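The general idea of deciding between an incremental and a full index with `git merge-base --is-ancestor` can be sketched as follows. This is an assumption-laden illustration, not the rewritten Perl logic: `plan_index`, `git_is_ancestor`, and the returned labels are hypothetical names. The only git behavior relied on is that `git merge-base --is-ancestor A B` exits with status 0 when A is an ancestor of B and status 1 when it is not.

```python
import subprocess
from typing import Callable, Optional, Tuple


def git_is_ancestor(git_dir: str, older: str, newer: str) -> bool:
    """True if `older` is an ancestor of `newer` (exit status 0)."""
    cmd = ["git", f"--git-dir={git_dir}", "merge-base", "--is-ancestor", older, newer]
    return subprocess.run(cmd).returncode == 0


def plan_index(
    last_indexed: Optional[str],
    tip: str,
    is_ancestor: Callable[[str, str], bool],
) -> Tuple[str, str]:
    """Return (mode, what_to_walk) for the next indexing run.

    Hypothetical planner: the ancestry check is injected so the decision
    logic can be exercised without a real repository.
    """
    if last_indexed is None:
        return ("full", tip)  # nothing recorded yet
    if last_indexed == tip:
        return ("noop", tip)  # already up to date
    if is_ancestor(last_indexed, tip):
        # History only grew: walk just the new commits.
        return ("incremental", f"{last_indexed}..{tip}")
    # History was rewritten: the recorded commit is unreachable from tip.
    return ("full", tip)
```

With a real repository, `is_ancestor` could be supplied as `lambda a, b: git_is_ancestor("/path/to/inbox.git", a, b)`, where the path is a placeholder.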
2019-05-14 v2writable: allow setting nproc via creat options
Avoiding reliance on environment variables is a bit cleaner for writing tests
2019-05-14 v1writable: new wrapper which is closer to v2writable
Import initialization is a little strange from history, but we also can't change it too much because it's technically a public API which external code may rely on... And we may need to support v1 repos indefinitely. This should make it easier to write tests for both formats.
2019-05-08 Merge remote-tracking branch 'origin/danga-bundle'
* origin/danga-bundle:
  DS: epoll: fix misordered EPOLL_CTL_DEL call
  DS: drop unused "_undef" sub
  syscall: drop readahead wrapper
  build: do not manify DS and Syscall pods
  DS: handle EINTR in IO::Poll path, too
  DS: workaround IO::Kqueue EINTR (mis-)handling
  DS: drop profiling support
  DS: remove unused fields and functions
  listener: use EPOLLEXCLUSIVE for listen sockets
  bundle Danga::Socket and Sys::Syscall
2019-05-08 DS: epoll: fix misordered EPOLL_CTL_DEL call
Any operations on an fd after POSIX::close() are invalid, so epoll_ctl will fail. Worse off, in a multi-threaded Perl, the fd may be reused by another thread and EPOLL_CTL_DEL can hit the wrong file description as a result. cf. https://rt.cpan.org/Ticket/Display.html?id=129487
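The ordering rule described here (deregister from the event loop first, close the descriptor second) can be illustrated with Python's portable `selectors` module. This is a sketch of the principle only, not the DS.pm fix; `close_watched_fd` is a hypothetical helper name.

```python
import os
import selectors


def close_watched_fd(sel: selectors.BaseSelector, fd: int) -> None:
    """Stop watching `fd`, then close it, in that order.

    Closing first would leave the selector referring to a descriptor
    number that the OS is free to hand out again for an unrelated file.
    """
    sel.unregister(fd)  # drop it from the event loop while fd is still valid
    os.close(fd)        # only now release the descriptor


if __name__ == "__main__":
    r, w = os.pipe()
    sel = selectors.DefaultSelector()
    sel.register(r, selectors.EVENT_READ)
    close_watched_fd(sel, r)
    os.close(w)
    sel.close()
```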
2019-05-08 DS: drop unused "_undef" sub
No longer used since we removed the *_ip_string fields
2019-05-08 syscall: drop readahead wrapper
No backwards compatibility to worry about for us; and fadvise is superior anyways.
2019-05-08 DS: handle EINTR in IO::Poll path, too
IO::Poll::_poll returns -1, which is "true" to Perl. cf. https://rt.cpan.org/Ticket/Display.html?id=129484
2019-05-08 nntp: avoid uninitialized variable from blank requests
We'll ignore blank lines from clients, since that's what innd seems to do.
2019-05-07 wwwstream: do not load URI.pm
It's unneeded since commit e358bd7a3833f8c5 (2016-07-02) ("inbox: base_url method takes PSGI env hashref instead"). So we only depend on URI::Escape from the "URI" CPAN distribution, at the moment.
2019-05-07 spawn (Inline::C): fix off-by-one error
Noticed while testing on FreeBSD 11.2 amd64 with the optional Inline::C extension using clang 6.0.0. The end result on FreeBSD was that spawning processes failed badly and things were immediately unusable with this enabled.

av_len is a misleading API, and I failed to read the API comments in perl:/av.c which state:

> Note that, unlike what the name implies, it returns
> the highest index in the array, so to get the size of
> the array you need to use "av_len(av) + 1".
> This is unlike "sv_len", which returns what you would expect.

If this bug affected anybody, it would've only affected users who both used the optional Inline::C module AND set the PERL_INLINE_DIRECTORY environment variable.

That said, I've never seen any evidence of it on Debian GNU/Linux + gcc on any x86 variant. That includes full 64-bit systems, a full 32-bit system, and a 64-bit system with 32-bit userspace, across multiple gcc versions since 2016.
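The class of mistake the quoted av.c comment warns about can be modeled in a few lines of Python. This is an analogy under stated assumptions, not the Inline::C code from the commit: `av_len` below is a stand-in that mimics the Perl C API's "highest index" behavior, and the two copy functions are hypothetical. The log does not say which elements were affected in the real bug.

```python
from typing import List


def av_len(av: List[str]) -> int:
    """Stand-in for Perl's av_len(): highest index, or -1 for an empty array."""
    return len(av) - 1


def copy_args_buggy(av: List[str]) -> List[str]:
    """Off-by-one: treats the highest index as the element count."""
    count = av_len(av)
    return [av[i] for i in range(count)]


def copy_args_fixed(av: List[str]) -> List[str]:
    """Correct: the element count is av_len(av) + 1."""
    count = av_len(av) + 1
    return [av[i] for i in range(count)]


if __name__ == "__main__":
    argv = ["git", "cat-file", "--batch"]
    print(copy_args_buggy(argv))
    print(copy_args_fixed(argv))
```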
2019-05-06 index: warn with info about the message as context
This can help users track down the source of warnings when presented with imperfect emails. While we're at it, make the __WARN__ callback in t/v2writable.t a no-op since we don't check for warnings, there.
2019-05-05 Merge remote-tracking branch 'origin/wwwlisting'
* origin/wwwlisting:
  www: support listing of inboxes
  start depending on Perl 5.10.1+
2019-05-05 DS: workaround IO::Kqueue EINTR (mis-)handling
IO::Kqueue seems unmaintained, so workaround a long-standing bug where it falls over on signals: https://rt.cpan.org/Ticket/Display.html?id=116615
2019-05-05 DS: drop profiling support
There are other ways to profile, and we don't need to add runtime branches to do this.