path: root/t
Date Commit message
2019-06-04 t: avoid "subtest" for Perl 5.10.1 compatibility
The version of Test::More from Perl 5.10.1 did not support "subtest", and the earliest version which did is Perl 5.12.0. The good news is this gives me an excuse to parallelize the indexlevels-mirror test by splitting it into two. (it could be further split, even). Update t/nntpd. to use PI_TEST_VERSION consistently while we're at it.
2019-06-04 linkify: support Internationalized Domain Names in URLs
The "\w" character class in Perl matches any word characters in the Unicode database, not just ASCII characters. So we must be prepared for that and generate links to IDNs.
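The same behaviour can be sketched in Python, whose `\w` is also Unicode-aware for `str` patterns. This is only an illustration, not the project's Perl linkifier; `URL_RE` and `find_urls` are hypothetical names and the pattern is deliberately simplified.

```python
import re

# Hypothetical, simplified URL pattern: a scheme, then a host made of
# word characters, dots and hyphens, then an optional path.
# For str patterns, \w matches Unicode word characters in Python 3,
# so hosts such as "bücher.example" are accepted.
URL_RE = re.compile(r"\bhttps?://[\w.-]+(?:/[^\s<>]*)?")


def find_urls(text):
    """Return every URL-looking substring found in text."""
    return URL_RE.findall(text)
```

A linkifier built on an ASCII-only host pattern would stop at the first non-ASCII character and produce a truncated link.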
2019-06-03 t/psgi_search.t: require DBD::SQLite
In case we encounter an odd system which has Search::Xapian but not DBD::SQLite.
2019-06-01 ds: fix and test for FD leaks with kqueue on ->Reset
Even though we currently don't use it repeatedly, ->Reset should close() kqueue FDs and not cause the process to run out of descriptors. Add a close-on-exec test while we're at it.
2019-06-01 git: unconditional expiry
A constant stream of traffic to either httpd/nntpd would mean git-cat-file processes never expire. Things can go bad after a full repack, as a full repack will unlink old pack indices and git-cat-file does not currently detect unlinked files. We could do something complicated by recursively stat-ing objects/pack of every git directory and alternate; but that's probably not worth the trouble compared to occasionally restarting the cat-file process. So simplify the code and let httpd/nntpd expire them periodically, since spawning a "git-cat-file --batch" process isn't too expensive. We already spawn for every request which hits git-http-backend, cgit, and git-apply. In the future, we may optionally support the Git::Raw module to avoid IPC; but we must remain careful to not leave lingering FDs open to unlinked files after repack.
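The expiry policy described here, age measured from spawn time rather than from last use, can be sketched in Python. This is an illustration under assumptions, not the project's Perl code: `ExpiringWorker`, `spawn`, `max_age` and `clock` are all hypothetical names.

```python
import time


class ExpiringWorker:
    """Holds a long-lived helper and drops it after max_age seconds.

    Age is counted from spawn time, not from last use, so a constant
    stream of requests cannot keep a stale helper alive forever.
    """

    def __init__(self, spawn, max_age=60.0, clock=time.monotonic):
        self._spawn = spawn      # callable returning a new helper
        self._max_age = max_age
        self._clock = clock
        self._helper = None
        self._born = 0.0

    def get(self):
        """Return the current helper, spawning one if needed."""
        if self._helper is None:
            self._helper = self._spawn()
            self._born = self._clock()
        return self._helper

    def expire(self):
        """Call periodically; drops the helper once it is too old."""
        if self._helper is None:
            return
        if self._clock() - self._born >= self._max_age:
            close = getattr(self._helper, "close", None)
            if close:
                close()
            self._helper = None
```

Because `get()` never refreshes the timestamp, the next request after an expiry simply pays the cost of one respawn.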
2019-05-29 searchidx: store indexlevel=medium as metadata
And use it from Admin. It's easy to tell what indexlevel=basic is from unconfigured inboxes, but distinguishing between 'medium' and 'full' would require stat()-ing position.* files which is fragile and Xapian-implementation-dependent. So use the metadata facility of Xapian and store it in the main partition so Admin tools can deal better with unconfigured inboxes copied using generic tools like cp(1) or rsync(1).
2019-05-27 v2: fix reindex skipping NNTP article numbers
`public-inbox-index --reindex' could cause NNTP article number gaps to form when it also has to deal with new, never-before-seen commits in mirrors running off `git fetch'. Fix this by running two distinct invocations of ->index_sync; once to only reindex old commits, and a second time to index new commits. This does not appear to be a problem on v1 at the moment, but I'll need more time to analyze this.
2019-05-27 t/v1reindex.t: fix typo in setting `indexlevel'
It did not cause a test failure because the default fallback is `indexlevel=full'.
2019-05-25 t/indexlevels: fix indexlevel of ro_mirror
Don't hard-code "basic", since we already ran -init with the intended indexlevel.
2019-05-23 xcpdb: implement progress reporting
Copying an entire Xapian DB is horribly slow whether it's done via Perl or copydatabase(1). So displaying some progress indication is good for user experience. While we're at it, prefix xapian-compact output, too; since parallel processes end up clobbering each other.
2019-05-23 xcpdb: new tool which wraps Xapian's copydatabase(1)
copydatabase(1) is an existing Xapian tool which is the recommended way to upgrade existing DBs to the latest Xapian database format (currently "glass" for stable/released versions). Our use of Xapian relies on preserving document IDs, so we'll wrap it like we do xapian-compact(1) and use the "--no-renumber" switch. I could not name the tool "public-inbox-copydatabase" since it would be ambiguous as to which DB it's actually copying. So, I abbreviated the suffix to "xcpdb" (Xapian CoPy DataBase), which I hope is acceptable and unambiguous.
2019-05-23 search: reenable phrase search on non-chert Xapian
This is assuming nobody uses flint or earlier, anymore; as flint predates the existence of this project.
2019-05-23 v1writable: retire in favor of InboxWritable
In retrospect, introducing V1Writable was unnecessary and InboxWritable->importer is in a better position to abstract away differences between v1 and v2 writers. So teach InboxWritable to initialize inboxes and get rid of V1Writable.
2019-05-23 t/convert-compact: skip on missing xapian-compact(1)
Can't run the test if the required Xapian tools are missing.
2019-05-22 t/search*: require DBI and DBD::SQLite, too
None of the Search::Xapian-dependent stuff works without DBI and DBD::SQLite. There are no plans to support Xapian w/o DBD::SQLite since SQLite is more common and less resource-intensive than Xapian.
2019-05-22 t/watch_filter_rubylang: disable v2 test for git < 2.6
This test was not disabled properly for ancient versions of git without get-mark support.
2019-05-21 Merge remote-tracking branch 'origin/xap-optional' into master
* origin/xap-optional:
  admin: improve warnings and errors for missing modules
  searchidx: do not create empty Xapian partitions for basic
  lazy load Xapian and make it optional for v2
  www: use Inbox->over where appropriate
  nntp: use Inbox->over directly
  inbox: add ->over method to ease access
2019-05-15 remove hard Devel::Peek dependency and lazy load for daemons
It's only useful for a corner case in long-running daemons when an admin decides to compact or vacuum a Xapian or SQLite DB. As a result, other scripts should run slightly faster. For instance, this saves about 80ms (2.710s => 2.630s) in t/mda.t on my remote workstation. While we're at it, make sure EvCleanup is properly require'd in Daemon.pm and HTTP.pm and document our use of Devel::Peek.
2019-05-15 searchidx: do not create empty Xapian partitions for basic
No point in leaving a mess of empty directories when Xapian doesn't load.
2019-05-15 lazy load Xapian and make it optional for v2
More tests work without Search::Xapian, now. Usability issues still need to be fixed.
2019-05-15 www: use Inbox->over where appropriate
We don't need to rely on Xapian search functionality for the majority of the WWW code, even. subject_normalized is moved to SearchMsg, where it (probably) makes more sense, anyways.
2019-05-14 tests: get rid of unnecessary Cwd module use
We only need it for tests that chdir, and maybe for ENV{PATH} portability (dash seems fine, not sure about others). v2: revert change to solver_git.t for FreeBSD 11.2 and document
2019-05-14 t/nntp.t: skip if Data::Dumper is missing
We can revisit this, later; but Data::Dumper requires a separate package for CentOS-7 users, at least.
2019-05-14 t/config.t: remove Data::Dumper dependency
CentOS-7 needs the perl-Data-Dumper package, and the test is small enough to roll our own escaping, here.
2019-05-14 tests: remove unnecessary loading of ::DS and Socket
PublicInbox::DS works for every platform we care about, nowadays; so checking for it is a waste of time. Clean up a few POSIX and Socket imports while we're in the area.
2019-05-14 searchidx: fix incremental index with indexlevel=basic on v1
We were reindexing the full history every invocation of -index when Xapian was not used because we were incorrectly relying on 'last_commit' metadata stored in Xapian. Rewrite the indexing logic to be less confusing while we're at it, since we rely on `git merge-base --is-ancestor' nowadays. Furthermore, we need to handle message removals from the overview index correctly when Xapian is not in use.
Co-authored-by: Eric W. Biederman <ebiederm@xmission.com>
2019-05-14 v2writable: allow setting nproc via creat options
Avoiding reliance on environment variables is a bit cleaner for writing tests.
2019-05-08 t/purge.t: fix unreferenced variable
2019-05-08 Merge remote-tracking branch 'origin/danga-bundle'
* origin/danga-bundle:
  DS: epoll: fix misordered EPOLL_CTL_DEL call
  DS: drop unused "_undef" sub
  syscall: drop readahead wrapper
  build: do not manify DS and Syscall pods
  DS: handle EINTR in IO::Poll path, too
  DS: workaround IO::Kqueue EINTR (mis-)handling
  DS: drop profiling support
  DS: remove unused fields and functions
  listener: use EPOLLEXCLUSIVE for listen sockets
  bundle Danga::Socket and Sys::Syscall
2019-05-06 index: warn with info about the message as context
This can help users track down the source of warnings when presented with imperfect emails. While we're at it, make the __WARN__ callback in t/v2writable.t a no-op since we don't check for warnings, there.
2019-05-05 t/search.t: fix permissions check on FreeBSD
FreeBSD does not allow non-root users to set S_ISGID; so git skips this bit on FreeBSD and Debian/kFreeBSD platforms.
2019-05-04 bundle Danga::Socket and Sys::Syscall
These modules are unmaintained upstream at the moment, but I'll be able to help with the intended maintainer once/if CPAN ownership is transferred. OTOH, we've been waiting for that transfer for several years, now...
Changes I intend to make:
* EPOLLEXCLUSIVE for Linux
* remove unused fields wasting memory
* kqueue bugfixes e.g. https://rt.cpan.org/Ticket/Display.html?id=116615
* accept4 support
And some lower priority experiments:
* switch to EV_ONESHOT / EPOLLONESHOT (incompatible changes)
* nginx-style buffering to tmpfile instead of string array
* sendfile off tmpfile buffers
* io_uring maybe?
2019-04-19 t/hl_mod: workaround w3m not handling &apos;
This fixes a test failure on my Debian buster system. Bug report filed for w3m to handle "&apos;": https://bugs.debian.org/927409 and for "highlight" to favor "&#39;" in case other browsers fail: https://bugs.debian.org/927410
2019-04-18 linkify: require parentheses pairs in URLs
Dangling parentheses with trailing punctuation usually mean the parentheses are not intended as part of the URL.
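One way to picture the rule is as a post-processing step on a candidate URL. The Python sketch below is an assumption-laden illustration, not the project's Perl linkifier (which may enforce pairing inside its regular expression); `trim_url` is a hypothetical helper.

```python
def trim_url(url):
    """Drop trailing punctuation and any ')' that has no matching '('."""
    punct = ".,;:!?"
    url = url.rstrip(punct)
    # Keep ")" only while every closing parenthesis has an opening one.
    while url.endswith(")") and url.count(")") > url.count("("):
        url = url[:-1].rstrip(punct)
    return url
```

With this rule, "(see http://example.com/a)." links only `http://example.com/a`, while a URL ending in `Perl_(language)` keeps its balanced pair.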
2019-04-18 linkify: don't get confused by URLs in Perl code, at least
The URLs at the top of WwwStream.pm weren't getting linkified correctly.
2019-04-18 git: calculate modified time of repository
This will be used for generating an HTML listing for v1 inboxes, at least. The logic for this follows that of grokmirror, and we may dynamically generate manifest.js.gz natively...
2019-04-16 cleanup: use '$ibx' consistently when referring to Inbox refs
'$inbox' is more human-readable, so reserve it for the human-readable inbox name in most cases. Making our variable naming more consistent should make the code easier to review and harder to screw up.
2019-04-04 spawn: require soft and hard entries in RLIMIT_* handling
Our high-level config already treats single limits as a soft==hard limit for limiters; so stop handling that redundantly in the low-level spawn() sub.
2019-04-04 spawn: support RLIMIT_CPU, RLIMIT_DATA and RLIMIT_CORE
We'll be spawning cgit and git-diff, which can take gigantic amounts of CPU time and/or heap given the right (ermm... wrong) input. Limit the damage that large/expensive diffs can cause.
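As a rough Python analogue of limiting a spawned child (POSIX only, and not the project's own spawn() sub), the standard `resource` module can apply limits between fork and exec. `run_limited` and its `limits` mapping are hypothetical names; each limit takes an explicit (soft, hard) pair.

```python
import resource
import subprocess
import sys


def run_limited(argv, limits):
    """Run argv with resource limits applied in the child only.

    limits maps an RLIMIT_* constant to a (soft, hard) pair; both
    entries are required, as with setrlimit(2).
    """
    def apply_limits():
        # Runs in the child after fork() and before exec().
        for which, (soft, hard) in limits.items():
            resource.setrlimit(which, (soft, hard))

    return subprocess.run(argv, preexec_fn=apply_limits,
                          capture_output=True, text=True)
```

A caller might pass `{resource.RLIMIT_CPU: (10, 10), resource.RLIMIT_CORE: (0, 0)}` so that a runaway diff is killed after ten CPU-seconds and leaves no core file; the parent process keeps its own limits.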
2019-04-04 git: add "commit_title" method
This will be useful for extracting titles/subjects from commit objects when displaying commits.
2019-02-13 ensure bytes::length is available to callers
We were relying on Danga::Socket using the "bytes" pragma, previously. Nowadays, the "bytes" pragma is not recommended in general, but bytes::length remains acceptable for getting the byte-size of a scalar.
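The distinction between character count and byte size is easy to show in Python. This is only an analogy to Perl's bytes::length, assuming UTF-8 as the encoding; `byte_length` is a hypothetical helper.

```python
def byte_length(s):
    """Number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


# "na\u00efve" has five characters, but the i-diaeresis takes two bytes.
chars = len("na\u00efve")
size = byte_length("na\u00efve")
```

The byte size is what matters for things like buffer accounting or length headers, where counting characters would under-report non-ASCII text.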
2019-02-07 t/perf-msgview.t: fix broken performance test
WwwStream started depending on the WWW::style method for configurable CSS, so mock ::style so the benchmark runs properly.
Fixes: f026dbdd392c9dd5 ('www: admin-configurable CSS via "publicinbox.css"')
2019-02-07 t/perf-msgview: don't warn about --unordered if skipping
No point in making noise about something that isn't used.
2019-02-05 hlmod: support "```$LANG" blocks in text
This is compatible with Markdown; but we still keep the WYSIWYG nature of plain-text with this. This is only intended for use with our documentation. Enabling any type of Markdown support for emails can lead to incompatibilities or interoperability problems with alternative implementations.
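Locating such blocks in otherwise plain text might look like the following Python sketch. It is not the project's HlMod code: `FENCE_RE` and `fenced_blocks` are hypothetical, and the pattern assumes the language tag is a single word and the closing fence sits alone on its line.

```python
import re

# An opening fence with a language tag at the start of a line, the
# block body (matched lazily), then a closing fence on its own line.
FENCE_RE = re.compile(r"^```(\w+)\n(.*?)^```$", re.M | re.S)


def fenced_blocks(text):
    """Return (lang, code) pairs for each tagged fenced block in text."""
    return [(m.group(1), m.group(2)) for m in FENCE_RE.finditer(text)]
```

Only the matched spans would be handed to a highlighter; everything outside them stays untouched, which preserves the plain-text layout.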
2019-02-05 hlmod: make into a singleton
It turns out there's no point in having multiple instances of this or having to worry about destruction or destruction ordering. This will make it easier to reuse the one instance we have across different modules.
2019-02-05 hlmod: hoist out do_hl_lang sub
We'll want to use it to support highlighting syntax used by Markdown and possibly other markup languages (while retaining the raw plain-text layout and formatting).
2019-02-05 viewvcs: cleanup utf8 handling
Favor in-place utf8::decode since it's a bit faster without method dispatch overhead; and don't care about validity just yet. HlMod->do_hl itself should return "utf8" strings, since other parts of our code can use it, so it's not the job of ViewVCS to post-process HlMod output.
2019-02-01 newswww: add /$MESSAGE_ID global redirector endpoint
This is the fallback for the normal WWW endpoint. Adding this to the top-level seems to be alright, since lynx and w3m both understand nntp://<HOSTNAME>/<Message-ID> anyways. If newsgroup and inbox names conflict, then consider it the fault of the original sender. Since NewsWWW is intended to support buggy linkifiers in mail clients, they can interpret nntp:// URLs as http://<HOSTNAME>/<Message-ID>
Inbox ordering from the config file is preserved since commit cfa8ff7c256e20f3240aed5f98d155c019788e3b ("config: each_inbox iteration preserves config order"), so admins can rely on that to configure how scanning works.
Requested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
cf. https://public-inbox.org/meta/20190107190719.GE9442@pure.paranoia.local/
nntp://news.public-inbox.org/20190107190719.GE9442@pure.paranoia.local
2019-02-01 linkify: support proto://hostname without trailing slash
Sometimes users will write "http://example.com" without the trailing slash, which every browser and tool I've tested seems to understand.
2019-02-01 hval: routines for attribute escaping
We'll use HTML attributes + anchor links to link to filenames in coming commits.
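For illustration only, attribute escaping with Python's standard library looks like this. It is not the project's Hval code; `attr_escape` and `file_anchor` are hypothetical helpers, and using the filename directly as the `id` is an assumption.

```python
import html


def attr_escape(value):
    """Escape value for use inside a double-quoted HTML attribute."""
    return html.escape(value, quote=True)


def file_anchor(filename):
    """Build an anchor whose id and link target are the escaped filename."""
    esc = attr_escape(filename)
    return '<a id="%s" href="#%s">%s</a>' % (esc, esc, esc)
```

Escaping quotes as well as `&`, `<` and `>` is what separates attribute escaping from plain text escaping: an unescaped `"` in a filename would otherwise terminate the attribute early.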