Date | Commit message |
|
The version of Test::More from Perl 5.10.1 did not support
"subtest", and the earliest version which did is Perl 5.12.0.
The good news is this gives me an excuse to parallelize
the indexlevels-mirror test by splitting it into two.
(It could be split further, even.)
Update t/nntpd. to use PI_TEST_VERSION consistently while
we're at it.
|
|
The "\w" character class in Perl matches any word characters
in the Unicode database, not just ASCII characters. So we
must be prepared for that and generate links to IDNs.
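The same pitfall can be illustrated outside Perl. The following is a minimal sketch in Python, whose `re` module also lets `\w` match Unicode word characters on decoded strings unless told otherwise. The pattern and the `host_of` helper are hypothetical and are not taken from the project's linkifier.

```python
import re

# \w matches Unicode word characters by default on str patterns,
# so a host pattern built on it will also capture IDN labels.
UNICODE_HOST = re.compile(r"https?://([\w.-]+)")

# re.ASCII restricts \w to [A-Za-z0-9_], which cuts an IDN short.
ASCII_HOST = re.compile(r"https?://([\w.-]+)", re.ASCII)


def host_of(text, pattern):
    """Return the first host captured by pattern, or None (hypothetical helper)."""
    m = pattern.search(text)
    return m.group(1) if m else None
```

With the Unicode-aware pattern the whole internationalized host is captured, so whatever generates the link has to be ready to emit it; the ASCII-only pattern stops at the first non-ASCII character.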
|
|
In case we encounter an odd system which has Search::Xapian
but not DBD::SQLite.
|
|
Even though we currently don't use it repeatedly, ->Reset
should close() kqueue FDs and not cause the process to run
out of descriptors.
Add a close-on-exec test while we're at it.
|
|
A constant stream of traffic to either httpd/nntpd would mean
git-cat-file processes never expire. Things can go bad after a
full repack, as a full repack will unlink old pack indices and
git-cat-file does not currently detect unlinked files.
We could do something complicated by recursively stat-ing
objects/pack of every git directory and alternate;
but that's probably not worth the trouble compared to
occasionally restarting the cat-file process.
So simplify the code and let httpd/nntpd expire them
periodically, since spawning a "git-cat-file --batch" process
isn't too expensive. We already spawn for every request which
hits git-http-backend, cgit, and git-apply.
In the future, we may optionally support the Git::Raw module
to avoid IPC; but we must remain careful to not leave lingering
FDs open to unlinked files after repack.
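The age-based expiry described above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's Perl code: the `ExpiringWorker` class, the `spawn` factory, and the `max_age` parameter are all assumptions made for the example.

```python
import time


class ExpiringWorker:
    """Hand out a helper process, replacing it once it is older than max_age.

    Expiry is based on age, not idleness, so constant traffic cannot
    keep a stale process (and its open FDs) alive forever.
    """

    def __init__(self, spawn, max_age, clock=time.monotonic):
        self.spawn = spawn      # hypothetical factory returning an object with close()
        self.max_age = max_age  # seconds a process may live
        self.clock = clock      # injectable for testing
        self.proc = None
        self.started = None

    def get(self):
        now = self.clock()
        if self.proc is not None and now - self.started >= self.max_age:
            # Closing drops any FDs still pointing at unlinked pack files.
            self.proc.close()
            self.proc = None
        if self.proc is None:
            self.proc = self.spawn()
            self.started = now
        return self.proc
```

The trade-off is an occasional respawn in exchange for not having to stat every pack directory and alternate.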
|
|
And use it from Admin.
It's easy to tell what indexlevel=basic is from unconfigured
inboxes, but distinguishing between 'medium' and 'full' would
require stat()-ing position.* files which is fragile and
Xapian-implementation-dependent.
So use the metadata facility of Xapian and store it in the main
partition so Admin tools can deal better with unconfigured
inboxes copied using generic tools like cp(1) or rsync(1).
|
|
`public-inbox-index --reindex' could cause NNTP article number
gaps to form when it also has to deal with new,
never-before-seen commits in mirrors running off `git fetch'.
Fix this by running two distinct invocations of ->index_sync;
once to only reindex old commits, and a second time to index
new commits.
This does not appear to be a problem on v1 at the moment,
but I'll need more time to analyze this.
|
|
It did not cause a test failure because the default fallback
is `indexlevel=full'.
|
|
Don't hard-code "basic", since we already ran -init with the
intended indexlevel.
|
|
Copying an entire Xapian DB is horribly slow whether it's done
via Perl or copydatabase(1). So displaying some progress
indication is good for user experience.
While we're at it, prefix xapian-compact output, too; since
parallel processes end up clobbering each other.
|
|
copydatabase(1) is an existing Xapian tool which is the
recommended way to upgrade existing DBs to the latest Xapian
database format (currently "glass" for stable/released
versions). Our use of Xapian relies on preserving document IDs,
so we'll wrap it like we do xapian-compact(1) and use the
"--no-renumber" switch.
I could not name the tool "public-inbox-copydatabase" since it
would be ambiguous as to which DB it's actually copying. So, I
abbreviated the suffix to "xcpdb" (Xapian CoPy DataBase), which
I hope is acceptable and unambiguous.
|
|
This is assuming nobody uses flint or earlier, anymore;
as flint predates the existence of this project.
|
|
In retrospect, introducing V1Writable was unnecessary and
InboxWritable->importer is in a better position to abstract
away differences between v1 and v2 writers.
So teach InboxWritable to initialize inboxes and get rid
of V1Writable.
|
|
Can't run the test if the required Xapian tools are missing.
|
|
None of the Search::Xapian-dependent stuff works without DBI
and DBD::SQLite.
There are no plans to support Xapian w/o DBD::SQLite since
SQLite is more common and less resource-intensive than Xapian.
|
|
This test was not disabled properly for ancient versions of
git without get-mark support.
|
|
* origin/xap-optional:
admin: improve warnings and errors for missing modules
searchidx: do not create empty Xapian partitions for basic
lazy load Xapian and make it optional for v2
www: use Inbox->over where appropriate
nntp: use Inbox->over directly
inbox: add ->over method to ease access
|
|
It's only useful for a corner case in long-running daemons when
an admin decides to compact or vacuum a Xapian or SQLite DB.
As a result, other scripts should run slightly faster. For
instance, this saves about 80ms (2.710s => 2.630s) in t/mda.t
on my remote workstation.
While we're at it, make sure EvCleanup is properly require'd
in Daemon.pm and HTTP.pm and document our use of Devel::Peek.
|
|
No point in leaving a mess of empty directories when Xapian
doesn't load.
|
|
More tests work without Search::Xapian, now.
Usability issues still need to be fixed.
|
|
We don't need to rely on Xapian search functionality for the
majority of the WWW code, even. subject_normalized is moved to
SearchMsg, where it (probably) makes more sense, anyways.
|
|
We only need it for tests that chdir, and maybe for ENV{PATH}
portability (dash seems fine, not sure about others).
v2: revert change to solver_git.t for FreeBSD 11.2 and document
|
|
We can revisit this, later; but Data::Dumper requires a separate
package for CentOS-7 users, at least.
|
|
CentOS-7 needs the perl-Data-Dumper package, and the
test is small enough to roll our own escaping, here.
|
|
PublicInbox::DS works for every platform we care about,
nowadays; so checking for it is a waste of time. Clean up a
few POSIX and Socket imports while we're in the area.
|
|
We were reindexing the full history every invocation of -index
when Xapian was not used because we were incorrectly relying on
'last_commit' metadata stored in Xapian.
Rewrite the indexing logic to be less confusing while we're
at it, since we rely on `git merge-base --is-ancestor' nowadays.
Furthermore, we need to handle message removals from the
overview index correctly when Xapian is not in use.
Co-authored-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Avoiding reliance on environment variables is a bit cleaner
for writing tests.
|
|
|
|
* origin/danga-bundle:
DS: epoll: fix misordered EPOLL_CTL_DEL call
DS: drop unused "_undef" sub
syscall: drop readahead wrapper
build: do not manify DS and Syscall pods
DS: handle EINTR in IO::Poll path, too
DS: workaround IO::Kqueue EINTR (mis-)handling
DS: drop profiling support
DS: remove unused fields and functions
listener: use EPOLLEXCLUSIVE for listen sockets
bundle Danga::Socket and Sys::Syscall
|
|
This can help users track down the source of warnings
when presented with imperfect emails.
While we're at it, make the __WARN__ callback in t/v2writable.t
a no-op since we don't check for warnings, there.
|
|
FreeBSD does not allow non-root users to set S_ISGID;
so git skips this bit on FreeBSD and Debian/kFreeBSD
platforms.
|
|
These modules are unmaintained upstream at the moment, but I'll
be able to help with the intended maintainer once/if CPAN
ownership is transferred. OTOH, we've been waiting for that
transfer for several years, now...
Changes I intend to make:
* EPOLLEXCLUSIVE for Linux
* remove unused fields wasting memory
* kqueue bugfixes e.g. https://rt.cpan.org/Ticket/Display.html?id=116615
* accept4 support
And some lower priority experiments:
* switch to EV_ONESHOT / EPOLLONESHOT (incompatible changes)
* nginx-style buffering to tmpfile instead of string array
* sendfile off tmpfile buffers
* io_uring maybe?
|
|
This fixes a test failure on my Debian buster system.
Bug report filed for w3m to handle "'":
https://bugs.debian.org/927409
and for "highlight" to favor "'" in case other browsers fail:
https://bugs.debian.org/927410
|
|
A dangling parenthesis with trailing punctuation usually means the
parenthesis is not intended as part of the URL.
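One way to express this heuristic is sketched below in Python. It is an assumption-laden illustration, not the project's linkifier: the `URL_RE` pattern, the punctuation set, and the "more closing than opening parentheses" test are all choices made for the example.

```python
import re

# Deliberately loose URL pattern; hypothetical, for illustration only.
URL_RE = re.compile(r"https?://[^\s<>\"]+")
TRAILING_PUNCT = ".,;:!?"


def trim_url(url):
    """Strip trailing punctuation and any unbalanced closing parenthesis."""
    while url:
        last = url[-1]
        if last in TRAILING_PUNCT:
            url = url[:-1]
        elif last == ")" and url.count(")") > url.count("("):
            # A ")" with no matching "(" inside the URL is treated as dangling.
            url = url[:-1]
        else:
            break
    return url


def find_urls(text):
    """Return every URL in text after trimming."""
    return [trim_url(m.group(0)) for m in URL_RE.finditer(text)]
```

Balanced parentheses inside a URL, as in many wiki links, are left alone; only a closing parenthesis without a partner is dropped.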
|
|
The URLs at the top of WwwStream.pm weren't getting linkified
correctly.
|
|
This will be used for generating an HTML listing for v1 inboxes,
at least. The logic for this follows that of grokmirror,
and we may dynamically generate manifest.js.gz natively...
|
|
'$inbox' is more human-readable, so that is for the more
human-readable name in most cases. Making our variable naming
more consistent should make the code easier to review and
harder to screw up.
|
|
Our high-level config already treats single limits as a
soft==hard limit for limiters; so stop handling that redundantly
in the low-level spawn() sub.
|
|
We'll be spawning cgit and git-diff, which can take gigantic
amounts of CPU time and/or heap given the right (ermm... wrong)
input. Limit the damage that large/expensive diffs can cause.
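As a rough illustration of the idea, the Python sketch below applies resource limits to a child process between fork and exec. It is not the project's spawn() implementation; the `run_limited` function and its parameters are hypothetical, and it assumes a Unix-like system where the `resource` module is available. A single value is applied as both the soft and the hard limit.

```python
import resource
import subprocess


def run_limited(cmd, cpu_seconds, mem_bytes=None):
    """Run cmd with RLIMIT_CPU (and optionally RLIMIT_AS) set in the child."""

    def set_limits():
        # Executed in the child after fork() and before exec().
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        if mem_bytes is not None:
            resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    return subprocess.run(cmd, preexec_fn=set_limits,
                          capture_output=True, text=True)
```

A child that exceeds the CPU limit is killed by the kernel, so an expensive diff cannot run unbounded.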
|
|
This will be useful for extracting titles/subjects from
commit objects when displaying commits.
|
|
We were relying on Danga::Socket using the "bytes" pragma,
previously. Nowadays, the "bytes" pragma is not recommended in
general, but bytes::length remains acceptable for getting the
byte-size of a scalar.
|
|
WwwStream started depending on the WWW::style method
for configurable CSS, so mock ::style so the benchmark
runs properly.
Fixes: f026dbdd392c9dd5 ('www: admin-configurable CSS via "publicinbox.css"')
|
|
No point in making noise about something that isn't used.
|
|
This is compatible with Markdown; but we still keep the WYSIWYG
nature of plain-text with this. This is only intended for use
with our documentation. Enabling any type of Markdown support
for emails can lead to incompatibilities or interoperability
problems with alternative implementations.
|
|
It turns out there's no point in having multiple instances of
this or having to worry about destruction or destruction
ordering.
This will make it easier to reuse the one instance we have
across different modules.
|
|
We'll want to use it to support highlighting syntax used by
Markdown and possibly other markup languages (while retaining
the raw plain-text layout and formatting).
|
|
Favor in-place utf8::decode since it's a bit faster without
method dispatch overhead; and don't care about validity just
yet.
HlMod->do_hl itself should return "utf8" strings, since other
parts of our code can use it, so it's not the job of ViewVCS to
post-process HlMod output.
|
|
This is the fallback for the normal WWW endpoint.
Adding this to the top-level seems to be alright, since lynx and
w3m both understand nntp://<HOSTNAME>/<Message-ID> anyways.
If newsgroup and inbox names conflict, then consider it the
fault of the original sender.
Since NewsWWW is intended to support buggy linkifiers in mail clients,
they can interpret nntp:// URLs as http://<HOSTNAME>/<Message-ID>.
Inbox ordering from the config file is preserved since
commit cfa8ff7c256e20f3240aed5f98d155c019788e3b
("config: each_inbox iteration preserves config order"),
so admins can rely on that to configure how scanning
works.
Requested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
cf. https://public-inbox.org/meta/20190107190719.GE9442@pure.paranoia.local/
nntp://news.public-inbox.org/20190107190719.GE9442@pure.paranoia.local
|
|
Sometimes users will write "http://example.com" without the
trailing slash, which every browser and tool I've tested seems
to understand.
|
|
We'll use HTML attributes + anchor links to link to filenames
in coming commits.
|