path: root/t
Date Commit message
2019-06-20 t/httpd-corner: ensure chunk payload read doesn't overreach
It never has, AFAIK, but I'm making some changes to this code in another branch and nearly introduced a bug where it would be overreading and discarding the pipelined request.
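A minimal sketch of the property this test guards, written in Python rather than the project's Perl: a chunked-body reader should consume exactly the bytes belonging to its own message and leave any pipelined request in the stream. The function name `read_chunked_body` and the sample request are hypothetical and are not taken from the test itself.

```python
import io


def read_chunked_body(stream):
    """Decode one chunked body, consuming only the bytes that belong to it."""
    body = bytearray()
    while True:
        size_line = stream.readline()                    # e.g. b"5\r\n"
        size_field = size_line.split(b";", 1)[0].strip() # drop chunk extensions
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            # Skip optional trailers up to the blank line that ends the message.
            while stream.readline() not in (b"\r\n", b""):
                pass
            return bytes(body)
        body += stream.read(size)   # read exactly `size` bytes, never more
        stream.read(2)              # consume the CRLF that follows the payload


# One chunked body followed immediately by a pipelined request.
wire = (b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
        b"GET /next HTTP/1.1\r\nHost: example\r\n\r\n")
stream = io.BytesIO(wire)
first_body = read_chunked_body(stream)
pipelined = stream.read()           # must still hold the whole second request
```

If the reader pulled more than `size` bytes per chunk, `pipelined` would come back truncated or empty, which is the overread described above.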
2019-06-20 t/httpd-corner.t: fix braino :x
Plack is for Perl, Rack is for Ruby; this is a Perl project :x
2019-06-16 t/replace.t: fix SKIP label for testing w/o Xapian
2019-06-16 Merge remote-tracking branch 'origin/newspeak' into xcpdb
* origin/newspeak:
  comments: replace "partition" with "shard"
  t/xcpdb-reshard: use 'shard' term in local variables
  xapcmd: favor 'shard' over 'part' in local variables
  search: use "shard" for local variable
  v2writable: use "epoch" consistently when referring to git repos
  adminedit: "part" => "shard" for local variables
  v2writable: rename local vars to match Xapian terminology
  v2writable: avoid "part" in internal subs and fields
  search*: rename {partition} => {shard}
  xapcmd: update comments referencing "partitions"
  v2: rename SearchIdxPart => SearchIdxShard
  inboxwritable: s/partitions/shards/ in local var
  tests: change messages to use "shard" instead of partition
  v2writable: rename {partitions} field to {shards}
  v2writable: count_partitions => count_shards
  searchidxpart: start using "shard" in user-visible places
  rename reference to git epochs as "partitions"
  admin|xapcmd: user-facing messages say "shard"
  v2writable: update comments regarding xcpdb --reshard
  doc: rename our Xapian "partitions" to "shards"
2019-06-16 t/psgi_search.t: use higher-level APIs
No point in using lower-level APIs for a PSGI test.
2019-06-15 searchview: add link at bottom to reverse results
I could not find a place to put the link at the top without making navigation too cluttered. Putting it at the bottom of the page seems reasonable...
2019-06-15 t/git-http-backend: explain purpose of test
I found myself tempted to switch to HTTP::Tiny, here, since it's distributed with Perl since 5.14, unlike Net::HTTP (which AFAIK was never a part of Perl proper). But we really want to use Net::HTTP, here, since it's lower-level and allows us to trigger server-side buffering by not reading the entity body.
2019-06-14 t/xcpdb-reshard: use 'shard' term in local variables
Another step in maintaining consistency with Xapian docs.
2019-06-14 tests: change messages to use "shard" instead of partition
Another potentially user-facing piece made consistent with Xapian terminology.
2019-06-14 v2writable: rename {partitions} field to {shards}
Our internal data structure should be consistent with Xapian terminology.
2019-06-14 rename reference to git epochs as "partitions"
Try to remain consistent with our own documentation regarding v2 git "epochs", first.
2019-06-14 search: require PublicInbox::Inbox ref here
No sense in supporting multiple methods of initialization for an internal class.
2019-06-14 searchidx: require PublicInbox::Inbox (or InboxWritable) ref
PublicInbox::Inbox objects have minimal dependencies, so drop code to support old tests which existed before the PublicInbox::Inbox object came into existence.
2019-06-14 t/www_listing: favor HTTP::Tiny over Net::HTTP
More testers are likely to have HTTP::Tiny than Net::HTTP, since HTTP::Tiny is a dual-life module and distributed with Perl since Perl 5.14 (2011-05-14), whereas Net::HTTP will likely live in a separate package forever.
2019-06-14 Merge remote-tracking branch 'origin/reshard' into next
* origin/reshard:
  xcpdb: support resharding v2 repos
  xcpdb: use destination shard as progress prefix
  xapcmd: preserve indexlevel based on the destination
  v2writable: use a smaller default for Xapian partitions
2019-06-14 Merge remote-tracking branch 'origin/manifest' into next
* origin/manifest:
  git: ensure ->modified returns an integer
  www: support $INBOX/git/$EPOCH.git for v2 cloning
  www: wire up /$INBOX/manifest.js.gz, too
  wwwlisting: generate grokmirror-compatible manifest.js.gz
  wwwlisting: allow hiding entries from manifest
2019-06-14 Merge remote-tracking branch 'origin/edit' into next
* origin/edit:
  edit: unlink temporary file when done
  v2writable: replace: kill git processes before reindexing
  edit: drop unwanted headers before noop check
  edit|purge: improve output on rewrites
  edit: new tool to perform edits
  doc: document the --prune option for -index
  admin: expose ->config
  AdminEdit: move editability checks from -purge
  admin: beef up resolve_inboxes to handle purge options
  purge: start moving common options to AdminEdit module
  admin: remove warning arg for unconfigured inboxes
  v2writable: implement ->replace call
  import: switch to "replace_oids" interface for purge
  import: extract_author_info becomes extract_commit_info
  v2writable: consolidate overview and indexing call
2019-06-14 xcpdb: support resharding v2 repos
v2 repos are sometimes created on machines where CPU parallelization exceeds the capability of the storage devices. In that case, users may reshard the Xapian DB to any smaller, positive integer to avoid excessive overhead and contention when bottlenecked by slow storage. Resharding can also be used to increase shard count after hardware upgrades.
2019-06-14 git: remove cat_file sub callback interface
We weren't using it, and in retrospect, it makes no sense to use the cat_file API for giant responses which can't be read quickly with minimal context-switching (or sanely fit into memory for Email::Simple/Email::MIME). For giant blobs which we don't want slurped into memory, we'll spawn a short-lived git-cat-file process like we do in ViewVCS. Otherwise, monopolizing a git-cat-file process for a giant blob is harmful to other PSGI/NNTP users. A better interface is coming which will be more suitable for batch processing of "small" objects such as commits and email blobs.
2019-06-14 nntp: filter out duplicate Message-IDs for leafnode
It's the unfortunate reality that there are some clients which reuse Message-IDs (in which case we generate and use another) or set multiple Message-IDs on their own. While the v2 format addresses that, NNTP clients such as leafnode are not always prepared to deal with that case. So, ensure NNTP clients only see a single Message-ID, and show the others as 'X-Alt-Message-ID'.
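A rough Python sketch of the idea, not the project's Perl code: keep one Message-ID header and rename any further ones to X-Alt-Message-ID. Keeping the first occurrence as the primary ID is an assumption made here for illustration, and `dedupe_message_ids` is a hypothetical name.

```python
def dedupe_message_ids(headers):
    """Return (name, value) pairs with at most one Message-ID header.

    Assumption: the first Message-ID seen is treated as the primary one;
    every later one is re-labelled X-Alt-Message-ID.
    """
    out = []
    seen_primary = False
    for name, value in headers:
        if name.lower() == "message-id":
            if seen_primary:
                name = "X-Alt-Message-ID"
            seen_primary = True
        out.append((name, value))
    return out


deduped = dedupe_message_ids([
    ("From", "a@example.com"),
    ("Message-ID", "<first@example.com>"),
    ("Message-ID", "<second@example.com>"),
    ("Subject", "hello"),
])
```

Header order and all other headers are left untouched, so a client that only understands a single Message-ID still sees a well-formed message.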
2019-06-13 nntp: ensure Message-ID is not folded for leafnode
Leafnode cannot handle Message-ID headers which are too long and require folding via Email::Simple::Header. Since there are already many of these messages in git with the header already folded, we need to handle the unfolding when emitting the message via NNTP. As far as we know, Leafnode is the only client software incapable of handling this case.
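For illustration, header unfolding as described in RFC 5322 amounts to removing each line break that is immediately followed by whitespace. The Python sketch below shows that rule on a made-up Message-ID; the helper name `unfold_header` is hypothetical and this is not the project's Perl implementation.

```python
import re


def unfold_header(raw):
    """Remove every CRLF (or bare LF) that is directly followed by a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", raw)


folded = "Message-ID:\r\n <a-very-long-identifier@mail.example.com>"
unfolded = unfold_header(folded)
```

The whitespace after the line break is kept, so the unfolded header stays a valid `name: value` line.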
2019-06-13 t/common.perl: fix error message for git requirements
And enable strict + warnings in the scope of t/common.perl, too.
2019-06-10 edit: drop unwanted headers before noop check
mutt will set Content-Length, Lines, and Status headers unconditionally, so we need to account for that before doing header comparisons to avoid making expensive changes when noop edits are made.
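A hedged Python sketch of the check (the real tool is Perl): strip the headers mutt adds from the edited copy, then compare it with the original. The function name `is_noop_edit` is hypothetical, and the sketch assumes the original message carries none of these headers itself.

```python
from email import message_from_string

# Headers the commit message says mutt sets unconditionally.
UNWANTED = ("Content-Length", "Lines", "Status")


def is_noop_edit(original_text, edited_text):
    """True if the edit changed nothing besides the editor-added headers."""
    original = message_from_string(original_text)
    edited = message_from_string(edited_text)
    for name in UNWANTED:
        del edited[name]        # deleting an absent header is silently ignored
    return (original.items() == edited.items()
            and original.get_payload() == edited.get_payload())


before = "From: a@example.com\nSubject: hi\n\nbody\n"
after_mutt = ("From: a@example.com\nSubject: hi\n"
              "Content-Length: 5\nLines: 1\nStatus: RO\n\nbody\n")
after_real_edit = "From: a@example.com\nSubject: hello\n\nbody\n"
```

Here `is_noop_edit(before, after_mutt)` is true while `is_noop_edit(before, after_real_edit)` is false, so only the second case would trigger the expensive rewrite.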
2019-06-10 edit|purge: improve output on rewrites
Fill in undef as "(unchanged)" when displaying commits and prefix the epoch name.
2019-06-10 git: ensure ->modified returns an integer
We don't want to serialize timestamps as strings to JSON. I only noticed this bug on a 32-bit system.
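The difference is easy to see in any JSON encoder. In this Python sketch (the project itself is Perl), a timestamp held as text is emitted as a quoted string unless it is coerced to an integer first; the helper `manifest_entry` and the sample timestamp are made up for illustration.

```python
import json


def manifest_entry(modified):
    """Coerce the timestamp to int so the encoder emits a JSON number."""
    return {"modified": int(modified)}


# A timestamp that arrived as text, e.g. parsed from command output.
as_string = json.dumps({"modified": "1560124800"})
as_integer = json.dumps(manifest_entry("1560124800"))
```

`as_string` contains `"1560124800"` in quotes, while `as_integer` contains the bare number that consumers of the JSON expect.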
2019-06-09 edit: new tool to perform edits
This wrapper around V2Writable->replace provides a user-interface for editing messages as single-message mboxes (or the raw text via $EDITOR).
2019-06-09 v2writable: implement ->replace call
Much of the existing purge code is repurposed to a general "replace" functionality. ->purge is simpler because it can just drop the information. Unlike ->purge, ->replace needs to edit existing git commits (in case of From: and Subject: headers) and reindex the modified message. We currently disallow editing of References:, In-Reply-To: and Message-ID headers because it can cause bad side effects with our threading (and our lack of rethreading support to deal with excessive matching from incorrect/invalid References).
2019-06-09 www: support $INBOX/git/$EPOCH.git for v2 cloning
And use it in manifest.js. To ease maintaining mirrors with grokmirror(1), we can accept a "git/" directory prefix before the epoch, and ".git" suffix after the epoch number. We maintain compatibility with "$INBOX/$EPOCH" cloning, of course, and it's still easier-to-type on the command-line.
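As a sketch of the two accepted URL shapes, the Python snippet below parses both "$INBOX/$EPOCH" and "$INBOX/git/$EPOCH.git" into the same (inbox, epoch, rest) tuple. The regular expression, the set of characters allowed in an inbox name, and the function name `parse_epoch_path` are assumptions for illustration and are not taken from the project's router.

```python
import re

# Assumption: inbox names consist of word characters, dots and dashes.
_EPOCH_RE = re.compile(r"^/([\w.\-]+)/(?:git/(\d+)\.git|(\d+))(?:/(.*))?$")


def parse_epoch_path(path):
    """Return (inbox, epoch, rest) for either URL shape, or None if neither matches."""
    m = _EPOCH_RE.match(path)
    if m is None:
        return None
    inbox, prefixed, bare, rest = m.groups()
    epoch = int(prefixed if prefixed is not None else bare)
    return inbox, epoch, rest or ""


new_style = parse_epoch_path("/meta/git/0.git/info/refs")
old_style = parse_epoch_path("/meta/0/info/refs")
not_epoch = parse_epoch_path("/meta/new.atom")
```

Both styles resolve to the same epoch, which is what lets the short form stay usable on the command line while the longer form suits grokmirror.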
2019-06-09 www: wire up /$INBOX/manifest.js.gz, too
I can imagine myself just wanting to clone a single v2 inbox and all its epochs without thinking about include/exclude rules in a grokmirror config file.
2019-06-09 wwwlisting: generate grokmirror-compatible manifest.js.gz
Support on-demand generation of "/manifest.js.gz" for inboxes. By default, this matches inboxes with URLs matching the given request hostname. This makes it easier to create full mirrors of several inboxes without needing to configure static file serving. cf. https://git.kernel.org/pub/scm/utils/grokmirror/grokmirror.git
2019-06-04 t: avoid "subtest" for Perl 5.10.1 compatibility
The version of Test::More from Perl 5.10.1 did not support "subtest", and the earliest version which did is Perl 5.12.0. The good news is this gives me an excuse to parallelize the indexlevels-mirror test by splitting it into two (it could be split further, even). Update t/nntpd. to use PI_TEST_VERSION consistently while we're at it.
2019-06-04 linkify: support Internationalized Domain Names in URLs
The "\w" character class in Perl matches any word characters in the Unicode database, not just ASCII characters. So we must be prepared for that and generate links to IDNs.
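Python's `re` module behaves the same way for text patterns, so it can illustrate the point: by default `\w` matches non-ASCII word characters, and a URL pattern built on it picks up an IDN host whole. The pattern `URL_RE` below is a deliberately simplified stand-in, not the project's linkifier regex.

```python
import re

# Simplified URL pattern for illustration only.
URL_RE = re.compile(r"https?://[\w.-]+(?:/\S*)?")
URL_RE_ASCII = re.compile(r"https?://[\w.-]+(?:/\S*)?", re.ASCII)

text = "see https://bücher.example/katalog for details"

unicode_match = URL_RE.search(text).group(0)       # \w covers "ü"
ascii_match = URL_RE_ASCII.search(text).group(0)   # \w stops before "ü"
```

With Unicode-aware `\w` the full IDN URL is captured; restricting `\w` to ASCII cuts the match off inside the hostname, which is why link generation has to expect non-ASCII hosts.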
2019-06-03 t/psgi_search.t: require DBD::SQLite
In case we encounter an odd system which has Search::Xapian but not DBD::SQLite.
2019-06-01 ds: fix and test for FD leaks with kqueue on ->Reset
Even though we currently don't use it repeatedly, ->Reset should close() kqueue FDs and not cause the process to run out of descriptors. Add a close-on-exec test while we're at it.
2019-06-01 git: unconditional expiry
A constant stream of traffic to either httpd/nntpd would mean git-cat-file processes never expire. Things can go bad after a full repack, as a full repack will unlink old pack indices and git-cat-file does not currently detect unlinked files. We could do something complicated by recursively stat-ing objects/pack of every git directory and alternate; but that's probably not worth the trouble compared to occasionally restarting the cat-file process. So simplify the code and let httpd/nntpd expire them periodically, since spawning a "git-cat-file --batch" process isn't too expensive. We already spawn for every request which hits git-http-backend, cgit, and git-apply. In the future, we may optionally support the Git::Raw module to avoid IPC; but we must remain careful to not leave lingering FDs open to unlinked files after repack.
2019-05-29 searchidx: store indexlevel=medium as metadata
And use it from Admin. It's easy to tell what indexlevel=basic is from unconfigured inboxes, but distinguishing between 'medium' and 'full' would require stat()-ing position.* files which is fragile and Xapian-implementation-dependent. So use the metadata facility of Xapian and store it in the main partition so Admin tools can deal better with unconfigured inboxes copied using generic tools like cp(1) or rsync(1).
2019-05-27 v2: fix reindex skipping NNTP article numbers
`public-inbox-index --reindex' could cause NNTP article number gaps to form when it also has to deal with new, never-before-seen commits in mirrors running off `git fetch'. Fix this by running two distinct invocations of ->index_sync; once to only reindex old commits, and a second time to index new commits. This does not appear to be a problem on v1 at the moment, but I'll need more time to analyze this.
2019-05-27 t/v1reindex.t: fix typo in setting `indexlevel'
It did not cause a test failure because the default fallback is `indexlevel=full'
2019-05-25 t/indexlevels: fix indexlevel of ro_mirror
Don't hard-code "basic", since we already ran -init with the intended indexlevel.
2019-05-23 xcpdb: implement progress reporting
Copying an entire Xapian DB is horribly slow whether it's done via Perl or copydatabase(1). So displaying some progress indication is good for user experience. While we're at it, prefix xapian-compact output, too; since parallel processes end up clobbering each other.
2019-05-23 xcpdb: new tool which wraps Xapian's copydatabase(1)
copydatabase(1) is an existing Xapian tool which is the recommended way to upgrade existing DBs to the latest Xapian database format (currently "glass" for stable/released versions). Our use of Xapian relies on preserving document IDs, so we'll wrap it like we do xapian-compact(1) and use the "--no-renumber" switch. I could not name the tool "public-inbox-copydatabase" since it would be ambiguous as to which DB it's actually copying. So, I abbreviated the suffix to "xcpdb" (Xapian CoPy DataBase), which I hope is acceptable and unambiguous.
2019-05-23 search: reenable phrase search on non-chert Xapian
This is assuming nobody uses flint or earlier, anymore; as flint predates the existence of this project.
2019-05-23 v1writable: retire in favor of InboxWritable
In retrospect, introducing V1Writable was unnecessary and InboxWritable->importer is in a better position to abstract away differences between v1 and v2 writers. So teach InboxWritable to initialize inboxes and get rid of V1Writable.
2019-05-23 t/convert-compact: skip on missing xapian-compact(1)
Can't run the test if the required Xapian tools are missing.
2019-05-22 t/search*: require DBI and DBD::SQLite, too
None of the Search::Xapian-dependent stuff works without DBI and DBD::SQLite. There are no plans to support Xapian w/o DBD::SQLite since SQLite is more common and less resource-intensive than Xapian.
2019-05-22 t/watch_filter_rubylang: disable v2 test for git < 2.6
This test was not disabled properly for ancient versions of git without get-mark support.
2019-05-21 Merge remote-tracking branch 'origin/xap-optional' into master
* origin/xap-optional:
  admin: improve warnings and errors for missing modules
  searchidx: do not create empty Xapian partitions for basic
  lazy load Xapian and make it optional for v2
  www: use Inbox->over where appropriate
  nntp: use Inbox->over directly
  inbox: add ->over method to ease access
2019-05-15 remove hard Devel::Peek dependency and lazy load for daemons
It's only useful for a corner case in long-running daemons when an admin decides to compact or vacuum a Xapian or SQLite DB. As a result, other scripts should run slightly faster. For instance, this saves about 80ms (2.710s => 2.630s) in t/mda.t on my remote workstation. While we're at it, make sure EvCleanup is properly require'd in Daemon.pm and HTTP.pm and document our use of Devel::Peek.
2019-05-15 searchidx: do not create empty Xapian partitions for basic
No point in leaving a mess of empty directories when Xapian doesn't load.
2019-05-15 lazy load Xapian and make it optional for v2
More tests work without Search::Xapian, now. Usability issues still need to be fixed.