about summary refs log tree commit homepage
path: root/t/v2reindex.t
DateCommit message (Collapse)
2023-12-13tests: attempt compatibility w/ busybox lsof
BusyBox lsof(1) ignores the `-p PID' argument and shows the open files for every process it knows about. BusyBox lsof also lacks the `NODE' column of the non-BusyBox implementation, so we'll rely on /proc/PID/fd/ in those cases since the deleted file checks are Linux-only and it's common to have procfs is mounted on /proc on Linux.
2023-09-11treewide: favor Xapian (SWIG binding) over Search::Xapian
The Xapian SWIG bindings are favored by Xapian upstream for ease-of-maintenance compared to the XS version. While Debian lags on this front, the SWIG bindings are widely available on all *BSDs.
2021-09-13tests: add require_cmd, require curl when needed
t/v2mirror.t and t/lei-mirror.t are now skipped when curl is missing (instead of failing in appropriate places). A bunch of which() checks are updated to use require_cmd to avoid explicitly loading Spawn.
2021-03-24v2writable: cleanup SQLite handles on --xapian-only
I'm not sure exactly why this is needed with run_script localizing %SIG and everything else, but explictly cleaning up seems to fix the occasional test failures I see. Followup-to: 4c6c853494b49368 ("tests: show lsof output on deleted-file-check failures")
2021-03-15t/*: disable fsync on tests were create_inbox isn't worth it
Using create_inbox doesn't seem worth the trouble, here, at the moment, but disabling fsync(2) gives a noticeable speedup on my system even with an SSD.
2021-03-12t/v2reindex: avoid reading ~/.public-inbox/config in test
2021-03-11v2writable: fix undocumented --xapian-only
We can't pass $self and GLOBs across IPC channels transparently. I only noticed this because I'm testing the application/octet-stream fallback with https://public-inbox.org/meta/20210311014539.19756-1-e@80x24.org/ Fixes: bf8df8160076d7a1 ("searchidxshard: use PublicInbox::IPC to kill lots of code")
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-09-03search: replace ->query with ->mset
Nearly all of the search uses in the production code rely on a Xapian mset iterator being returned (instead of an array of $smsg objects). So default to returning the mset and move the burden of smsg array conversion into the test cases.
2020-08-27over: rename ->connect method to ->dbh
`->connect' is confused with the perlfunc for the `connect(2)' syscall, and also `DBI->connect'. Since SQLite doesn't use sockets, the word "connect" needlessly confuses me. Give it a short name to match the field name we use for it, which also matches the variable name used by the DBI(3pm) and DBD::SQLite(3pm) manpages.
2020-07-25index: support --rethread switch to fix old indices
Older versions of public-inbox < 1.3.0 had subtly different semantics around threading in some corner cases. This switch (when combined with --reindex) allows us to fix them by regenerating associations.
2020-06-08index: v2: parallel by default
InboxWritable should only set $v2w->{parallel} if the $parallel flag is defined to 0 or 1. We want indexing a new inbox to utilize SMP, just like --reindex. -index once again allows -j0/--jobs=0 to force single-process use, and we'll be ensuring that works in tests to maintain performance on small systems. Fixes: 61a2fff5b34a3e32 ("admin: move index_inbox over")
2020-05-12rename "ContentId" to "ContentHash"
The old name may be confused with "Content-ID" as described in RFC 2392, so use an alternate name to avoid confusing future readers.
2020-05-09remove most internal Email::MIME usage
We no longer load or use Email::MIME outside of comparison tests.
2020-04-22t/*.t: reduce dependency on Email::MIME APIs
Instead, favor PublicInbox::MIME->new for non-attachment emails. We may support alternatives to Email::MIME down the line. We'll still keep Email::MIME->create to deal with attachments, for now, but there's also a fair amount of test duplication we should eliminate, later.
2020-04-22t/*.t: use Email::MIME->create over PublicInbox::MIME->create
PublicInbox::MIME only supports ->new, and is only different from Email::MIME for old versions of Email::MIME. In the future, PublicInbox::MIME may not be a subclass of Email::MIME at all.
2020-04-19favor `do {}' over `eval {}' for localized slurp
I did not know to use the return value of `do' back in the day. There's probably no practical difference in these cases, but `eval' is overkill for these uses and may hide actual errors. We can get rid of a few redundant `scalar' ops and pass scalar refs to Email::MIME->new to avoid copies in a few more places, too.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-01-28t/v2reindex.t: 5.10.1 glob compatibility
I'm not sure when `for (<"quoted string/glob/*">)' became supported, and maybe it was inadvertant, but it fails with Perl 5.10.1. Just use the glob() function to be explicit.
2019-12-24testcommon: add require_mods method and use it
This cuts down on lines of code in individual test cases and fixes some misnamed error messages by using "$0" consistently. This will also provide us with a method of swapping out dependencies which provide equivalent functionality (e.g "Xapian" SWIG can replace "Search::Xapian" XS bindings).
2019-12-19tests: move t/common.perl to PublicInbox::TestCommon
We want to be able to use run_script with *.t files, so t/common.perl putting subs into the top-level "main" namespace won't work. Instead, make it a module which uses Exporter like other libraries.
2019-11-24tests: use File::Temp->newdir instead of tempdir()
We'll also introduce a tmpdir() API to give tempdirs consistent names.
2019-10-28search: support multiple From/To/Cc/Subject headers
We can easily support searching on messages with multiple From/To/Cc/Subject headers just like we do with multiple Message-ID headers. This matches the normal mutt pager display behavior.
2019-10-21v2writable: reindex handles 3-headered monsters
And maybe 8-headered ones, too... I noticed --reindex failing on the linux-renesas-soc mirror due one 3-headed monster of a message having 3 sets of headers; while another normal message had a Message-ID that matched one of the 3 IDs of the 3-headed monster. We still try to do the majority of indexing backwards, but we defer indexing multi-Message-ID'd messages until the end to ensure we get all the "good" messages in before we process the multi-headered ones. Link: https://public-inbox.org/meta/20191016211415.GA6084@dcvr/
2019-10-16config: support "inboxdir" in addition to "mainrepo"
"mainrepo" ws a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-09-09run update-copyrights from gnulib for 2019
2019-05-15lazy load Xapian and make it optional for v2
More tests work without Search::Xapian, now. Usability issues still need to be fixed
2019-05-14v2writable: allow setting nproc via creat options
Avoiding reliance on environment variables is a bit cleaner for writing tests
2019-01-10check git version requirements
This allows v1 tests to continue working on git 1.8.0 for now. This allows git 2.1.4 packaged with Debian 8 ("jessie") to run old tests, at least. I suppose it's safe to drop Debian 7 ("wheezy") due to our dependency on git 1.8.0 for "merge-base --is-ancestor". Writing V2 repositories requires git 2.6 for "get-mark" support, so mask out tests for older gits.
2019-01-02t/v2reindex: use the larger text to increase test reliability
libxapian30:amd64 1.4.9-1 on Debian sid seems to give an 8KB position.glass database with "hello world" as the document regardless of our indexlevel. Use the text of the AGPL-3.0 for a more realisitic Xapian database size. And perhaps tying our tests to the AGPL will make life more difficult for would-be copyright violators :>
2018-08-03t/v[12]reindex.t: Verify the num highwater is as expected
Instrument the tests to verify the highwater num highwater mark is where it is expected. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-03t/v[12]reindex.t Verify num_highwater
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-02t/v[12]reindex.t: Test incremental indexing works
Capture interesting commits of the test repository in mark variables. Use those marks to build interesting scenarios where index_sync proceeds as if those marks are the heads of the repositor. Use this capability to test what happens when adds and deletes are mixed within a repository. Be sad because things don't yet work as they should. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-02t/v[12]reindex.t: Test that the resulting msgmap is as expected
Deeply inspect the entire message map in the reindexing tests as the actual message order is significant and can result in surprises. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-02t/v[12]reindex.t: Place expected second in Xapian tests
Place the expected value second in is and isnt tests because when these tests fail they report the second value as the expected value. A report saying got 0 expected 8 'no Xapian search results' can be confusing. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-02t/v2reindex.t: Isolate the test cases more
While inspecting the tests I realized that because we have been reusing variables there can be a memory between one test case and another. Add scopes and local variables to prevent an unintended memory between one test cases. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-19tests: fixup indexlevel setting in tests
The correct field is underscore-less for consistency with git-config naming conventions. While we're at it, beef up the v2 tests with actual size checks, too. I also noticed phrase searching still seems to work for the limited test case, so I left it documented; but the size checking verifies the space savings.
2018-07-19SearchIdx: Allow the amount of indexing be configured
This adds a new inbox configuration option 'indexlevel' that can take the values 'full', 'medium', and 'basic'. When set to 'full' everything is indexed including the positions of all terms. When set to 'medium' everything except the positions of terms is indexed. When set to 'basic' terms and positions are not indexed. Just the Overview database for NNTP is created. Which is still quite good and allows searching for messages by Message-ID. But there are no indexes to support searching inside the email messages themselves. Update the reindex tests to exercise the full medium and basic code paths Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-18t/v2reindex.t: Swap the order of minmax tests so errors make sense
Previously if a minmax test failed it would say it was expecting the incorrect value, which is confusing when looking into why the test fails. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-18t/v2reindex.t: Don't reuse $ibx as two different kinds of variable
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-18t/v2reindex.t: Ensure the numbers 1 to 10 are used
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-04v2: support incremental indexing + purge
This is important for people running mirrors via "git fetch", as they need to be kept up-to-date. Purging is also now supported in mirrors. The short-lived "--regenerate" option is gone and is now implicitly enabled as a result. It's still cheap when article number regeneration is unnecessary, as we track the range for each git repository.
2018-04-01v2writable: fix parallel termination
I was too aggressively disabling parallelization to speed up the test suite and broke this :x Re-enable parallelization for the v2reindex test so we can catch it later.
2018-03-22v2writable: add NNTP article number regeneration support
Allow best-effort regeneration of NNTP article numbers from cloned git repositories in addition to indexing Xapian Article numbers will not remain consistent when we add purge support, though.