about summary refs log tree commit homepage
path: root/t/over.t
DateCommit message (Collapse)
2021-10-12extindex: avoid invalid blobs after unref
When unref-ing a blob from xref3, make sure the "preferred" smsg->{blob} doesn't point to the blob we just unrefed. This is necessary because we periodically checkpoint our extindex process to allow -watch and -mda processes to run. This also gets rid of a lot of redundant code for ->remove_xref3, since it's all handled in ExtSearchIdx, now.
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-11-29extindex: fix delete (`d') handling
We need to completely remove a message from over.sqlite3 and Xapian when no references remain, otherwise users will still see the removed messages in NNTP overviews and WWW search results/summaries. References to messages are now solely handled by the `xref3' table of over.sqlite3. We can also trust `xref3' when deciding whether to remove only the "O$eidx_key" and "G$lid" terms from a document in Xapian or to remove the entire Xapian document.
2020-11-07over: store xref3 data in over.sqlite3
We may not end up storing xref3 data in Xapian, actually. This will make indexlevel=basic possible, and along with --sequential-shard indexing support for slow storage. Making oidmap a separate table seems unnecessary, too, so fold it into the xref3 table since it's unlikely a git blob will be responsible for multiple xref3 rows.
2020-11-07overidx: introduce changes for external index
Since external indices won't have msgmap.sqlite3, we'll need to store last_commit-* metadata in over.sqlite3 instead. This has a longer limits to account for path names or newsgroup names stored in keys. We'll also rely on built-in counters for Xapian document IDs, since msgmap.sqlite3 no longer provides an AUTOINCREMENT column.
2020-08-27overidx: inline create_ghost sub
There's no need for this to be a separate sub since there's only a single caller. This saves a few kilobytes at least in short-lived processes.
2020-08-27over: rename ->disconnect to ->dbh_close
Since we got rid of over->connect, `disconnect' no longer pairs with it. So name it after the `close(2)' syscall it ultimately issues.
2020-08-27over: rename ->connect method to ->dbh
`->connect' is confused with the perlfunc for the `connect(2)' syscall, and also `DBI->connect'. Since SQLite doesn't use sockets, the word "connect" needlessly confuses me. Give it a short name to match the field name we use for it, which also matches the variable name used by the DBI(3pm) and DBD::SQLite(3pm) manpages.
2020-08-26over+msgmap: respect WAL journal_mode if set
WAL actually seems to have ideal locking characteristics given concurrency problems I'm experiencing with --reindex running in parallel with expensive read-only SQLite queries: <https://public-inbox.org/meta/20200825001204.GA840@dcvr/> Unfortunately, we cannot blindly use WAL while preserving compatibility with existing setups nor our guarantees that read-only daemons are indeed "read-only". However, respect an user's the choice to set WAL on their own if they're comfortable with giving -nntpd/-httpd/-imapd processes write permission to the directory storing SQLite DBs.
2020-08-23searchidx: index THREADID in Xapian
This is the `tid' column from over.sqlite3; and will be used for IMAP and JMAP search (among other things).
2020-06-13imap: STATUS/EXAMINE: rely on SQLite overview
We can get exact values for EXISTS, UIDNEXT using SQLite rather than calculating off $ibx->mm->max ourselves. Furthermore, $ibx->mm is less useful than $ibx->over for IMAP (and for our read-only daemons in general) so do not depend on $ibx->mm outside of startup/reload to save FDs and reduce kernel page cache footprint.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-02-04over: simplify read-only vs read-write checking
No need to call ref() and do a string comparison. Add some extra tests using the {ReadOnly} attribute in DBI.pm.
2019-12-24testcommon: add require_mods method and use it
This cuts down on lines of code in individual test cases and fixes some misnamed error messages by using "$0" consistently. This will also provide us with a method of swapping out dependencies which provide equivalent functionality (e.g "Xapian" SWIG can replace "Search::Xapian" XS bindings).
2019-12-19tests: move t/common.perl to PublicInbox::TestCommon
We want to be able to use run_script with *.t files, so t/common.perl putting subs into the top-level "main" namespace won't work. Instead, make it a module which uses Exporter like other libraries.
2019-11-24tests: use File::Temp->newdir instead of tempdir()
We'll also introduce a tmpdir() API to give tempdirs consistent names.
2019-09-09run update-copyrights from gnulib for 2019
2019-05-15lazy load Xapian and make it optional for v2
More tests work without Search::Xapian, now. Usability issues still need to be fixed
2018-04-19fix tests to run without Xapian installed
We'll be ensuring we can run more of the HTTP and all of the NNTP interface with only SQLite (and not Xapian) installed in the future.
2018-04-06search: index and allow searching by date-time
Dscho found this useful for finding matching git commits based on AuthorDate in git. Add it to the overview DB format, too; so in the future we can support v2 repos without Xapian. https://public-inbox.org/git/nycvar.QRO.7.76.6.1804041821420.55@ZVAVAG-6OXH6DA.rhebcr.pbec.zvpebfbsg.pbz https://public-inbox.org/git/alpine.DEB.2.20.1702041206130.3496@virtualbox/
2018-04-02www: rework query responses to avoid COUNT in SQLite
In many cases, we do not care about the total number of messages. It's a rather expensive operation in SQLite (Xapian only provides an estimate). For LKML, this brings top-level /$INBOX/ loading time from ~375ms to around 60ms on my system. Days ago, this operation was taking 800-900ms(!) for me before introducing the SQLite overview DB.
2018-04-02t/over: test empty Subject: line matching
We need to ensure we don't match NULL 'sid' columns in the `over' table.
2018-04-02replace Xapian skeleton with SQLite overview DB
This ought to provide better performance and scalability which is less dependent on inbox size. Xapian does not seem optimized for some queries used by the WWW homepage, Atom feeds, XOVER and NEWNEWS NNTP commands. This can actually make Xapian optional for NNTP usage, and allow more functionality to work without Xapian installed. Indexing performance was extremely bad at first, but DBI::Profile helped me optimize away problematic queries.