Date Commit message
2019-05-25 contrib/css: mark as CC0 (public domain)
No reason to copyright colour schemes :P
2019-05-25 v2writable: drop unused $last_commits var
Apparently it's never been used and we write to msgmap directly.
2019-05-25 t/indexlevels: fix indexlevel of ro_mirror
Don't hard-code "basic", since we already ran -init with the intended indexlevel.
2019-05-25 msgmap: remove double negative
I have never not found double negatives to be confusing...
2019-05-24 TODO: more stuff: bundles, synonyms, dogfooding
git bundles could/should make self-hosting easier. Being able to configure synonym (and spelling) lists would make some searches more useful. Might as well dogfood kernel stuff, too, given the overlap and history between this project, git and the Linux kernel. Would be interesting to have *BSD folks throw their hat in the ring, too. Building/testing userspace stuff is often the most time-consuming, but necessary to ensure future compatibility.
2019-05-24 MANIFEST: add extman.perl
Oops :x
2019-05-24 doc: add URLs for Xapian manpages
Since we go through the effort of hosting these manpages, link to them.
2019-05-24 doc: xcpdb: add switch documentation
In particular, the '--compact' switch is really useful since it works without holding the inbox-wide lock for minutes at a time on giant inboxes (inboxes where copies can take dozens, if not hundreds, of minutes).
2019-05-24 doc: generate manpages for some Xapian commands
They're nowhere to be found on Xapian.org, and links to external services are either too long (for manpages.debian.org) or have privacy-invasive tracking JS on them.
2019-05-24 doc: sync .txt mtime to .pod mtime
Otherwise timestamps for .html files get screwed up, too; and that hurts caching.
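For illustration only, the idea of copying one file's modification time onto another can be sketched in Python (the project itself is Perl; the helper name `sync_mtime` is hypothetical and not part of public-inbox):

```python
import os


def sync_mtime(src: str, dst: str) -> None:
    """Give dst the same access and modification times as src."""
    st = os.stat(src)
    # Nanosecond precision avoids rounding differences between the two files.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```

A generated .txt file stamped this way looks no newer than its .pod source, so anything derived from it (such as .html output) can keep stable timestamps for caches.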
2019-05-24 doc: don't barf on missing `git set-file-times'
It's not critical, but it's nice to have for cache-friendliness (otherwise I would not have written it :P) I guess I should follow up on getting it into 'git contrib/': https://public-inbox.org/git/20100702033709.GA6818@burratino/
2019-05-24 doc: daemon: fix manpage section for nginx
The nginx manpage is in section 8.
2019-05-24 doc: index: fix miscapitalization of "SQLite"
Oops :x
2019-05-24 search: don't log all warnings on retry_reopen
Some users (or bots :P) can trigger horrible queries which the caller can choose to either log or ignore. This prevents horrible queries from ExtMsg from logging confusing "ref: " messages when $@ is not a Perl reference.
2019-05-23 doc: various updates to reflect current state
-index documentation avoids redundant v1 information and refers readers to appropriate v1/v2 manpages. Search::Xapian can also be optional, now, as only the PSGI search interface uses it. Favor "INBOX_DIR" where appropriate, since "REPO_DIR" can be confused for code repos which we also support. XAPIAN_FLUSH_THRESHOLD is documented for all relevant bulk commands.
2019-05-23 xapcmd: do not reset %SIG until last Xtmpdir is done
To properly handle compact tmpdir cleanup in single process situations, we need to carefully account for Xtmpdir not being a singleton and ensure we don't clobber signal handlers which belong to other Xtmpdirs.
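The bookkeeping described above can be sketched in Python as a small registry that installs handlers when the first temporary directory appears and restores them only after the last one is finished. This is a minimal sketch, not the Perl code from the commit; `HandlerScope` and its `install`/`restore` callbacks are hypothetical names:

```python
class HandlerScope:
    """Install handlers for the first tmpdir, restore after the last one."""

    def __init__(self, install, restore):
        self._install = install    # called once, when the first tmpdir registers
        self._restore = restore    # called once, when the last tmpdir is done
        self._live = set()

    def register(self, tmpdir):
        if not self._live:
            self._install()
        self._live.add(tmpdir)

    def done(self, tmpdir):
        if tmpdir not in self._live:
            return                 # unknown or already finished: nothing to do
        self._live.discard(tmpdir)
        if not self._live:
            self._restore()
```

Finishing one of several live directories leaves the handlers in place, so the remaining directories are still cleaned up on a signal.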
2019-05-23 xcpdb|compact: support --jobs/-j flag like gmake(1)
We don't have to be tied to the number of partitions in case we made a bad choice at initialization. This doesn't affect reindexing, but the copying phase is already intensive. And optimize away the extra process when we only have a single job which won't parallelize. The wording for the (v2) reindexing phase could be improved, later. I also plan to allow repartitioning of existing Xapian DBs.
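A rough Python sketch of the same scheduling idea follows. It uses threads as a stand-in for the separate processes the commit refers to, and `run_partitions` is a hypothetical helper, not a public-inbox function:

```python
from concurrent.futures import ThreadPoolExecutor


def run_partitions(partitions, work, jobs=None):
    """Run work(part) for every partition, at most `jobs` at a time."""
    if jobs is None:
        jobs = len(partitions)          # default: one worker per partition
    jobs = max(1, min(jobs, len(partitions) or 1))
    if jobs == 1:
        # A single job cannot parallelize, so skip the worker pool entirely.
        return [work(p) for p in partitions]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(work, partitions))
```

The job count is independent of the partition count: four partitions with `jobs=2` are processed two at a time, and results come back in partition order.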
2019-05-23 xapcmd: cleanup on interrupted xcpdb "--compact"
We should not have leftover junk on interrupted invocations.
2019-05-23 xcpdb|compact: support some xapian-compact switches
Allow users to specify the --blocksize <B>, --no-full, --fuller options for xapian-compact(1) for fine-tuning compact behavior for low-traffic/inactive inboxes. We also won't support --multipass, since it doesn't seem compatible with our requirement to use --no-renumber. We also won't support --single-file, since it only seems intended for totally dead inboxes; and it doesn't seem worth the support overhead when "totally dead" turns out to be a misdiagnosis.
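As a hedged illustration, a wrapper might assemble the xapian-compact(1) command line like this. The function `compact_argv` is hypothetical, and rejecting the combination of `--no-full` and `--fuller` is a choice made for this sketch, not documented behavior of either tool:

```python
def compact_argv(src, dst, blocksize=None, no_full=False, fuller=False):
    """Build an argument list for xapian-compact(1) with pass-through switches."""
    if no_full and fuller:
        # Sketch-only policy: refuse contradictory compaction levels.
        raise ValueError("--no-full and --fuller are mutually exclusive here")
    argv = ["xapian-compact", "--no-renumber"]   # document IDs must be preserved
    if blocksize is not None:
        argv.append(f"--blocksize={blocksize}")
    if no_full:
        argv.append("--no-full")
    if fuller:
        argv.append("--fuller")
    return argv + [src, dst]
```

`--multipass` and `--single-file` are deliberately absent, matching the switches the commit message says are unsupported.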
2019-05-23 compact: reuse infrastructure from xcpdb
Since -xcpdb is a superset of -compact, we can reuse much of that code used for driving compact. For compact (only), this is slightly less memory efficient since it requires an extra process per-partition, but we get to prefix the output with the partition name for more readable output.
2019-05-23 xcpdb: remove temporary directories on aborts
Clean up temporary directories on common termination signals (INT, HUP, PIPE, TERM), but only if the directory is not in the process of being committed via a rename() sequence.
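The guard against removing a directory that is mid-commit can be sketched in Python as below. The class `TmpDir`, its `committing` flag, and `install_handlers` are hypothetical names for illustration; the real implementation is Perl and may differ:

```python
import shutil
import signal
import tempfile


class TmpDir:
    """Temporary directory that refuses cleanup while being committed."""

    def __init__(self):
        self.path = tempfile.mkdtemp()
        self.committing = False   # set to True just before the rename() sequence

    def cleanup(self):
        if not self.committing:
            shutil.rmtree(self.path, ignore_errors=True)


def install_handlers(tmp):
    """Remove tmp on common termination signals, then exit."""
    def on_signal(signum, frame):
        tmp.cleanup()
        raise SystemExit(128 + signum)

    for name in ("SIGINT", "SIGHUP", "SIGPIPE", "SIGTERM"):
        sig = getattr(signal, name, None)   # some signals are absent on Windows
        if sig is not None:
            signal.signal(sig, on_signal)
```

Once `committing` is set, an interrupt leaves the directory alone, since part of its contents may already have been renamed into place.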
2019-05-23 xcpdb: show re-indexing progress
Emit information about reindexing git revision ranges when used with xcpdb. Additionally, distinguish Xapian copy output from v2 git epoch counting by increasing directory context info. For now, v1 batches are emitted. v2 indexing is still missing progress reporting for batches, as the data structures for reindexing would benefit from a refactoring, first. This does not currently affect the use of public-inbox-index, but may in the future.
2019-05-23 xapcmd: use "print STDERR" for progress reporting
`warn' is reserved for actual warnings, as it respects $SIG{__WARN__} and we rely on that override to print message context information when we are indexing.
2019-05-23 doc: xcpdb: update to reflect the current state
It is no longer a wrapper around copydatabase(1), since copydatabase did not recover from DatabaseModifiedError.
2019-05-23 xapcmd: avoid EXDEV when finalizing changes
By creating temporary directories as deep as possible, we can allow v2 repositories to have `xap$SCHEMA_VERSION' (e.g. `xap15') reside on a separate FS. We also check st_dev ahead-of-time to avoid doing work which will fail with EXDEV. Of course, another process may still move/change things around.
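Both halves of that approach can be sketched in Python: create the temporary directory next to its target, and compare st_dev before attempting a rename. The helper names `tmpdir_beside` and `same_filesystem` are hypothetical:

```python
import os
import tempfile


def tmpdir_beside(target: str) -> str:
    """Create a temp dir in the same parent directory as target."""
    target = os.path.abspath(target)
    parent = os.path.dirname(target)
    prefix = os.path.basename(target) + ".tmp-"
    return tempfile.mkdtemp(prefix=prefix, dir=parent)


def same_filesystem(a: str, b: str) -> bool:
    """True if a and b live on the same device, so rename() will not hit EXDEV."""
    return os.stat(a).st_dev == os.stat(b).st_dev
```

The check is only advisory: as the commit message notes, another process may still move things between the check and the rename.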
2019-05-23 xcpdb: cleanup error handling and diagnosis
Running a full "public-inbox-index --reindex" in parallel with "public-inbox-xcpdb" on the same inbox can still cause problems, though.
2019-05-23 xcpdb: implement progress reporting
Copying an entire Xapian DB is horribly slow whether it's done via Perl or copydatabase(1). So displaying some progress indication is good for user experience. While we're at it, prefix xapian-compact output, too; since parallel processes end up clobbering each other.
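Prefixing each line of a child process's output, so that parallel runs stay readable, might look like this in Python. The function `run_prefixed` and the prefix format are assumptions for illustration, not the format public-inbox actually prints:

```python
import subprocess
import sys


def run_prefixed(prefix, argv, out=sys.stderr):
    """Run argv and copy its output to `out`, one prefixed line at a time."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr so both streams get the prefix
        text=True,
    )
    with proc.stdout:
        for line in proc.stdout:
            out.write(f"{prefix}: {line}")
    return proc.wait()
```

Writing whole lines keeps each process's output attributable to its partition, instead of the processes clobbering each other's text.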
2019-05-23 xcpdb: use fine-grained locking
Copying an entire Xapian DB takes a long time, so update our reindexing code to support partial reindexing, snapshot the pre-copydatabase git revisions, perform the lengthy copy, and do a partial reindex when the copy + renames are done.
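The sequence described above (snapshot, long copy without the lock, then catch up) can be sketched in Python. Every callable passed in here is hypothetical, and exactly which steps run under the lock in the real Perl code is an assumption of this sketch:

```python
import threading


def copy_then_catch_up(lock, get_head, copy_db, swap, reindex_range):
    """Copy a DB without holding the lock, then reindex what arrived meanwhile."""
    with lock:
        start = get_head()          # snapshot revisions under a brief lock
    copy_db()                       # lengthy copy; writers are not blocked
    with lock:
        swap()                      # rename the new DB into place
        end = get_head()
        if end != start:
            reindex_range(start, end)   # partial reindex of the missed range
    return start, end
```

The lock is held only for the snapshot and for the final swap plus catch-up, both of which should be short compared with the copy itself.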
2019-05-23 v2writable: hoist out log_range sub for readability
This is preparation to support partial reindexing.
2019-05-23 xapcmd: xcpdb supports compaction
To minimize the delay on active inboxes, it's actually ideal to run xapian-compact at the end of the per-partition cpdb process; since the new DB isn't accessible yet and so we don't have to deal with lock contention with -mda or -watch processes. The downside is temporary file overhead (3x instead of 2x) required.
2019-05-23 xcpdb: implement using Perl bindings
By avoiding copydatabase(1) entirely, we can make further changes to avoid locking the entire inbox for a long operation and switch to fine-grained locking.
2019-05-23 admin: move index_inbox over
We will be reindexing after copydatabase
2019-05-23 xapcmd: do not cleanup on errors
We move the old directory into the new directory, so avoid the situation where a bug or error could cause the tempdir cleanup to run and destroy both our old and new directories.
2019-05-23 xcpdb: new tool which wraps Xapian's copydatabase(1)
copydatabase(1) is an existing Xapian tool which is the recommended way to upgrade existing DBs to the latest Xapian database format (currently "glass" for stable/released versions). Our use of Xapian relies on preserving document IDs, so we'll wrap it like we do xapian-compact(1) and use the "--no-renumber" switch. I could not name the tool "public-inbox-copydatabase" since it would be ambiguous as to which DB it's actually copying. So, I abbreviated the suffix to "xcpdb" (Xapian CoPy DataBase), which I hope is acceptable and unambiguous.
2019-05-23 xapcmd: support spawn options
copydatabase(1) is exceptionally noisy and its output is confusing when run in parallel. Support redirects at least, and env while we're at it to give us future options. We can also stuff a -jobs parameter into the options to limit parallelism since it can be useful for low-priority upgrade jobs.
2019-05-23 admin: hoist out resolve_inboxes for -compact and -index
Both of these index-affecting commands should work similarly on the command-line. public-inbox-index no longer complains about unconfigured ~/.public-inbox/config; but often I found myself being annoyed by that, anyways...
2019-05-23 xapcmd: new module for wrapping Xapian commands
Port public-inbox-compact(1) over to using it, and we will need to wrap copydatabase(1) to ease glass migrations, too.
2019-05-23 search: reenable phrase search on non-chert Xapian
This is assuming nobody uses flint or earlier, anymore; as flint predates the existence of this project.
2019-05-23 doc: document the reason for --no-renumber
We're going to need copydatabase, too
2019-05-23 v1writable: retire in favor of InboxWritable
In retrospect, introducing V1Writable was unnecessary and InboxWritable->importer is in a better position to abstract away differences between v1 and v2 writers. So teach InboxWritable to initialize inboxes and get rid of V1Writable.
2019-05-23 t/convert-compact: skip on missing xapian-compact(1)
Can't run the test if the required Xapian tools are missing.
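The same skip-when-missing pattern, expressed with Python's unittest rather than the project's Perl test suite, could look like this. The test class and its single check are hypothetical placeholders:

```python
import shutil
import unittest

# Look the tool up in PATH once; None means it is not installed.
XAPIAN_COMPACT = shutil.which("xapian-compact")


@unittest.skipUnless(XAPIAN_COMPACT, "xapian-compact(1) not found in PATH")
class ConvertCompactTest(unittest.TestCase):
    def test_tool_is_available(self):
        # Only runs when the tool was found, so the path must be set.
        self.assertIsNotNone(XAPIAN_COMPACT)
```

A skipped test is reported as skipped rather than failed, so a missing optional tool does not break the suite.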
2019-05-22 Merge branch 'ds-cleanup'
* ds-cleanup:
  DS: warn on deprecations
  DS: remove IPPROTO_TCP import
  DS: drop $VERSION var
  DS: remove support OtherFds code
  DS: get rid of unused methods and aliases
2019-05-22 DS: warn on deprecations
Enabling deprecation warnings didn't seem to have any noticeable effects with "perl -w -c", so whatever reason Danga had for it is long irrelevant.
2019-05-22 DS: remove IPPROTO_TCP import
Unlike Danga::Socket, we do not support TCP_CORK, either
2019-05-22 DS: drop $VERSION var
It was only relevant to Danga::Socket.
2019-05-22 DS: remove support OtherFds code
It's easy enough to wrap FDs in classes that can use all of the functionality of the event loop, not just the read-only interface AddOtherFds provided.
2019-05-22 DS: get rid of unused methods and aliases
"make syntax" is clean, now
2019-05-22 usercontent: stop relying on autodie
It's a non-standard package on CentOS-7, actually; and we shouldn't bloat the PSGI server by loading a module which isn't strictly needed.
2019-05-22 ci: support CentOS-7
Tested on an amd64 chroot built with rinse 3.4
2019-05-22 t/search*: require DBI and DBD::SQLite, too
None of the Search::Xapian-dependent stuff works without DBI and DBD::SQLite. There are no plans to support Xapian w/o DBD::SQLite since SQLite is more common and less resource-intensive than Xapian.