about summary refs log tree commit homepage
path: root/script/public-inbox-index
DateCommit message (Collapse)
2018-04-07index: allow specifying --jobs=0 to disable multiprocess
Not everybody needs multiprocess support.
2018-04-04v2: support incremental indexing + purge
This is important for people running mirrors via "git fetch", as they need to be kept up-to-date. Purging is also now supported in mirrors. The short-lived "--regenerate" option is gone and is now implicitly enabled as a result. It's still cheap when article number regeneration is unnecessary, as we track the range for each git repository.
2018-03-22v2writable: add NNTP article number regeneration support
Allow best-effort regeneration of NNTP article numbers from cloned git repositories in addition to indexing Xapian Article numbers will not remain consistent when we add purge support, though.
2018-03-22v2writable: support reindexing Xapian
This still requires a msgmap.sqlite3 file to exist, but it allows us to tweak Xapian indexing rules and reindex the Xapian database online while -watch is running.
2018-03-19index: s/GIT_DIR/REPO_DIR/
No functional changes, yet, but this makes future changes easier-to-read.
2018-02-07update copyrights for 2018
Using update-copyrights from gnulib While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2016-11-18index: allow indexing before configuration
One may build the initial index on a powerful host and transfer it to a weaker one for incremental indexing. Thus there is no requirement to have a configured public-inbox for building the index unless a user needs altid support or some such.
2016-08-11search: support alt-ID for mapping legacy serial numbers
For some existing mailing list archives, messages are identified by serial number (such as NNTP article numbers in gmane). Those links may become inaccessible (as is the current case for gmane), so ensure users can still search based on old serial numbers. Now, I run the following periodically to get article numbers from gmane (while news.gmane.org remains): NNTPSERVER=news.gmane.org export NNTPSERVER GROUP=gmane.comp.version-control.git perl -I lib scripts/xhdr-num2mid $GROUP --msgmap=/path/to/gmane.sqlite3 (I might integrate this further with public-inbox-* scripts one day). My ~/.public-inbox/config as an added "altid" snippet which now looks like this: [publicinbox "git"] address = git@vger.kernel.org mainrepo = /path/to/git.vger.git newsgroup = inbox.comp.version-control.git ; relative pathnames expand to $mainrepo/public-inbox/$file altid = serial:gmane:file=gmane.sqlite3 And run "public-inbox-index --reindex /path/to/git.vger.git" periodically. This ought to allow searching for "gmane:12345" to work for Xapian-enabled instances. Disclaimer: while public-inbox supports NNTP and stable article serial numbers, use of those for public links is discouraged since it encourages centralization.
2016-07-31search: support reindexing existing search indices
This should make tweaking the way we search more efficiet by allowing us to avoid doubling destroying the index every time we want to change something. We also give priority to incremental indexing via public-inbox-{watch,mda} and have manual invocations of public-inbox-index perform batch updates while releasing ssoma.lock.
2016-04-28import: run git-update-server-info when done
We should update $GIT_DIR/info/refs for dumb HTTP clients whenever we make changes to the repository. The best place to update is immediately after making commits. This fixes a bug where public-inbox-learn did not properly update $GIT_DIR/info/refs after inserting or removing messages.
2016-02-27move executables to script/ directory
This seems to match more closely with what is expected of Perl packages based on how blib is used. Hopefully makes the top-level source tree less cluttered and things easier-to-find.