PUBLIC-INBOX-EXTINDEX(1)   public-inbox user manual   PUBLIC-INBOX-EXTINDEX(1)

NAME
       public-inbox-extindex - create and update external search indices

SYNOPSIS
       public-inbox-extindex [OPTIONS] EXTINDEX_DIR INBOX_DIR...

       public-inbox-extindex [OPTIONS] [EXTINDEX_DIR] --all

DESCRIPTION
       public-inbox-extindex creates and updates an external search and
       overview database used by the read-only public-inbox PSGI (HTTP), NNTP,
       and IMAP interfaces.  This requires either the Search::Xapian XS
       bindings OR the Xapian SWIG bindings, along with DBD::SQLite and DBI
       Perl modules.

OPTIONS
       -j JOBS
       --jobs=JOBS
           ... TODO, see public-inbox-index(1)

       --all
           Index all "publicinbox" entries in "PI_CONFIG".

           "publicinbox" entries indexed by "public-inbox-extindex" can have
           full Xapian searching abilities with the per-"publicinbox"
           "indexlevel" set to "basic" and their respective Xapian ("xap15" or
           "xapian15") directories removed.  For multiple public-inboxes where
           cross-posting is common, this allows significant space savings on
           Xapian indices.

       --gc
           Perform garbage collection instead of indexing.  Use this if
           inboxes are removed from the extindex, or if messages are purged or
           removed from some inboxes.

       --reindex
           Forces a re-index of all messages in the extindex.  This can be
           used for in-place upgrades and bugfixes while read-only server
           processes are utilizing the index.  Keep in mind this roughly
           doubles the size of the already-large Xapian database.

           The extindex locks will be released roughly every 10s to allow
           public-inbox-mda(1) and public-inbox-watch(1) processes to write to
           the extindex.

       --fast
           Used with "--reindex", it will only look for new and stale entries
           and not touch already-indexed messages.
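
       As a rough sketch of how the options above combine, the following
       runs an incremental reindex over all configured inboxes.  The path
       is a placeholder, and the "command -v" guard is an addition for
       illustration so the snippet does nothing harmful on a machine
       without public-inbox installed.

```shell
# Placeholder path; substitute the real extindex directory.
EXTINDEX_DIR=/path/to/extindex_dir

if command -v public-inbox-extindex >/dev/null 2>&1; then
	# --fast limits --reindex to new and stale entries.
	public-inbox-extindex --reindex --fast "$EXTINDEX_DIR" --all
else
	echo "public-inbox-extindex not found in PATH" >&2
fi
```

       Dropping "--fast" from the command would force every message in
       the extindex to be re-indexed, as described under "--reindex".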

FILES
       public-inbox-extindex-format(5)

CONFIGURATION
       public-inbox-extindex does not currently write to the
       public-inbox-config(5) file; configuration may be entered manually.
       The extindex name of "all" is a special case which corresponds to
       indexing "--all" inboxes.  An example for "--all" is as follows:

               [extindex "all"]
                       topdir = /path/to/extindex_dir
                       url = all
                       coderepo = foo
                       coderepo = bar

       See public-inbox-config(5) for more details.
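
       One way to enter these settings by hand is with git-config(1),
       assuming the config file uses git-config syntax.  The sketch below
       writes to a scratch file rather than the real config; the paths,
       the coderepo names, and the inbox name "example" are placeholders.
       The last setting applies the "indexlevel" change described under
       "--all" to that hypothetical inbox.

```shell
# Scratch file for illustration; point PI_CONFIG at the real
# config file instead to apply the settings for real.
PI_CONFIG=$(mktemp)
export PI_CONFIG

# Creates the [extindex "all"] section; topdir is a placeholder path.
git config --file "$PI_CONFIG" extindex.all.topdir /path/to/extindex_dir
git config --file "$PI_CONFIG" extindex.all.url all
# coderepo appears twice in the example, so append each value with --add.
git config --file "$PI_CONFIG" --add extindex.all.coderepo foo
git config --file "$PI_CONFIG" --add extindex.all.coderepo bar

# "example" is a hypothetical publicinbox name.
git config --file "$PI_CONFIG" publicinbox.example.indexlevel basic

# Print the resulting extindex entries for review.
git config --file "$PI_CONFIG" --get-regexp '^extindex\.all\.'
```

       Reading the values back with "--get-regexp" is a cheap check that
       the section and key names were typed as intended before running
       the indexer.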

ENVIRONMENT
       PI_CONFIG
               Used to override the default "~/.public-inbox/config" value.

       XAPIAN_FLUSH_THRESHOLD
               The number of documents to update before committing changes to
               disk.  This environment variable is handled directly by Xapian;
               refer to the Xapian API documentation for more details.

               Setting "XAPIAN_FLUSH_THRESHOLD" or
               "publicinbox.indexBatchSize" for a large "--reindex" may cause
               public-inbox-mda(1), public-inbox-learn(1) and
               public-inbox-watch(1) tasks to wait for long and unpredictable
               periods of time while it runs.

               Default: none, uses "publicinbox.indexBatchSize"

UPGRADING
       Occasionally, public-inbox will update its schema version and require
       a full index by running this command.

CONTACT
       Feedback welcome via plain-text mail to <mailto:meta@public-inbox.org>

       The mail archives are hosted at <https://public-inbox.org/meta/> and
       <http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>

COPYRIGHT
       Copyright 2021 all contributors <mailto:meta@public-inbox.org>

       License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>

SEE ALSO
       Search::Xapian, DBD::SQLite

public-inbox.git                  1993-10-02          PUBLIC-INBOX-EXTINDEX(1)