$ git show v1.7.0:Documentation/public-inbox-extindex.pod	# shows this blob on the CLI

=head1 NAME

public-inbox-extindex - create and update external search indices

=head1 SYNOPSIS

public-inbox-extindex [OPTIONS] EXTINDEX_DIR INBOX_DIR...

public-inbox-extindex [OPTIONS] [EXTINDEX_DIR] --all

=head1 DESCRIPTION

public-inbox-extindex creates and updates an external search and
overview database used by the read-only public-inbox PSGI (HTTP),
NNTP, and IMAP interfaces.  This requires either the
L<Search::Xapian> XS bindings OR the L<Xapian> SWIG bindings,
along with L<DBD::SQLite> and L<DBI> Perl modules.

=head1 OPTIONS

=over

=item -j JOBS

=item --jobs=JOBS

... TODO, see L<public-inbox-index(1)>

=item --all

Index all C<publicinbox> entries in C<PI_CONFIG>.

C<publicinbox> entries indexed by C<public-inbox-extindex> can
have full Xapian searching abilities with the per-C<publicinbox>
C<indexlevel> set to C<basic> and their respective Xapian
(C<xap15> or C<xapian15>) directories removed.  For multiple
public-inboxes where cross-posting is common, this allows
significant space savings on Xapian indices.

=item --gc

Perform garbage collection instead of indexing.  Use this if
inboxes are removed from the extindex, or if messages are
purged or removed from some inboxes.

=item --reindex

Forces a re-index of all messages in the extindex.  This can be
used for in-place upgrades and bugfixes while read-only server
processes are using the index.  Keep in mind that this roughly
doubles the size of the already-large Xapian database.

The extindex locks will be released roughly every 10s to
allow L<public-inbox-mda(1)> and L<public-inbox-watch(1)>
processes to write to the extindex.

=item --fast

Used with C<--reindex>, it will only look for new and stale
entries and not touch already-indexed messages.

=back
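
As a minimal sketch of the C<--all> arrangement described above, a
per-inbox section that relies on the extindex for Xapian search might
look like the following.  The inbox name C<example>, the path, the
address and the URL are hypothetical placeholders; only the
C<indexlevel = basic> setting comes from the description of C<--all>:

	; hypothetical inbox, values are placeholders
	[publicinbox "example"]
		inboxdir = /path/to/example
		address = example@example.com
		url = https://example.com/example
		; full Xapian search is provided by the extindex instead
		indexlevel = basic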

=head1 FILES

L<public-inbox-extindex-format(5)>

=head1 CONFIGURATION

public-inbox-extindex does not currently write to the
L<public-inbox-config(5)> file; configuration must be entered
manually.  The extindex name of C<all> is a special case which
corresponds to indexing C<--all> inboxes.  An example for
C<--all> is as follows:

	[extindex "all"]
		topdir = /path/to/extindex_dir
		url = all
		coderepo = foo
		coderepo = bar

See L<public-inbox-config(5)> for more details.

=head1 ENVIRONMENT

=over 8

=item PI_CONFIG

Used to override the default "~/.public-inbox/config" value.

=item XAPIAN_FLUSH_THRESHOLD

The number of documents to update before committing changes to
disk.  This environment variable is handled directly by Xapian;
refer to the Xapian API documentation for more details.

Setting C<XAPIAN_FLUSH_THRESHOLD> or
C<publicinbox.indexBatchSize> for a large C<--reindex> may cause
L<public-inbox-mda(1)>, L<public-inbox-learn(1)> and
L<public-inbox-watch(1)> tasks to wait for long and unpredictable
periods of time during C<--reindex>.

Default: none, uses C<publicinbox.indexBatchSize>

=back
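
For illustration only, the C<publicinbox.indexBatchSize> setting
mentioned under C<XAPIAN_FLUSH_THRESHOLD> is entered in the config
file rather than the environment.  The value below is an arbitrary
placeholder and not a recommendation; see L<public-inbox-config(5)>
for its meaning and default:

	[publicinbox]
		; placeholder value, tune for your own hardware
		indexBatchSize = 1m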

=head1 UPGRADING

Occasionally, public-inbox will update its schema version and
require a full index by running this command.

=head1 CONTACT

Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>

The mail archives are hosted at L<https://public-inbox.org/meta/> and
L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>

=head1 COPYRIGHT

Copyright 2021 all contributors L<mailto:meta@public-inbox.org>

License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>

=head1 SEE ALSO

L<Search::Xapian>, L<DBD::SQLite>

git clone https://public-inbox.org/public-inbox.git
git clone http://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/public-inbox.git