=head1 NAME
public-inbox-extindex - create and update external search indices
=head1 SYNOPSIS
public-inbox-extindex [OPTIONS] EXTINDEX_DIR INBOX_DIR...
public-inbox-extindex [OPTIONS] [EXTINDEX_DIR] --all
=head1 DESCRIPTION
public-inbox-extindex creates and updates an external search and
overview database used by the read-only public-inbox PSGI (HTTP),
NNTP, and IMAP interfaces. This requires either the
L<Search::Xapian> XS bindings OR the L<Xapian> SWIG bindings,
along with L<DBD::SQLite> and L<DBI> Perl modules.
=head1 OPTIONS
=over
=item -j JOBS
=item --jobs=JOBS
... TODO, see L<public-inbox-index(1)>
=item --all
Index all C<publicinbox> entries in C<PI_CONFIG>.
C<publicinbox> entries indexed by C<public-inbox-extindex> can
have full Xapian searching abilities with the per-C<publicinbox>
C<indexlevel> set to C<basic> and their respective Xapian
(C<xap15> or C<xapian15>) directories removed. For multiple
public-inboxes where cross-posting is common, this allows
significant space savings on Xapian indices.
=item --gc
Perform garbage collection instead of indexing. Use this if
inboxes are removed from the extindex, or if messages are
purged or removed from some inboxes.
=item --reindex
Forces a re-index of all messages in the extindex. This can be
used for in-place upgrades and bugfixes while read-only server
processes are utilizing the index. Keep in mind this roughly
doubles the size of the already-large Xapian database.
The extindex locks will be released roughly every 10s to
allow L<public-inbox-mda(1)> and L<public-inbox-watch(1)>
processes to write to the extindex.
=item --fast
Used with C<--reindex>, it will only look for new and stale
entries and not touch already-indexed messages.
=back
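As a rough illustration of how the options above combine, the following
sketch prints (rather than runs) a typical maintenance cycle. The C<run>
helper and the C<DRY_RUN> and C<EXTINDEX_DIR> variables are hypothetical
conveniences for this example and are not part of public-inbox; the
argument order follows the SYNOPSIS.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of public-inbox: prints each command
# when DRY_RUN=1 (the default here) and executes it otherwise.
DRY_RUN=${DRY_RUN:-1}
# Placeholder path; substitute the real extindex directory.
EXTINDEX_DIR=${EXTINDEX_DIR:-/path/to/extindex_dir}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Index every publicinbox entry found in PI_CONFIG.
run public-inbox-extindex "$EXTINDEX_DIR" --all

# After removing an inbox or purging messages, collect garbage
# instead of indexing.
run public-inbox-extindex --gc "$EXTINDEX_DIR" --all

# Look only for new and stale entries, leaving already-indexed
# messages untouched.
run public-inbox-extindex --reindex --fast "$EXTINDEX_DIR" --all
```

Setting C<DRY_RUN=0> in the environment would make the same script
execute the commands instead of printing them.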
=head1 FILES
L<public-inbox-extindex-format(5)>
=head1 CONFIGURATION
public-inbox-extindex does not currently write to the
L<public-inbox-config(5)> file; configuration may be entered
manually. The extindex name of C<all> is a special case which
corresponds to indexing C<--all> inboxes. An example for
C<--all> is as follows:
[extindex "all"]
topdir = /path/to/extindex_dir
url = all
coderepo = foo
coderepo = bar
See L<public-inbox-config(5)> for more details.
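Since the config file uses the git-config format, one way to enter the
section above without hand-editing is L<git-config(1)>. This is only a
sketch: the C<cfg> variable is hypothetical and points at a scratch file
here so that no real configuration is modified, and the values are the
placeholders from the example above.

```shell
#!/bin/sh
# Assumption for this sketch: write to a scratch file. For real use,
# point cfg at the file named by PI_CONFIG (or ~/.public-inbox/config).
cfg=$(mktemp)

# Single-valued keys in the [extindex "all"] section.
git config -f "$cfg" extindex.all.topdir /path/to/extindex_dir
git config -f "$cfg" extindex.all.url all

# coderepo may appear more than once, so append with --add.
git config -f "$cfg" --add extindex.all.coderepo foo
git config -f "$cfg" --add extindex.all.coderepo bar

# Show the resulting section for review.
git config -f "$cfg" --get-regexp '^extindex\.all\.'
```

Using C<--add> for C<coderepo> keeps both values; a plain assignment
would refuse to overwrite a key that already has multiple values.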
=head1 ENVIRONMENT
=over 8
=item PI_CONFIG
Used to override the default "~/.public-inbox/config" value.
=item XAPIAN_FLUSH_THRESHOLD
The number of documents to update before committing changes to
disk. This environment variable is handled directly by Xapian; refer
to the Xapian API documentation for more details.
Setting C<XAPIAN_FLUSH_THRESHOLD> or
C<publicinbox.indexBatchSize> for a large C<--reindex> may cause
L<public-inbox-mda(1)>, L<public-inbox-learn(1)> and
L<public-inbox-watch(1)> tasks to wait for long and unpredictable
periods of time during C<--reindex>.
Default: none, uses C<publicinbox.indexBatchSize>
=back
=head1 UPGRADING
Occasionally, public-inbox will update its schema version and
require a full index by running this command.
=head1 CONTACT
Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
The mail archives are hosted at L<https://public-inbox.org/meta/> and
L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
=head1 COPYRIGHT
Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
=head1 SEE ALSO
L<Search::Xapian>, L<DBD::SQLite>