From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 65E911F462;
	Thu, 23 May 2019 10:37:38 +0000 (UTC)
Date: Thu, 23 May 2019 10:37:38 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 27/26] doc: various updates to reflect current state
Message-ID: <20190523103738.GA24435@dcvr>
References: <20190523093704.18367-1-e@80x24.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190523093704.18367-1-e@80x24.org>
List-Id:

-index documentation avoids redundant v1 information and
refers readers to appropriate v1/v2 manpages.

Search::Xapian can also be optional, now, as only the PSGI
search interface uses it.

Favor "INBOX_DIR" where appropriate, since "REPO_DIR" can be
confused for code repos which we also support.

XAPIAN_FLUSH_THRESHOLD is documented for all relevant bulk
commands.
---
 Documentation/public-inbox-compact.pod   | 25 ++++----
 Documentation/public-inbox-index.pod     | 80 +++++++++---------------
 Documentation/public-inbox-v1-format.pod | 12 +++-
 Documentation/public-inbox-v2-format.pod |  5 +-
 Documentation/public-inbox-xcpdb.pod     |  5 +-
 lib/PublicInbox/Inbox.pm                 |  2 +-
 script/public-inbox-compact              |  2 +-
 script/public-inbox-index                |  4 +-
 script/public-inbox-init                 |  2 +-
 9 files changed, 64 insertions(+), 73 deletions(-)

diff --git a/Documentation/public-inbox-compact.pod b/Documentation/public-inbox-compact.pod
index 4a519ce..7d37f6f 100644
--- a/Documentation/public-inbox-compact.pod
+++ b/Documentation/public-inbox-compact.pod
@@ -9,15 +9,12 @@ public-inbox-compact - compact Xapian DBs
 =head1 DESCRIPTION

 public-inbox-compact is a wrapper for L
-designed for "v2" inboxes. It combines multiple Xapian
-partitions into one to reduce space overhead after an initial
-mass import (using multiple partitions) is done.
+which locks the inbox and prevents other processes such as
+L or L from
+writing while it operates.

-It locks the inbox and prevents other processes such as
-L from writing while it operates.
-
-It also supports "v1" (ssoma) inboxes with limited
-usefulness over L
+It enforces the use of the C<--no-renumber> option of
+L

 =head1 ENVIRONMENT

@@ -28,9 +25,15 @@ usefulness over L
 The default config file, normally "~/.public-inbox/config".
 See L

-=back
+=item XAPIAN_FLUSH_THRESHOLD
+
+The number of documents to update before committing changes to
+disk. This environment is handled directly by Xapian, refer to
+Xapian API documentation for more details.

-=head1 UPGRADING
+Default: 10000
+
+=back

 =head1 CONTACT

@@ -41,7 +44,7 @@ and L

 =head1 COPYRIGHT

-Copyright 2018 all contributors L
+Copyright 2018-2019 all contributors L

 License: AGPL-3.0+ L

diff --git a/Documentation/public-inbox-index.pod b/Documentation/public-inbox-index.pod
index acc9039..2e0ff69 100644
--- a/Documentation/public-inbox-index.pod
+++ b/Documentation/public-inbox-index.pod
@@ -4,14 +4,15 @@ public-inbox-index - create and update search indices

 =head1 SYNOPSIS

-public-inbox-index [OPTIONS] REPO_DIR
+public-inbox-index [OPTIONS] INBOX_DIR

 =head1 DESCRIPTION

-public-inbox-index creates and updates the search and NNTP
-article number database used by the read-only public-inbox HTTP
-and NNTP interfaces. Currently, this requires L
-and L and L Perl modules.
+public-inbox-index creates and updates the search, overview and
+NNTP article number database used by the read-only public-inbox
+HTTP and NNTP interfaces. Currently, this requires
+L and L Perl modules. L
+is optional, only to support the PSGI search interface.

 Once the initial indices are created by public-inbox-index,
 L and L will
@@ -22,10 +23,10 @@ relying on L to mirror an existing public-inbox;
 or if upgrading to a new version of public-inbox using
 the C<--reindex> option.

-Having a search and article number database is essential to
+Having the overview and article number database is essential to
 running the NNTP interface, and strongly recommended for the
-HTTP interface as it provides thread grouping in addition
-to normal search functionality.
+HTTP interface as it provides thread grouping in addition to
+normal search functionality.

 =head1 OPTIONS

@@ -45,50 +46,11 @@ This does not touch the NNTP article number database.

 =head1 FILES

+For v1 (ssoma) repositories described in L.
 All public-inbox-specific files are contained within the
-C<$REPO_DIR/public-inbox/> directory. All files are expected to
-grow in size as more messages are archived, so using compaction
-commands (e.g. L) is not recommended unless
-the list is no longer active.
+C<$GIT_DIR/public-inbox/> directory.

-=over
-
-=item $REPO_DIR/public-inbox/msgmap.sqlite3
-
-The stable NNTP article number to Message-ID mapping is
-stored in an SQLite3 database.
-
-This is required for users of L, but
-users of the L interface will find it
-useful for attempting recovery from copy-paste truncations of
-URLs containing long Message-IDs.
-
-Avoid removing this file and regenerating it; it may cause
-existing NNTP readers to lose sync and miss (or see duplicate)
-messages.
-
-This file is relatively small, and typically less than 5%
-of the space of the mail stored in a packed git repository.
-
-=item $REPO_DIR/public-inbox/xapian*
-
-The database used by L. This directory name is
-followed by a number indicating the index schema version this
-installation of public-inbox uses.
-
-These directories may be safely deleted or removed in full
-while the NNTP and HTTP interfaces are no longer accessing
-them.
-
-In addition to providing a search interface for the HTTP
-interface, the Xapian database is used to group and combine
-related messages into threads. For NNTP servers, it also
-provides a cache of metadata and header information often
-requested by NNTP clients.
-
-This directory is large, often two to three times the size of
-the objects stored in a packed git repository. Using the
-C<--reindex> option makes it larger, still.
+v2 repositories are described in L.

 =back

@@ -100,8 +62,24 @@ C<--reindex> option makes it larger, still.

 Used to override the default "~/.public-inbox/config" value.

+=item XAPIAN_FLUSH_THRESHOLD
+
+The number of documents to update before committing changes to
+disk. This environment is handled directly by Xapian, refer to
+Xapian API documentation for more details.
+
+Default: our indexing code flushes every megabyte of mail seen
+to keep memory usage low. Setting this environment variable to
+any positive value will switch to a document count-based
+threshold in Xapian.
+
 =back

+=head1 UPGRADING
+
+Occasionally, public-inbox will update its schema version and
+require a full index by running this command.
+
 =head1 CONTACT

 Feedback welcome via plain-text mail to L
@@ -111,7 +89,7 @@ and L

 =head1 COPYRIGHT

-Copyright 2016-2018 all contributors L
+Copyright 2016-2019 all contributors L

 License: AGPL-3.0+ L

diff --git a/Documentation/public-inbox-v1-format.pod b/Documentation/public-inbox-v1-format.pod
index 3b0e70e..c960913 100644
--- a/Documentation/public-inbox-v1-format.pod
+++ b/Documentation/public-inbox-v1-format.pod
@@ -104,6 +104,10 @@ SQLite3 database maintaining a stable mapping of Message-IDs to NNTP
 article numbers. Used by L and created
 and updated by L.

+Users of the L interface will find it
+useful for attempting recovery from copy-paste truncations of
+URLs containing long Message-IDs.
+
 Automatically updated by L,
 L and L.

@@ -135,8 +139,12 @@ the "overview" DB also exists in the xapian directory for v1
 repositories. See L

 Our use of the L requires Xapian document IDs to
-remain stable. Thus, use of L and
-L require the use of C<--no-renumber> switch.
+remain stable. Using L and
+L wrappers are recommended over tools
+provided by Xapian.
+
+This directory is large, often two to three times the size of
+the objects stored in a packed git repository.

 =item $GIT_DIR/ssoma.index

diff --git a/Documentation/public-inbox-v2-format.pod b/Documentation/public-inbox-v2-format.pod
index bc58074..65a85c1 100644
--- a/Documentation/public-inbox-v2-format.pod
+++ b/Documentation/public-inbox-v2-format.pod
@@ -118,8 +118,9 @@ large mail archives; but are fine for backup and usable for small
 instances.

 Our use of the L requires Xapian document IDs to
-remain stable. Thus, use of L and
-L require the use of C<--no-renumber> switch.
+remain stable. Using L and
+L wrappers are recommended over tools
+provided by Xapian.

 =head2 OVERVIEW DB

diff --git a/Documentation/public-inbox-xcpdb.pod b/Documentation/public-inbox-xcpdb.pod
index c47500b..5697dcd 100644
--- a/Documentation/public-inbox-xcpdb.pod
+++ b/Documentation/public-inbox-xcpdb.pod
@@ -1,6 +1,6 @@
 =head1 NAME

-public-inbox-xcpdb - copy Xapian DBs (for format upgrades)
+public-inbox-xcpdb - upgrade Xapian DB formats

 =head1 SYNOPSIS

@@ -16,7 +16,8 @@ L or L.

 This is intended for upgrading the database format used by
 Xapian. It DOES NOT upgrade the schema used by the
-public-inbox search interface (see L).
+public-inbox PSGI search interface (see
+L).

 =head1 ENVIRONMENT

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 0d86771..2771a24 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -225,7 +225,7 @@ sub description {
 	local $/ = "\n";
 	chomp $desc;
 	$desc =~ s/\s+/ /smg;
-	$desc = '($REPO_DIR/description missing)' if $desc eq '';
+	$desc = '($INBOX_DIR/description missing)' if $desc eq '';
 	$self->{description} = $desc;
 }

diff --git a/script/public-inbox-compact b/script/public-inbox-compact
index 4bdadfc..e8bf31e 100755
--- a/script/public-inbox-compact
+++ b/script/public-inbox-compact
@@ -8,7 +8,7 @@ use PublicInbox::InboxWritable;
 use PublicInbox::Xapcmd;
 use PublicInbox::Admin;
 PublicInbox::Admin::require_or_die('-index');
-my $usage = "Usage: public-inbox-compact REPO_DIR\n";
+my $usage = "Usage: public-inbox-compact INBOX_DIR\n";
 my $opt = { compact => 1, -coarse_lock => 1 };
 GetOptions($opt, @PublicInbox::Xapcmd::COMPACT_OPT) or
 	die "bad command-line args\n$usage";
diff --git a/script/public-inbox-index b/script/public-inbox-index
index b6e3052..40187b3 100755
--- a/script/public-inbox-index
+++ b/script/public-inbox-index
@@ -4,12 +4,12 @@
 # Basic tool to create a Xapian search index for a git repository
 # configured for public-inbox.
 # Usage with libeatmydata
-# highly recommended: eatmydata public-inbox-index REPO_DIR
+# highly recommended: eatmydata public-inbox-index INBOX_DIR

 use strict;
 use warnings;
 use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
-my $usage = "public-inbox-index REPO_DIR";
+my $usage = "public-inbox-index INBOX_DIR";
 use PublicInbox::Admin;
 PublicInbox::Admin::require_or_die('-index');

diff --git a/script/public-inbox-init b/script/public-inbox-init
index 5724c52..985a09f 100755
--- a/script/public-inbox-init
+++ b/script/public-inbox-init
@@ -5,7 +5,7 @@
 # Initializes a public-inbox, basically a wrapper for git-init(1)
 use strict;
 use warnings;
-my $usage = "public-inbox-init NAME REPO_DIR HTTP_URL ADDRESS [ADDRESS..]";
+my $usage = "public-inbox-init NAME INBOX_DIR HTTP_URL ADDRESS [ADDRESS..]";
 use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
 use PublicInbox::Admin;
 PublicInbox::Admin::require_or_die('-base');
-- 
EW