* [PATCH 2/2] t/convert-compact: avoid warning on `scalar(split(...))'
@ 2020-05-01 18:04 6% ` Eric Wong
0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2020-05-01 18:04 UTC (permalink / raw)
To: meta
Perl 5.10.1 would warn about implicit assignment to @_ by
split(). So favor the documented method of using `tr'
to count lines.
Fixes: b5ddcb3352ef31ae ("index: support --compact / -c on command-line")
---
t/convert-compact.t | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/t/convert-compact.t b/t/convert-compact.t
index ae299021..1627e019 100644
--- a/t/convert-compact.t
+++ b/t/convert-compact.t
@@ -124,7 +124,6 @@ $rdr->{2} = \(my $err2 = '');
$cmd = [ qw(-index --reindex -cc), "$tmpdir/v2" ];
ok(run_script($cmd, undef, $rdr), '--reindex -c -c');
like($err2, qr/xapian-compact/, 'xapian-compact ran (-c -c)');
-ok(scalar(split(/\n/, $err2)) > scalar(split(/\n/, $err)),
- '-compacted twice');
+ok(($err2 =~ tr/\n/\n/) > ($err =~ tr/\n/\n/), '-compacted twice');
done_testing();
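
For illustration, here is a minimal sketch of the `tr'-based line count the test now uses. The contents of $err and $err2 are made-up stand-ins for captured stderr, not real output from public-inbox-index. Replacing "\n" with itself leaves the string unchanged, and tr returns the number of characters it matched.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the stderr captured from two runs
my $err  = "indexing\nxapian-compact ran\n";
my $err2 = "xapian-compact ran\nindexing\nxapian-compact ran\n";

# tr/\n/\n/ maps newline to newline (a no-op on the string) and
# returns how many newlines it saw; @_ is never touched
my $n1 = ($err  =~ tr/\n/\n/);
my $n2 = ($err2 =~ tr/\n/\n/);

print "$n1 $n2\n"; # prints "2 3"
```

Unlike split(), tr counts newline characters rather than fields, so the two can differ by one when the last line lacks a trailing newline. For a "which run printed more" comparison like the one in this test, that difference should not matter.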
* [PATCH 0/2] index: support --compact / -c
@ 2020-03-28 0:56 7% Eric Wong
2020-03-28 0:56 6% ` [PATCH 2/2] index: support --compact / -c on command-line Eric Wong
0 siblings, 1 reply; 3+ results
From: Eric Wong @ 2020-03-28 0:56 UTC (permalink / raw)
To: meta
It looks like HDDs and SSDs have gotten and will get even more
expensive due to manufacturing freezes from the pandemic.
Indexing (especially with --reindex to fix up old bugs) takes a
large amount of space, so support running compact immediately
after indexing to avoid users having to script a -compact
invocation for each inbox. Compacting before indexing can be
triggered by using this switch twice, to further reduce space
overhead at a small time loss.
Note: I only found the bug fixed in 1/2 while testing 2/2. It
took me a while to fix this bug because I've probably lost 10
IQ points from the stress of recent weeks :<
Eric Wong (2):
searchidxshard: ensure we set indexlevel on shard[0]
index: support --compact / -c on command-line
Documentation/public-inbox-index.pod | 24 ++++++++++++++++++++----
lib/PublicInbox/InboxWritable.pm | 1 +
lib/PublicInbox/SearchIdx.pm | 26 +++++++++++++++++---------
lib/PublicInbox/SearchIdxShard.pm | 4 +++-
lib/PublicInbox/Xapcmd.pm | 4 +++-
script/public-inbox-index | 20 +++++++++++++++++---
t/convert-compact.t | 13 +++++++++++++
t/init.t | 7 ++++++-
8 files changed, 80 insertions(+), 19 deletions(-)
* [PATCH 2/2] index: support --compact / -c on command-line
2020-03-28 0:56 7% [PATCH 0/2] index: support --compact / -c Eric Wong
@ 2020-03-28 0:56 6% ` Eric Wong
0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2020-03-28 0:56 UTC (permalink / raw)
To: meta
It's more convenient to specify `-c' / `--compact' on the
command-line when reindexing than it is to invoke
public-inbox-compact(1) separately.
This is especially convenient in low-space situations when
public-inbox-index is operating on multiple inboxes
sequentially, as compaction can happen immediately after
indexing each inbox, instead of waiting until all inboxes are
indexed.
---
Documentation/public-inbox-index.pod | 24 ++++++++++++++++++++----
lib/PublicInbox/InboxWritable.pm | 1 +
lib/PublicInbox/Xapcmd.pm | 4 +++-
script/public-inbox-index | 20 +++++++++++++++++---
t/convert-compact.t | 13 +++++++++++++
5 files changed, 54 insertions(+), 8 deletions(-)
diff --git a/Documentation/public-inbox-index.pod b/Documentation/public-inbox-index.pod
index 14113ec8..dede5d2e 100644
--- a/Documentation/public-inbox-index.pod
+++ b/Documentation/public-inbox-index.pod
@@ -4,7 +4,7 @@ public-inbox-index - create and update search indices
=head1 SYNOPSIS
-public-inbox-index [OPTIONS] INBOX_DIR
+public-inbox-index [OPTIONS] INBOX_DIR...
=head1 DESCRIPTION
@@ -32,16 +32,32 @@ normal search functionality.
=over
+=item --compact / -c
+
+Compacts the Xapian DBs after indexing. This is recommended
+when using C<--reindex> to avoid running out of disk space
+while indexing multiple inboxes.
+
+While this option takes a negligible amount of time compared to
+C<--reindex>, it requires temporarily duplicating the entire
+contents of the Xapian DB.
+
+This switch may be specified twice, in which case compaction
+happens both before and after indexing to minimize the temporal
+footprint of the (re)indexing operation.
+
=item --reindex
Forces a re-index of all messages in the inbox.
This can be used for in-place upgrades and bugfixes while
NNTP/HTTP server processes are utilizing the index. Keep in
mind this roughly doubles the size of the already-large
-Xapian database. Running L<public-inbox-compact(1)>
-afterwards is recommended to release free space.
+Xapian database. Using this with C<--compact> or running
+L<public-inbox-compact(1)> afterwards is recommended to
+release free space.
-This does not touch the NNTP article number database.
+This does not touch the NNTP article number database or
+affect threading.
=item --prune
diff --git a/lib/PublicInbox/InboxWritable.pm b/lib/PublicInbox/InboxWritable.pm
index e684f546..ce979ea2 100644
--- a/lib/PublicInbox/InboxWritable.pm
+++ b/lib/PublicInbox/InboxWritable.pm
@@ -19,6 +19,7 @@ use constant {
sub new {
my ($class, $ibx, $creat_opt) = @_;
+ return $ibx if ref($ibx) eq $class;
my $self = bless $ibx, $class;
# TODO: maybe stop supporting this
diff --git a/lib/PublicInbox/Xapcmd.pm b/lib/PublicInbox/Xapcmd.pm
index 7414c9b6..8e2b9063 100644
--- a/lib/PublicInbox/Xapcmd.pm
+++ b/lib/PublicInbox/Xapcmd.pm
@@ -217,13 +217,15 @@ sub prepare_run {
($tmp, \@queue);
}
+sub check_compact () { runnable_or_die($XAPIAN_COMPACT) }
+
sub run {
my ($ibx, $task, $opt) = @_; # task = 'cpdb' or 'compact'
my $cb = \&${\"PublicInbox::Xapcmd::$task"};
PublicInbox::Admin::progress_prepare($opt ||= {});
defined(my $dir = $ibx->{inboxdir}) or die "no inboxdir defined\n";
-d $dir or die "inboxdir=$dir does not exist\n";
- runnable_or_die($XAPIAN_COMPACT) if $opt->{compact};
+ check_compact() if $opt->{compact};
my $reindex; # v1:{ from => $x40 }, v2:{ from => [ $x40, $x40, .. ] } }
if (!$opt->{-coarse_lock}) {
diff --git a/script/public-inbox-index b/script/public-inbox-index
index c6910420..7def9964 100755
--- a/script/public-inbox-index
+++ b/script/public-inbox-index
@@ -11,12 +11,19 @@ use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
my $usage = "public-inbox-index INBOX_DIR";
use PublicInbox::Admin;
PublicInbox::Admin::require_or_die('-index');
+use PublicInbox::Xapcmd;
-my $opt = { quiet => -1 };
-GetOptions($opt, qw(verbose|v+ reindex jobs|j=i prune indexlevel|L=s))
+my $compact_opt;
+my $opt = { quiet => -1, compact => 0 };
+GetOptions($opt, qw(verbose|v+ reindex compact|c+ jobs|j=i prune indexlevel|L=s))
or die "bad command-line args\n$usage";
die "--jobs must be positive\n" if defined $opt->{jobs} && $opt->{jobs} <= 0;
+if ($opt->{compact}) {
+ require PublicInbox::Xapcmd;
+ PublicInbox::Xapcmd::check_compact();
+ $compact_opt = { -coarse_lock => 1, compact => 1 };
+}
my @ibxs = PublicInbox::Admin::resolve_inboxes(\@ARGV);
PublicInbox::Admin::require_or_die('-index');
@@ -31,4 +38,11 @@ foreach my $ibx (@ibxs) {
PublicInbox::Admin::require_or_die(keys %$mods);
PublicInbox::Admin::progress_prepare($opt);
-PublicInbox::Admin::index_inbox($_, undef, $opt) for @ibxs;
+for my $ibx (@ibxs) {
+ $ibx = PublicInbox::InboxWritable->new($ibx);
+ if ($opt->{compact} >= 2) {
+ PublicInbox::Xapcmd::run($ibx, 'compact', $compact_opt);
+ }
+ PublicInbox::Admin::index_inbox($ibx, undef, $opt);
+ PublicInbox::Xapcmd::run($ibx, 'compact', $compact_opt) if $compact_opt;
+}
diff --git a/t/convert-compact.t b/t/convert-compact.t
index 1671caad..70609c7d 100644
--- a/t/convert-compact.t
+++ b/t/convert-compact.t
@@ -115,4 +115,17 @@ my $msgs = $ibx->recent({limit => 1000});
is($msgs->[0]->{mid}, 'a-mid@b', 'message exists in history');
is(scalar @$msgs, 1, 'only one message in history');
+$ibx = undef;
+$err = '';
+$cmd = [ qw(-index --reindex -c), "$tmpdir/v2" ];
+ok(run_script($cmd, undef, $rdr), '--reindex -c');
+like($err, qr/xapian-compact/, 'xapian-compact ran (-c)');
+
+$rdr->{2} = \(my $err2 = '');
+$cmd = [ qw(-index --reindex -cc), "$tmpdir/v2" ];
+ok(run_script($cmd, undef, $rdr), '--reindex -c -c');
+like($err2, qr/xapian-compact/, 'xapian-compact ran (-c -c)');
+ok(scalar(split(/\n/, $err2)) > scalar(split(/\n/, $err)),
+ '-compacted twice');
+
done_testing();
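
As a rough sketch of how the repeatable `compact|c+' switch drives the per-inbox order of operations, the following stands alone and does not call any public-inbox code. plan_steps() and the /path/a, /path/b arguments are hypothetical names used only for this illustration; the real script calls PublicInbox::Xapcmd::run and PublicInbox::Admin::index_inbox instead of recording strings.

```perl
use strict;
use warnings;
# Symbols to import must come before ":config" in the import list
use Getopt::Long qw(GetOptionsFromArray :config gnu_getopt no_ignore_case);

# Hypothetical helper: returns the ordered list of steps for each inbox
sub plan_steps {
	my (@argv) = @_;
	my $opt = { compact => 0 };
	# "c+" makes the option incremental: -c => 1, -cc or -c -c => 2
	GetOptionsFromArray(\@argv, $opt, qw(reindex compact|c+))
		or die "bad command-line args\n";
	my @steps;
	for my $dir (@argv) { # remaining args are treated as inbox dirs
		push @steps, "compact $dir" if $opt->{compact} >= 2;
		push @steps, "index $dir";
		push @steps, "compact $dir" if $opt->{compact};
	}
	@steps;
}

print "$_\n" for plan_steps(qw(--reindex -cc /path/a /path/b));
```

With `-cc', each inbox is compacted, indexed, and compacted again before the next inbox is touched, which is what keeps peak disk usage down when several inboxes are processed in one invocation.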
2020-03-28 0:56 7% [PATCH 0/2] index: support --compact / -c Eric Wong
2020-03-28 0:56 6% ` [PATCH 2/2] index: support --compact / -c on command-line Eric Wong
2020-05-01 18:04 [PATCH 0/2] tests: Perl 5.10.1 fixes Eric Wong
2020-05-01 18:04 6% ` [PATCH 2/2] t/convert-compact: avoid warning on `scalar(split(...))' Eric Wong
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git