* [PATCH 1/2] searchidxshard: ensure we set indexlevel on shard[0]
From: Eric Wong @ 2020-03-28 0:56 UTC (permalink / raw)
To: meta
For sharded v2 repositories with few enough messages, it is
possible for shard[0] to go unused and never trigger the
->commit_txn_lazy call that sets the indexlevel field in
Xapian metadata.
So set it immediately at initialization to avoid this case.
While we're at it, avoid triggering needless pwrite syscalls
from ->set_metadata by checking with ->get_metadata first.
---
lib/PublicInbox/SearchIdx.pm | 26 +++++++++++++++++---------
lib/PublicInbox/SearchIdxShard.pm | 4 +++-
t/init.t | 7 ++++++-
3 files changed, 26 insertions(+), 11 deletions(-)
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 44b05813..7d089e7a 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -58,6 +58,7 @@ sub new {
ibx_ver => $version,
indexlevel => $indexlevel,
}, $class;
+ $self->{-set_indexlevel_once} = 1 if $indexlevel eq 'medium';
$ibx->umask_prepare;
if ($version == 1) {
$self->{lock_path} = "$inboxdir/ssoma.lock";
@@ -842,20 +843,27 @@ sub begin_txn_lazy {
});
}
+# store 'indexlevel=medium' in v2 shard=0 and v1 (only one shard)
+# This metadata is read by Admin::detect_indexlevel:
+sub set_indexlevel {
+ my ($self) = @_;
+
+ if (!$self->{shard} && # undef or 0, not >0
+ delete($self->{-set_indexlevel_once})) {
+ my $xdb = $self->{xdb};
+ my $level = $xdb->get_metadata('indexlevel');
+ if (!$level || $level ne 'medium') {
+ $xdb->set_metadata('indexlevel', 'medium');
+ }
+ }
+}
+
sub commit_txn_lazy {
my ($self) = @_;
delete $self->{txn} or return;
$self->{-inbox}->with_umask(sub {
if (my $xdb = $self->{xdb}) {
-
- # store 'indexlevel=medium' in v2 shard=0 and
- # v1 (only one shard)
- # This metadata is read by Admin::detect_indexlevel:
- if (!$self->{shard} # undef or 0, not >0
- && $self->{indexlevel} eq 'medium') {
- $xdb->set_metadata('indexlevel', 'medium');
- }
-
+ set_indexlevel($self);
$xdb->commit_transaction;
}
$self->{over}->commit_lazy if $self->{over};
diff --git a/lib/PublicInbox/SearchIdxShard.pm b/lib/PublicInbox/SearchIdxShard.pm
index 2b48b1b4..1ea01095 100644
--- a/lib/PublicInbox/SearchIdxShard.pm
+++ b/lib/PublicInbox/SearchIdxShard.pm
@@ -11,9 +11,11 @@ use IO::Handle (); # autoflush
sub new {
my ($class, $v2writable, $shard) = @_;
- my $self = $class->SUPER::new($v2writable->{-inbox}, 1, $shard);
+ my $ibx = $v2writable->{-inbox};
+ my $self = $class->SUPER::new($ibx, 1, $shard);
# create the DB before forking:
$self->_xdb_acquire;
+ $self->set_indexlevel;
$self->_xdb_release;
$self->spawn_worker($v2writable, $shard) if $v2writable->{parallel};
$self;
diff --git a/t/init.t b/t/init.t
index e20ff006..a78c2fc8 100644
--- a/t/init.t
+++ b/t/init.t
@@ -5,6 +5,7 @@ use warnings;
use Test::More;
use PublicInbox::Config;
use PublicInbox::TestCommon;
+use PublicInbox::Admin;
use File::Basename;
my ($tmpdir, $for_destroy) = tmpdir();
sub quiet_fail {
@@ -72,11 +73,15 @@ SKIP: {
quiet_fail($cmd, 'initializing V2 as V1 fails');
foreach my $lvl (qw(medium basic)) {
+ my $dir = "$tmpdir/v2$lvl";
$cmd = [ '-init', "v2$lvl", '-V2', '-L', $lvl,
- "$tmpdir/v2$lvl", "http://example.com/v2$lvl",
+ $dir, "http://example.com/v2$lvl",
"v2$lvl\@example.com" ];
ok(run_script($cmd), "-init -L $lvl");
is(read_indexlevel("v2$lvl"), $lvl, "indexlevel set to '$lvl'");
+ my $ibx = PublicInbox::Inbox->new({ inboxdir => $dir });
+ is(PublicInbox::Admin::detect_indexlevel($ibx), $lvl,
+ 'detected expected level w/o config');
}
# loop for idempotency
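The guard added by this patch (write the metadata at most once per
object, and only when the stored value differs) can be tried without
Xapian. The following is a minimal sketch, not the project's code:
MockXdb is a hypothetical stand-in that only counts writes, and it
assumes an unset key reads back as an empty string.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a writable Xapian DB; it only counts writes.
package MockXdb;
sub new { bless { meta => {}, writes => 0 }, shift }
sub get_metadata { $_[0]->{meta}->{$_[1]} // '' }
sub set_metadata {
	my ($self, $key, $val) = @_;
	$self->{meta}->{$key} = $val;
	$self->{writes}++;
}

package main;

# Same shape as set_indexlevel in the patch above
sub set_indexlevel {
	my ($self) = @_;
	return if $self->{shard}; # only shard 0, or v1 (shard is undef)
	delete($self->{-set_indexlevel_once}) or return;
	my $xdb = $self->{xdb};
	my $level = $xdb->get_metadata('indexlevel');
	$xdb->set_metadata('indexlevel', 'medium') if $level ne 'medium';
}

my $xdb = MockXdb->new;
for (1..3) { # three "runs" against the same DB
	my $idx = { shard => 0, xdb => $xdb, -set_indexlevel_once => 1 };
	set_indexlevel($idx);
	set_indexlevel($idx); # flag already consumed: no-op
}
print "writes: $xdb->{writes}\n"; # prints "writes: 1"
```

Only the first run writes; later runs read the stored value and skip
the write, which is the point of checking ->get_metadata first.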
* [PATCH 0/2] index: support --compact / -c
From: Eric Wong @ 2020-03-28 0:56 UTC (permalink / raw)
To: meta
It looks like HDDs and SSDs have gotten more expensive, and will
get even more so, due to manufacturing freezes from the pandemic.
Indexing (especially with --reindex to fix up old bugs) takes a
large amount of space, so support running compact immediately
after indexing to avoid users having to script a -compact
invocation for each inbox. Compacting before indexing can be
triggered by using this switch twice, to further reduce space
overhead at a small cost in time.
Note: I only found the bug fixed in 1/2 while testing 2/2. It
took me a while to fix this bug because I've probably lost 10
IQ points from the stress of recent weeks :<
Eric Wong (2):
searchidxshard: ensure we set indexlevel on shard[0]
index: support --compact / -c on command-line
Documentation/public-inbox-index.pod | 24 ++++++++++++++++++++----
lib/PublicInbox/InboxWritable.pm | 1 +
lib/PublicInbox/SearchIdx.pm | 26 +++++++++++++++++---------
lib/PublicInbox/SearchIdxShard.pm | 4 +++-
lib/PublicInbox/Xapcmd.pm | 4 +++-
script/public-inbox-index | 20 +++++++++++++++++---
t/convert-compact.t | 13 +++++++++++++
t/init.t | 7 ++++++-
8 files changed, 80 insertions(+), 19 deletions(-)
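Patch 2/2 is not shown in this thread, so the following is only a
sketch of how a switch that may be given once or twice could be
counted. It assumes Getopt::Long's incrementing "+" option type with
gnu_getopt bundling. parse_compact and the plan strings are
hypothetical names for illustration, and the assumption that a second
-c adds a compact before indexing while keeping the one after comes
from the description above, not from the actual code.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config gnu_getopt);

# Hypothetical helper: count how many times --compact / -c was given.
sub parse_compact {
	my (@argv) = @_;
	my $compact = 0;
	GetOptionsFromArray(\@argv, 'compact|c+' => \$compact)
		or die "bad command-line args\n";
	return $compact;
}

for my $args ([], ['-c'], ['-cc'], ['--compact', '-c']) {
	my $n = parse_compact(@$args);
	my $plan = $n >= 2 ? 'compact, index, compact'
		: $n == 1 ? 'index, compact'
		: 'index only';
	print "[@$args] => $plan\n";
}
```

Counting occurrences keeps a single option name for both behaviors,
so "-c" and "-cc" need no separate flag.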
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git