user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 7/9] daemon: unconditionally close Xapian shards on cleanup
Date: Tue, 12 Oct 2021 11:47:03 +0000
Message-ID: <20211012114705.383-8-e@80x24.org>
In-Reply-To: <20211012114705.383-1-e@80x24.org>

The cost of opening a Xapian DB (even with shards) isn't high,
so save some FDs and just close it on cleanup.  We hit Xapian far
less often than over.sqlite3, and we discard the MSet ASAP even
when streaming large responses.

This simplifies our code a bit and hopefully helps reduce
fragmentation by increasing mortality of late allocations.
---
 lib/PublicInbox/Inbox.pm  | 18 +++++++++---------
 lib/PublicInbox/Search.pm | 14 --------------
 t/extsearch.t             |  8 +-------
 3 files changed, 10 insertions(+), 30 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 724df50a..61d153bf 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -11,8 +11,6 @@ use List::Util qw(max);
 use Carp qw(croak);
 
 # returns true if further checking is required
-sub cleanup_shards { $_[0]->{search} ? $_[0]->{search}->cleanup_shards : undef }
-
 sub check_inodes ($) {
 	for (qw(over mm)) { $_[0]->{$_}->check_inodes if $_[0]->{$_} }
 }
@@ -31,7 +29,8 @@ sub do_cleanup {
 		delete(@$ibx{qw(over mm description cloneurl
 				-imap_url -nntp_url)});
 	}
-	$ibx->cleanup_shards and $live = 1;
+	my $srch = $ibx->{search} // $ibx;
+	delete @$srch{qw(xdb qp)};
 	for my $git (@{$ibx->{-repo_objs} // []}) {
 		$live = 1 if $git->cleanup(1);
 	}
@@ -138,17 +137,18 @@ sub max_git_epoch {
 	$cur;
 }
 
+sub mm_file {
+	my ($self) = @_;
+	my $d = $self->{inboxdir};
+	($self->version >= 2 ? $d : "$d/public-inbox").'/msgmap.sqlite3';
+}
+
 sub mm {
 	my ($self, $req) = @_;
 	$self->{mm} //= eval {
 		require PublicInbox::Msgmap;
 		_cleanup_later($self);
-		my $dir = $self->{inboxdir};
-		if ($self->version >= 2) {
-			PublicInbox::Msgmap->new_file("$dir/msgmap.sqlite3");
-		} else {
-			PublicInbox::Msgmap->new($dir);
-		}
+		PublicInbox::Msgmap->new_file(mm_file($self));
 	} // ($req ? croak("E: $@") : undef);
 }
 
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index dd6d3710..f0e7ed0c 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -243,20 +243,6 @@ sub xdb ($) {
 	};
 }
 
-# returns true if a future rescan is desired
-sub cleanup_shards {
-	my ($self) = @_;
-	return unless exists($self->{xdb});
-	my $xpfx = $self->{xpfx};
-	return reopen($self) if $xpfx =~ m!/xapian[0-9]+\z!; # true
-	opendir(my $dh, $xpfx) or return warn("$xpfx gone: $!\n"); # true
-	my $nr = grep(/\A[0-9]+\z/, readdir($dh)) or
-		return warn("$xpfx has no shards\n"); # true
-	return reopen($self) if $nr == ($self->{nshard} // -1);
-	delete @$self{qw(xdb qp)};
-	undef;
-}
-
 sub new {
 	my ($class, $ibx) = @_;
 	ref $ibx or die "BUG: expected PublicInbox::Inbox object: $ibx";
diff --git a/t/extsearch.t b/t/extsearch.t
index 8190de17..dfc190e2 100644
--- a/t/extsearch.t
+++ b/t/extsearch.t
@@ -447,9 +447,6 @@ SKIP: {
 	ok(ref($es->{xdb}), '{xdb} created');
 	my $nshards1 = $es->{nshard};
 	is($nshards1, 1, 'correct shard count');
-	my $xdb_str = "$es->{xdb}";
-	ok($es->cleanup_shards, 'cleanup_shards noop');
-	is("$es->{xdb}", $xdb_str, '{xdb} unchanged');
 
 	my @ei_dir = glob("$d/ei*/");
 	chmod 0755, $ei_dir[0] or xbail "chmod: $!";
@@ -463,11 +460,8 @@ SKIP: {
 		my $m = sprintf('%04o', 07777 & (stat($dirs[$i]))[2]);
 		is($m, $mode, "shard [$i] mode");
 	}
-	is($es->cleanup_shards, undef, 'cleanup_shards cleaned');
-	ok(!defined($es->{xdb}), 'old {xdb} gone');
-	is($es->cleanup_shards, undef, 'cleanup_shards clean idempotent');
+	delete @$es{qw(xdb qp)};
 	is($es->mset('z:0..')->size, $nresult0, 'new shards, same results');
-	ok($es->cleanup_shards, 'cleanup_shards true after open');
 
 	for my $i (4..5) {
 		is(grep(m!/ei[0-9]+/$i\z!, @dirs), 0, "no shard [$i]");

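The core of the change is replacing the conditional cleanup_shards logic
with an unconditional hash-slice delete of the cached {xdb} and {qp}
handles.  A minimal, self-contained sketch of that idiom follows; the
hashes here are hypothetical stand-ins for the Inbox/Search objects, not
the real classes:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the object do_cleanup() operates on; only
# the cached-handle slots touched by this patch are modeled.
my $ibx = { search => { xdb => 'open-shard-handles', qp => 'query-parser' } };

# As in the patch: prefer the ->{search} object, falling back to the
# object itself (extsearch-style objects cache {xdb}/{qp} directly).
my $srch = $ibx->{search} // $ibx;

# Unconditional hash-slice delete: both cached handles are dropped, so
# a subsequent xdb() call would lazily reopen the Xapian DB.
delete @$srch{qw(xdb qp)};

print exists $srch->{xdb} ? "xdb cached\n" : "xdb dropped\n";
print exists $srch->{qp}  ? "qp cached\n"  : "qp dropped\n";
```

Since reopening Xapian is cheap relative to how rarely it is hit, always
dropping the handles trades a small reopen cost for fewer long-lived FDs
and simpler cleanup code.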

Thread overview: 10+ messages
2021-10-12 11:46 [PATCH 0/9] various read-only daemon and WWW things Eric Wong
2021-10-12 11:46 ` [PATCH 1/9] daemon: use v5.10.1, disable local warnings Eric Wong
2021-10-12 11:46 ` [PATCH 2/9] daemon: quiet down Eml-related warnings Eric Wong
2021-10-12 11:46 ` [PATCH 3/9] search: delete QueryParser along with DB handle Eric Wong
2021-10-12 11:47 ` [PATCH 4/9] nntp: use defined-OR from Perl 5.10 for msgid check Eric Wong
2021-10-12 11:47 ` [PATCH 5/9] msgmap: use DBI->prepare_cached Eric Wong
2021-10-12 11:47 ` [PATCH 6/9] msgmap: share most of check_inodes w/ over Eric Wong
2021-10-12 11:47 ` Eric Wong [this message]
2021-10-12 11:47 ` [PATCH 8/9] msgmap: ->new_file to supports $ibx arg, drop ->new Eric Wong
2021-10-12 11:47 ` [PATCH 9/9] www: _/text/config/raw Last-Modified: is mm->created_at Eric Wong


Code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git
