* [PATCH 0/3] non-blocking xap_helper for public daemons
@ 2025-02-11 3:55 Eric Wong
2025-02-11 3:55 ` [PATCH 1/3] search: async_mset: always run callback on exceptions Eric Wong
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Eric Wong @ 2025-02-11 3:55 UTC (permalink / raw)
To: meta
3/3 ensures overloaded search won't completely block non-search
HTTP requests from being processed. 1/3 mainly affects generic
PSGI servers and 2/3 should make post-overload situations
slightly less expensive after 3/3 is applied.
Eric Wong (3):
search: async_mset: always run callback on exceptions
www: mbox: break out of loop on exceptions
daemon: make xap_helper socket non-blocking
MANIFEST | 1 +
lib/PublicInbox/Daemon.pm | 9 +++--
lib/PublicInbox/Mbox.pm | 1 +
lib/PublicInbox/Search.pm | 4 +--
lib/PublicInbox/SearchView.pm | 18 ++++++++--
lib/PublicInbox/TestCommon.pm | 21 +++++++++--
t/psgi_v2.t | 23 +++----------
xt/slow-xh-search.t | 65 +++++++++++++++++++++++++++++++++++
8 files changed, 114 insertions(+), 28 deletions(-)
create mode 100644 xt/slow-xh-search.t
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/3] search: async_mset: always run callback on exceptions
2025-02-11 3:55 [PATCH 0/3] non-blocking xap_helper for public daemons Eric Wong
@ 2025-02-11 3:55 ` Eric Wong
2025-02-11 3:55 ` [PATCH 2/3] www: mbox: break out of loop " Eric Wong
2025-02-11 3:55 ` [PATCH 3/3] daemon: make xap_helper socket non-blocking Eric Wong
2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2025-02-11 3:55 UTC (permalink / raw)
To: meta
To produce consistent error behavior, ensure we always run
the user-supplied callback on exceptions when using in-process
Xapian (as opposed to the external xap_helper process).
---
lib/PublicInbox/Search.pm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 784e3b0a..cb166101 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -493,8 +493,8 @@ sub async_mset {
$cb->(@args, undef, $@) if $@;
$ret;
} else { # synchronous
- my $mset = $self->mset($qry_str, $opt);
- $cb->(@args, $mset);
+ my $mset = eval { $self->mset($qry_str, $opt) };
+ $@ ? $cb->(@args, undef, $@) : $cb->(@args, $mset);
undef;
}
}
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/3] www: mbox: break out of loop on exceptions
2025-02-11 3:55 [PATCH 0/3] non-blocking xap_helper for public daemons Eric Wong
2025-02-11 3:55 ` [PATCH 1/3] search: async_mset: always run callback on exceptions Eric Wong
@ 2025-02-11 3:55 ` Eric Wong
2025-02-11 3:55 ` [PATCH 3/3] daemon: make xap_helper socket non-blocking Eric Wong
2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2025-02-11 3:55 UTC (permalink / raw)
To: meta
In case async_mset fails due to resource limitations, ensure we
stop trying to make expensive search queries ASAP.
---
lib/PublicInbox/Mbox.pm | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 17893a09..a5974a39 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -181,6 +181,7 @@ my $refill_ids_cb = sub { # async_mset cb
$http = undef unless $ctx->{-really_async};
if ($err) {
warn "E: $err";
+ $ctx->{-mbox_done} = 1;
$ctx->close if $http; # our async httpd
return;
}
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 3/3] daemon: make xap_helper socket non-blocking
2025-02-11 3:55 [PATCH 0/3] non-blocking xap_helper for public daemons Eric Wong
2025-02-11 3:55 ` [PATCH 1/3] search: async_mset: always run callback on exceptions Eric Wong
2025-02-11 3:55 ` [PATCH 2/3] www: mbox: break out of loop " Eric Wong
@ 2025-02-11 3:55 ` Eric Wong
2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2025-02-11 3:55 UTC (permalink / raw)
To: meta
A non-blocking socket for public-facing searches is ideal on
overloaded servers which receive more searches than it can
handle in a short time span. A non-blocking socket here
prevents a sendmsg(2) call from blocking the main event loop
and ensures non-search requests can still be processed even
if all xap_helper subprocesses are overwhelmed.
$search_xh_pid is hoisted out to TestCommon for use with the new
slow-xh-search test and renamed $find_xh_pid to avoid confusion
with Xapian search functionality. We'll also drop a redundant
call to find the xap_helper PID in psgi_v2.t while transitioning
to $find_xh_pid.
---
MANIFEST | 1 +
lib/PublicInbox/Daemon.pm | 9 +++--
lib/PublicInbox/SearchView.pm | 18 ++++++++--
lib/PublicInbox/TestCommon.pm | 21 +++++++++--
t/psgi_v2.t | 23 +++----------
xt/slow-xh-search.t | 65 +++++++++++++++++++++++++++++++++++
6 files changed, 111 insertions(+), 26 deletions(-)
create mode 100644 xt/slow-xh-search.t
diff --git a/MANIFEST b/MANIFEST
index 8bddb751..2ca7755c 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -678,5 +678,6 @@ xt/perf-msgview.t
xt/perf-nntpd.t
xt/perf-threading.t
xt/pop3d-mpop.t
+xt/slow-xh-search.t
xt/solver.t
xt/stress-sharedkv.t
diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index 84147b68..e00e0a78 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -400,9 +400,12 @@ sub spawn_xh () {
local $ENV{STDOUT_PATH} = $stdout;
PublicInbox::XapClient::start_helper('-j', $xh_workers)
};
- warn "E: $@" if $@;
- awaitpid($PublicInbox::Search::XHC->{io}->attached_pid, \&respawn_xh)
- if $PublicInbox::Search::XHC;
+ if ($@) {
+ warn "E: $@";
+ } elsif (my $xhc = $PublicInbox::Search::XHC) {
+ $xhc->{io}->blocking(0);
+ awaitpid($xhc->{io}->attached_pid, \&respawn_xh);
+ }
}
sub reopen_logs {
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index 1213b3d3..8ae422f9 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -16,6 +16,11 @@ use PublicInbox::SearchThread;
use PublicInbox::SearchQuery;
use PublicInbox::Search qw(get_pct);
my %rmap_inc;
+use Errno ();
+
+# possible busy errors from async_mset (via pipe, sendmsg, epoll_ctl)
+my %BUSY = map { (Errno->$_, 1) } qw(EAGAIN ENOBUFS ENOMEM ETOOMANYREFS
+ EMFILE ENFILE ENOSPC);
sub mbox_results {
my ($ctx) = @_;
@@ -60,12 +65,21 @@ sub sres_top_html {
sub sres_html_cb { # async_mset cb
my ($ctx, $opt, $q, $mset, $err) = @_;
my $code = 200;
+ if ($err) {
+ if ($BUSY{$! + 0}) {
+ warn "W: query failed: $!";
+ $code = 503;
+ } else { # bad query by user
+ $code = 400;
+ }
+ }
my $total = $mset ? $mset->get_matches_estimated : undef;
ctx_prepare($q, $ctx);
my ($res, $html);
- if ($err) {
- $code = 400;
+ if ($code == 400) {
$html = '<pre>'.err_txt($ctx, $err).'</pre><hr>';
+ } elsif ($code == 503) {
+ $html = "<pre>\ntoo busy</pre><hr>";
} elsif ($total == 0) {
if (defined($ctx->{-uxs_retried})) { # undo retry damage:
$q->{'q'} = $ctx->{-uxs_retried};
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index cebd0936..cbcfc008 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -15,7 +15,7 @@ use Carp ();
our @EXPORT;
my $lei_loud = $ENV{TEST_LEI_ERR_LOUD};
our $tail_cmd = $ENV{TAIL};
-our ($lei_opt, $lei_out, $lei_err);
+our ($lei_opt, $lei_out, $lei_err, $find_xh_pid);
use autodie qw(chdir close fcntl mkdir open opendir seek unlink);
$ENV{XDG_CACHE_HOME} //= "$ENV{HOME}/.cache"; # reuse C++ xap_helper builds
$ENV{GIT_TEST_FSYNC} = 0; # hopefully reduce wear
@@ -32,7 +32,7 @@ BEGIN {
tcp_host_port test_lei lei lei_ok $lei_out $lei_err $lei_opt
test_httpd no_httpd_errors xbail require_cmd is_xdeeply tail_f
ignore_inline_c_missing no_pollerfd no_coredump cfg_new
- strace strace_inject lsof_pid oct_is);
+ strace strace_inject lsof_pid oct_is $find_xh_pid);
require Test::More;
my @methods = grep(!/\W/, @Test::More::EXPORT);
eval(join('', map { "*$_=\\&Test::More::$_;" } @methods));
@@ -1086,6 +1086,23 @@ sub oct_is ($$$) {
goto &is; # tail recursion to get lineno from callers on failure
}
+$find_xh_pid = $^O eq 'linux' && -r "/proc/$$/stat" ? sub {
+ my ($ppid) = @_;
+ my ($cmdline, $fh, @s);
+ for (glob('/proc/*/stat')) {
+ CORE::open $fh, '<', $_ or next;
+ @s = split /\s+/, readline($fh) // next;
+ next if $s[3] ne $ppid; # look for matching PPID
+ CORE::open $fh, '<', "/proc/$s[0]/cmdline" or next;
+ $cmdline = readline($fh) // next;
+ if ($cmdline =~ /\0-MPublicInbox::XapHelper\0-e\0/ ||
+ $cmdline =~ m!/xap_helper\0!) {
+ return $s[0];
+ }
+ }
+ undef;
+} : 'xap_helper PID lookup currently depends on Linux /proc';
+
package PublicInbox::TestCommon::InboxWakeup;
use strict;
sub on_inbox_unlock { ${$_[0]}->($_[1]) }
diff --git a/t/psgi_v2.t b/t/psgi_v2.t
index 2027037d..dfec5abc 100644
--- a/t/psgi_v2.t
+++ b/t/psgi_v2.t
@@ -393,23 +393,9 @@ my $client3 = sub {
test_psgi(sub { $www->call(@_) }, $client3);
test_httpd($env, $client3, 4);
-if ($^O eq 'linux' && -r "/proc/$$/stat") {
+SKIP: {
+ ref($find_xh_pid) or skip $find_xh_pid, 1;
my $args;
- my $search_xh_pid = sub {
- my ($pid) = @_;
- for my $f (glob('/proc/*/stat')) {
- open my $fh, '<', $f or next;
- my @s = split /\s+/, readline($fh) // next;
- next if $s[3] ne $pid; # look for matching PPID
- open $fh, '<', "/proc/$s[0]/cmdline" or next;
- my $cmdline = readline($fh) // next;
- if ($cmdline =~ /\0-MPublicInbox::XapHelper\0-e\0/ ||
- $cmdline =~ m!/xap_helper\0!) {
- return $s[0];
- }
- }
- undef;
- };
my $usr1_test = sub {
my ($cb) = @_;
my $td = $PublicInbox::TestCommon::CURRENT_DAEMON;
@@ -417,8 +403,7 @@ if ($^O eq 'linux' && -r "/proc/$$/stat") {
my $res = $cb->(GET('/v2test/?q=m:a-mid@b'));
is $res->code, 200, '-httpd is running w/ search';
- $search_xh_pid->($pid);
- my $xh_pid = $search_xh_pid->($pid) or
+ my $xh_pid = $find_xh_pid->($pid) or
BAIL_OUT "can't find XH pid with $args";
my $xh_err = readlink "/proc/$xh_pid/fd/2";
is $xh_err, "$env->{TMPDIR}/stderr.log",
@@ -432,7 +417,7 @@ if ($^O eq 'linux' && -r "/proc/$$/stat") {
tick;
$res = $cb->(GET('/v2test/?q=m:a-mid@b'));
is $res->code, 200, '-httpd still running w/ search';
- my $new_xh_pid = $search_xh_pid->($pid) or
+ my $new_xh_pid = $find_xh_pid->($pid) or
BAIL_OUT "can't find new XH pid with $args";
is $new_xh_pid, $xh_pid, "XH pid unchanged ($args)";
$xh_err = readlink "/proc/$new_xh_pid/fd/2";
diff --git a/xt/slow-xh-search.t b/xt/slow-xh-search.t
new file mode 100644
index 00000000..f8602864
--- /dev/null
+++ b/xt/slow-xh-search.t
@@ -0,0 +1,65 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use v5.12;
+use autodie;
+use PublicInbox::TestCommon;
+use Socket ();
+ref($find_xh_pid) or plan skip_all => $find_xh_pid;
+require_mods qw(-httpd v2 psgi Xapian);
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $tmpdir = tmpdir;
+my $unix = "$tmpdir/unix.sock";
+my $out = "$tmpdir/out.log";
+my $err = "$tmpdir/err.log";
+my $cmd = [ qw(-httpd -W0 -X0), "--stdout=$out", "--stderr=$err" ];
+my $env = { PI_CONFIG => $cfg_path };
+my $srv = IO::Socket::UNIX->new(Listen => 4096, Local => $unix,
+ Type => Socket::SOCK_STREAM);
+my $rdr = { 3 => $srv };
+$srv->blocking(0);
+my $td = start_script($cmd, $env, $rdr);
+chomp(my $nfd = `/bin/sh -c 'ulimit -n'`);
+diag "RLIMIT_NOFILE=$nfd, consider increasing if test fails";
+fail "RLIMIT_NOFILE=$nfd too small" if $nfd < 11;
+$nfd -= 10;
+
+my $req = "GET /t2/?q=m:testmessage\@example.com HTTP/1.0\r\n\r\n";
+my @opt = (Peer => $unix, Type => Socket::SOCK_STREAM);
+my $start_req = sub {
+ my $c = IO::Socket::UNIX->new(@opt) or xbail "connect: $!";
+ $c->autoflush(1);
+ print $c $req;
+ $c;
+};
+
+# make a search request/response to ensure xap_helper is started
+read $start_req->(), my $buf, 16384;
+
+my $xh_pid = $find_xh_pid->($td->{pid}) or
+ xbail "can't find XH pid";
+kill 'STOP', $xh_pid;
+diag "starting $nfd requests...";
+my @c = map { $start_req->() } (1..$nfd);
+diag "all $nfd requested";
+tick 1;
+kill 'CONT', $xh_pid;
+my %codes = (200 => 0, 503 => 0);
+diag "reading $nfd responses";
+for my $c (@c) {
+ read $c, $buf, 16384;
+ undef $c;
+ if ($buf =~ m!\AHTTP/1\.[01] (\d+)!) {
+ $codes{$1}++;
+ } else {
+ diag explain($buf);
+ fail 'bad response';
+ }
+}
+ok $codes{200}, 'got some 200 responses' or diag explain(\%codes);
+ok $codes{503}, 'got some 503 errors' or diag explain(\%codes);
+is $codes{200} + $codes{503}, $nfd,
+ 'only got 503 and 200 codes' or diag explain(\%codes);
+
+$td->join('TERM');
+done_testing;
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-02-11 3:55 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-11 3:55 [PATCH 0/3] non-blocking xap_helper for public daemons Eric Wong
2025-02-11 3:55 ` [PATCH 1/3] search: async_mset: always run callback on exceptions Eric Wong
2025-02-11 3:55 ` [PATCH 2/3] www: mbox: break out of loop " Eric Wong
2025-02-11 3:55 ` [PATCH 3/3] daemon: make xap_helper socket non-blocking Eric Wong
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).