user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
* [PATCH 0/7] limiter fixes and updates
@ 2025-02-08  3:26 Eric Wong
  2025-02-08  3:26 ` [PATCH 1/7] syscall: `use bytes' throughout Eric Wong
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

The end goal is to make stupid expensive stuff such as blob
reconstruction (and in the future, rediff) possible on unloaded
servers, but also to fail more gracefully and be less
detrimental to for clients hitting less expensive endpoints.

Most notably, the publicinboxlimiter.<name>.depth parameter now
exists for all limiters so it's convenient to return 503 errors
when the limiter queue is too large.  The default value is only
32 (`-codeblob' had hardcoded 128 before in ViewVCS.pm,
everything else was unlimited).

This means excessive queueing for git clone/fetch will
disconnect clients more quickly if servers get hammered.
So with 5/7, the maximum git-http-backend processes in a
default instances is down to 64: 32 parallel processes and
32 queued clients.  Previously the limit was 32 parallel
processes and unlimited (RLIMIT_NOFILE bound) clients which
could cause problems for users of other endpoints.

3/7 to check client connections before doing expensive tasks
is an idea stolen from some other generic app server.  It
will be used more heavily for other expensive endpoints.

xt/git-http-backend-parallel.t in 5/7 is a stupid expensive
maintainer test since it needs to serve many parallel clones and
testing parallelism is tricky.

Eric Wong (7):
  syscall: `use bytes' throughout
  git_http_backend: fix 32 default connection limit
  daemon: check connections before solving codeblobs
  qspawn: use limiter->new default
  viewvcs: -codeblob limiter w/ depth for solver
  qspawn: drop redundant check against limiter->{max}
  config: handle limiter `max' knob in ->setup_limiter

 Documentation/public-inbox-config.pod |  17 +++-
 MANIFEST                              |   3 +
 devel/sysdefs-list                    |  16 +++-
 lib/PublicInbox/Config.pm             |   9 +--
 lib/PublicInbox/Daemon.pm             |  29 +++++++
 lib/PublicInbox/GitHTTPBackend.pm     |   3 +-
 lib/PublicInbox/HTTP.pm               |   7 +-
 lib/PublicInbox/HTTPD.pm              |   2 +
 lib/PublicInbox/Limiter.pm            |  32 ++++++--
 lib/PublicInbox/Qspawn.pm             |  31 ++++---
 lib/PublicInbox/Syscall.pm            |  13 ++-
 lib/PublicInbox/TestCommon.pm         |   3 +-
 lib/PublicInbox/ViewVCS.pm            |  50 +++++++-----
 t/daemon.t                            |  56 +++++++++++++
 t/psgi_log.psgi                       |  16 ++++
 t/solver_git.t                        |  68 +++++++++++++++-
 xt/git-http-backend-parallel.t        | 111 ++++++++++++++++++++++++++
 17 files changed, 410 insertions(+), 56 deletions(-)
 create mode 100644 t/daemon.t
 create mode 100644 t/psgi_log.psgi
 create mode 100644 xt/git-http-backend-parallel.t

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/7] syscall: `use bytes' throughout
  2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
@ 2025-02-08  3:26 ` Eric Wong
  2025-02-08  3:26 ` [PATCH 2/7] git_http_backend: fix 32 default connection limit Eric Wong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

All *nix syscalls operate on the byte-level with no knowledge of
encodings, so ensure all `substr' and `length' calculations use
the bytes(3perl) pragma.
---
 lib/PublicInbox/Syscall.pm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Syscall.pm b/lib/PublicInbox/Syscall.pm
index e5e3340f..d005fecb 100644
--- a/lib/PublicInbox/Syscall.pm
+++ b/lib/PublicInbox/Syscall.pm
@@ -18,6 +18,7 @@
 package PublicInbox::Syscall;
 use v5.12;
 use parent qw(Exporter);
+use bytes qw(length substr);
 use POSIX qw(ENOENT ENOSYS EINVAL O_NONBLOCK);
 use Socket qw(SOL_SOCKET SCM_RIGHTS);
 use Config;
@@ -544,7 +545,6 @@ require PublicInbox::CmdIPC4;
 };
 
 *sendmsg_more = sub ($@) {
-	use bytes qw(length substr);
 	my $sock = shift;
 	my $iov = join('', map { pack 'P'.TMPL_size_t, $_, length } @_);
 	my $mh = pack(TMPL_msghdr,
@@ -563,7 +563,6 @@ require PublicInbox::CmdIPC4;
 if (defined($SYS_writev)) {
 *writev = sub {
 	my $fh = shift;
-	use bytes qw(length substr);
 	my $iov = join('', map { pack 'P'.TMPL_size_t, $_, length } @_);
 	my $w;
 	do {

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/7] git_http_backend: fix 32 default connection limit
  2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
  2025-02-08  3:26 ` [PATCH 1/7] syscall: `use bytes' throughout Eric Wong
@ 2025-02-08  3:26 ` Eric Wong
  2025-02-08  3:26 ` [PATCH 3/7] daemon: check connections before solving codeblobs Eric Wong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

Using any configurable limiter actually defaults to `1', which
isn't intended for `-httpbackend'.  So add a $max_default
parameter to ensure we can set a per-limiter default value for
default limiters such as `-httpbackend'.

Fixes: 0a083241 (git_http_backend: use default limiter from Qspawn, 2025-01-28)
---
 lib/PublicInbox/Config.pm         | 10 +++++-----
 lib/PublicInbox/GitHTTPBackend.pm |  3 ++-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 03db2a51..2bb1a672 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -123,13 +123,13 @@ sub lookup_newsgroup {
 }
 
 sub limiter {
-	my ($self, $name) = @_;
+	my ($self, $name, $max_default) = @_;
 	$self->{-limiters}->{$name} //= do {
 		require PublicInbox::Limiter;
-		my $max = $self->{"publicinboxlimiter.$name.max"} || 1;
-		my $limiter = PublicInbox::Limiter->new($max);
-		$limiter->setup_rlimit($name, $self);
-		$limiter;
+		my $max = $self->{"publicinboxlimiter.$name.max"};
+		my $l = PublicInbox::Limiter->new($max || $max_default || 1);
+		$l->setup_rlimit($name, $self);
+		$l;
 	};
 }
 
diff --git a/lib/PublicInbox/GitHTTPBackend.pm b/lib/PublicInbox/GitHTTPBackend.pm
index 172de2bc..57d2ec7b 100644
--- a/lib/PublicInbox/GitHTTPBackend.pm
+++ b/lib/PublicInbox/GitHTTPBackend.pm
@@ -98,8 +98,9 @@ sub serve_smart ($$$;$) {
 		my $val = $env->{$name};
 		$env{$name} = $val if defined $val;
 	}
+	# 32 is the default connection limit for git-daemon, so match that
 	my $limiter = $git->{-httpbackend_limiter} //
-		($pi_cfg ? $pi_cfg->limiter('-httpbackend') : undef);
+		($pi_cfg ? $pi_cfg->limiter('-httpbackend', 32) : undef);
 	$env{GIT_HTTP_EXPORT_ALL} = '1';
 	$env{PATH_TRANSLATED} = "$git->{git_dir}/$path";
 	my $rdr = input_prepare($env) or return r(500);

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/7] daemon: check connections before solving codeblobs
  2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
  2025-02-08  3:26 ` [PATCH 1/7] syscall: `use bytes' throughout Eric Wong
  2025-02-08  3:26 ` [PATCH 2/7] git_http_backend: fix 32 default connection limit Eric Wong
@ 2025-02-08  3:26 ` Eric Wong
  2025-02-08  3:26 ` [PATCH 4/7] qspawn: use limiter->new default Eric Wong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

The /$MSGID/s/ codeblob reconstruction endpoint is extremely
expensive and one of the few which do a large amount of upfront
processing before being able to write to an HTTP client and
detect if they're still alive (unlike other expensive
endpoints).

For TCP connections, use the TCP_INFO via getsockopt(2) if
supported or attempt to use poll(2) + the FIONREAD ioctl(2) to
check the connection before possibly invoking
git-(apply|ls-files|update-index) dozens or even hundreds of
times to reconstruct a blob.
---
 MANIFEST                      |  2 ++
 devel/sysdefs-list            | 16 +++++++++-
 lib/PublicInbox/Daemon.pm     | 29 ++++++++++++++++++
 lib/PublicInbox/HTTP.pm       |  7 +++--
 lib/PublicInbox/HTTPD.pm      |  2 ++
 lib/PublicInbox/Syscall.pm    | 10 ++++++-
 lib/PublicInbox/TestCommon.pm |  3 +-
 lib/PublicInbox/ViewVCS.pm    | 13 +++++---
 t/daemon.t                    | 56 +++++++++++++++++++++++++++++++++++
 t/psgi_log.psgi               | 16 ++++++++++
 t/solver_git.t                | 49 ++++++++++++++++++++++++++++--
 11 files changed, 192 insertions(+), 11 deletions(-)
 create mode 100644 t/daemon.t
 create mode 100644 t/psgi_log.psgi

diff --git a/MANIFEST b/MANIFEST
index 30adab80..1fc2a152 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -448,6 +448,7 @@ t/config.t
 t/config_limiter.t
 t/content_hash.t
 t/convert-compact.t
+t/daemon.t
 t/data-gen/.gitignore
 t/data/0001.patch
 t/data/attached-mbox-with-utf8.eml
@@ -587,6 +588,7 @@ t/precheck.t
 t/psgi_attach.eml
 t/psgi_attach.t
 t/psgi_bad_mids.t
+t/psgi_log.psgi
 t/psgi_mount.t
 t/psgi_multipart_not.t
 t/psgi_scan_all.t
diff --git a/devel/sysdefs-list b/devel/sysdefs-list
index 1345e0ba..3a2e60d2 100755
--- a/devel/sysdefs-list
+++ b/devel/sysdefs-list
@@ -24,6 +24,14 @@ my $x = "$tmp/sysdefs";
 open my $fh, '>', $f or die "open $f $!";
 print $fh $str or die "print $f $!";
 close $fh or die "close $f $!";
+for (qw(sys/ioctl sys/filio)) {
+	my $cfg_name = $_;
+	my $cpp_name = uc $_;
+	$cfg_name =~ tr!/!!d;
+	$cpp_name =~ tr!/!_!;
+	($Config{"i_$cfg_name"} // '') eq 'define' and
+		push @cflags, "-DHAVE_${cpp_name}_H";
+}
 system($cc, '-o', $x, $f, @cflags) == 0 or die "$cc failed \$?=$?";
 print STDERR '# %Config',
 	(map { " $_=$Config{$_}" } qw(ptrsize sizesize lseeksize)), "\n";
@@ -37,7 +45,12 @@ __DATA__
 #include <stddef.h>
 #include <sys/socket.h>
 #include <sys/syscall.h>
-#include <sys/ioctl.h>
+#ifdef HAVE_SYS_IOCTL_H
+#	include <sys/ioctl.h>
+#endif
+#ifdef HAVE_SYS_FILIO_H
+#	include <sys/filio.h>
+#endif
 #ifdef __linux__
 #	include <linux/fs.h>
 #	include <sys/epoll.h>
@@ -144,6 +157,7 @@ int main(void)
 #endif /* Linux, any other OSes with stable syscalls? */
 
 	D(SIGWINCH);
+	MAYBE X(FIONREAD);
 	MAYBE D(SO_ACCEPTFILTER);
 	MAYBE D(_SC_NPROCESSORS_ONLN);
 	MAYBE D(_SC_AVPHYS_PAGES);
diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index 57f01d2c..84147b68 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -11,6 +11,7 @@ use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
 use IO::Handle; # ->autoflush
 use IO::Socket;
 use File::Spec;
+use IO::Poll qw(POLLERR POLLIN POLLHUP);
 use POSIX qw(WNOHANG :signal_h F_SETFD);
 use Socket qw(IPPROTO_TCP SOL_SOCKET);
 STDOUT->autoflush(1);
@@ -754,4 +755,32 @@ sub write_pid ($) {
 	do_chown($path);
 }
 
+sub stream_hup ($) {
+	my $ev = POLLIN;
+	my $n = IO::Poll::_poll(0, fileno($_[0]) // return, $ev) or return;
+	return 1 if $ev & (POLLHUP|POLLERR);
+
+	# n.b. POLLHUP isn't reliably detected, so check FIONREAD on POLLIN
+	if (defined(PublicInbox::Syscall::FIONREAD) && ($ev & POLLIN)) {
+		ioctl($_[0], PublicInbox::Syscall::FIONREAD, $n = "") //
+			return;
+		return (unpack('i', $n) == 0);
+	}
+	undef;
+}
+
+if (PublicInbox::Syscall->can('TCP_ESTABLISHED')) {
+	eval <<'EOM';
+sub tcp_hup ($) {
+	my $buf = getsockopt($_[0], Socket::IPPROTO_TCP, Socket::TCP_INFO)
+		or return;
+	unpack('C', $buf) != PublicInbox::Syscall::TCP_ESTABLISHED
+}
+EOM
+	warn "E: $@" if $@;
+}
+
+no warnings 'once';
+*tcp_hup = \&stream_hup if !__PACKAGE__->can('tcp_hup');
+
 1;
diff --git a/lib/PublicInbox/HTTP.pm b/lib/PublicInbox/HTTP.pm
index 80ebad16..105156e2 100644
--- a/lib/PublicInbox/HTTP.pm
+++ b/lib/PublicInbox/HTTP.pm
@@ -165,11 +165,14 @@ sub app_dispatch {
 	}
 }
 
+# we use the non-standard 499 code for client disconnects (matching nginx)
+sub status_msg ($) { status_message($_[0]) // 'error' }
+
 sub response_header_write ($$$) {
 	my ($self, $env, $res) = @_;
 	my $proto = $env->{SERVER_PROTOCOL} or return; # HTTP/0.9 :P
 	my $status = $res->[0];
-	my $h = "$proto $status " . status_message($status) . "\r\n";
+	my $h = "$proto $status " . status_msg($status) . "\r\n";
 	my ($len, $chunked);
 	my $headers = $res->[1];
 
@@ -428,7 +431,7 @@ sub read_input_chunked { # unlikely...
 
 sub quit {
 	my ($self, $status) = @_;
-	my $h = "HTTP/1.1 $status " . status_message($status) . "\r\n\r\n";
+	my $h = "HTTP/1.1 $status " . status_msg($status) . "\r\n\r\n";
 	$self->write(\$h);
 	$self->close;
 	undef; # input_prepare expects this
diff --git a/lib/PublicInbox/HTTPD.pm b/lib/PublicInbox/HTTPD.pm
index 6a6347d8..38b2e4f9 100644
--- a/lib/PublicInbox/HTTPD.pm
+++ b/lib/PublicInbox/HTTPD.pm
@@ -45,6 +45,8 @@ sub env_for ($$$) {
 		'pi-httpd.async' => 1,
 		'pi-httpd.app' => $self->{app},
 		'pi-httpd.warn_cb' => $self->{warn_cb},
+		'pi-httpd.ckhup' => $port ? \&PublicInbox::Daemon::tcp_hup :
+				\&PublicInbox::Daemon::stream_hup,
 	}
 }
 
diff --git a/lib/PublicInbox/Syscall.pm b/lib/PublicInbox/Syscall.pm
index d005fecb..15ff2005 100644
--- a/lib/PublicInbox/Syscall.pm
+++ b/lib/PublicInbox/Syscall.pm
@@ -305,6 +305,8 @@ BEGIN {
 	if ($^O eq 'linux') {
 		%CONST = (
 			MSG_MORE => 0x8000,
+			FIONREAD => 0x541b,
+			TCP_ESTABLISHED => 1,
 			TMPL_cmsg_len => TMPL_size_t,
 			# cmsg_len, cmsg_level, cmsg_type
 			SIZEOF_cmsghdr => SIZEOF_int * 2 + SIZEOF_size_t,
@@ -319,6 +321,7 @@ BEGIN {
 	} elsif ($^O =~ /\A(?:freebsd|openbsd|netbsd|dragonfly)\z/) {
 		%CONST = (
 			TMPL_cmsg_len => 'L', # socklen_t
+			FIONREAD => 0x4004667f,
 			SIZEOF_cmsghdr => SIZEOF_int * 3,
 			CMSG_DATA_off => SIZEOF_ptr == 8 ? '@16' : '',
 			TMPL_msghdr => 'PL' . # msg_name, msg_namelen
@@ -328,7 +331,11 @@ BEGIN {
 				TMPL_size_t. # msg_controllen
 				'i', # msg_flags
 
-		)
+		);
+		# *BSD uses `TCPS_ESTABLISHED', not `TCP_ESTABLISHED'
+		# dragonfly uses TCPS_ESTABLISHED==5, but it lacks TCP_INFO,
+		# so leave it unset on dfly
+		$CONST{TCP_ESTABLISHED} = 4 if $^O ne 'dragonfly';
 	}
 	$CONST{CMSG_ALIGN_size} = SIZEOF_size_t;
 	$CONST{SIZEOF_cmsghdr} //= 0;
@@ -336,6 +343,7 @@ BEGIN {
 	$CONST{CMSG_DATA_off} //= undef;
 	$CONST{TMPL_msghdr} //= undef;
 	$CONST{MSG_MORE} //= 0;
+	$CONST{FIONREAD} //= undef;
 }
 
 # SFD_CLOEXEC is arch-dependent, so IN_CLOEXEC may be, too
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index a887fef8..cebd0936 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -21,7 +21,7 @@ $ENV{XDG_CACHE_HOME} //= "$ENV{HOME}/.cache"; # reuse C++ xap_helper builds
 $ENV{GIT_TEST_FSYNC} = 0; # hopefully reduce wear
 
 $_ = File::Spec->rel2abs($_) for (grep(!m!^/!, @INC));
-our $CURRENT_DAEMON;
+our ($CURRENT_DAEMON, $CURRENT_LISTENER);
 BEGIN {
 	@EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
@@ -989,6 +989,7 @@ sub test_httpd ($$;$$) {
 		my $ua = LWP::UserAgent->new;
 		$ua->max_redirect(0);
 		local $CURRENT_DAEMON = $td;
+		local $CURRENT_LISTENER = $sock;
 		Plack::Test::ExternalServer::test_psgi(client => $client,
 							ua => $ua);
 		$cb->() if $cb;
diff --git a/lib/PublicInbox/ViewVCS.pm b/lib/PublicInbox/ViewVCS.pm
index 96b60ee0..a6e54b30 100644
--- a/lib/PublicInbox/ViewVCS.pm
+++ b/lib/PublicInbox/ViewVCS.pm
@@ -642,12 +642,17 @@ sub show_blob { # git->cat_async callback
 
 sub start_solver ($) {
 	my ($ctx) = @_;
+	$ctx->{-next_solver} = on_destroy \&next_solver;
+	++$solver_nr;
+	if (my $ck = $ctx->{env}->{'pi-httpd.ckhup'}) {
+		$ck->($ctx->{env}->{'psgix.io'}->{sock}) and
+			return html_page $ctx, 499, 'client disconnected';
+	}
+
 	while (my ($from, $to) = each %QP_MAP) {
 		my $v = $ctx->{qp}->{$from} // next;
 		$ctx->{hints}->{$to} = $v if $v ne '';
 	}
-	$ctx->{-next_solver} = on_destroy \&next_solver;
-	++$solver_nr;
 	# ->newdir and open may croak
 	$ctx->{-tmp} = File::Temp->newdir("solver.$ctx->{oid_b}-XXXX",
 						TMPDIR => 1);
@@ -665,9 +670,9 @@ sub start_solver ($) {
 sub next_solver {
 	--$solver_nr;
 	my $ctx = shift(@solver_q) // return;
-	# XXX FIXME: client may've disconnected if it waited a long while
 	eval { start_solver($ctx) };
-	warn "W: start_solver: $@" if $@;
+	return unless $@;
+	warn "W: start_solver: $@";
 	html_page($ctx, 500) if $ctx->{-wcb};
 }
 
diff --git a/t/daemon.t b/t/daemon.t
new file mode 100644
index 00000000..51aa70b4
--- /dev/null
+++ b/t/daemon.t
@@ -0,0 +1,56 @@
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# unit tests for common (public-facing) daemon code
+use v5.12;
+use autodie;
+use PublicInbox::TestCommon;
+use Socket qw(SOCK_STREAM);
+require_ok 'PublicInbox::Daemon';
+use PublicInbox::IO qw(poll_in);
+
+# $fn is stream_hup or tcp_hup
+my $ck_hup = sub {
+	my ($s, $connect, $fn, $type) = @_;
+	my $c = $connect->();
+	poll_in $s; # needed on *BSD
+	my $addr = accept(my $acc, $s);
+	close $c;
+	my $ck = PublicInbox::Daemon->can($fn);
+	poll_in($acc) if $^O ne 'linux'; # netbsd needs this, at least...
+	ok $ck->($acc), "$fn detected close ($type)";
+	$c = $connect->();
+	syswrite $c, 'hi';
+	poll_in $s; # needed on *BSD
+	$addr = accept($acc, $s);
+	ok !$ck->($acc), "$fn false when still established ($type)";
+};
+
+{
+	my $tmpdir = tmpdir;
+	my $l = "$tmpdir/named.sock";
+	my $s = IO::Socket::UNIX->new(Listen => 5, Local => $l,
+				Type => SOCK_STREAM) or
+			xbail "bind+listen($l): $!";
+	my $connect = sub {
+		IO::Socket::UNIX->new(Peer => $l, Type => SOCK_STREAM) or
+			xbail "connect($l): $!";
+	};
+	$ck_hup->($s, $connect, 'stream_hup', 'UNIX');
+}
+
+{
+	my $s = tcp_server;
+	my $tcp_conn = sub { tcp_connect($s) };
+	$ck_hup->($s, $tcp_conn, 'stream_hup', 'TCP');
+	$ck_hup->($s, $tcp_conn, 'tcp_hup', 'TCP');
+}
+
+SKIP: {
+	$^O =~ /\A(?:linux|freebsd|netbsd|openbsd)\z/ or
+		skip "no TCP_INFO support $^O", 1;
+	isnt \&PublicInbox::Daemon::stream_hup,
+		\&PublicInbox::Daemon::tcp_hup,
+		"stream_hup and tcp_hup are different on \$^O=$^O";
+}
+
+done_testing;
diff --git a/t/psgi_log.psgi b/t/psgi_log.psgi
new file mode 100644
index 00000000..ddecb4be
--- /dev/null
+++ b/t/psgi_log.psgi
@@ -0,0 +1,16 @@
+#!/usr/bin/perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: GPL-3.0+ <https://www.gnu.org/licenses/gpl-3.0.txt>
+# Usage: plackup [OPTIONS] /path/to/this/file
+use v5.12;
+use PublicInbox::WWW;
+use Plack::Builder;
+my $www = PublicInbox::WWW->new;
+$www->preload;
+builder {
+	enable 'AccessLog::Timed',
+		logger => sub { syswrite(STDOUT, $_[0]) },
+		format => '%t "%r" %>s %b %D';
+	enable 'Head';
+	sub { $www->call($_[0]) }
+}
diff --git a/t/solver_git.t b/t/solver_git.t
index c45011ad..76a0d8ab 100644
--- a/t/solver_git.t
+++ b/t/solver_git.t
@@ -8,8 +8,8 @@ require_git v2.6;
 use PublicInbox::ContentHash qw(git_sha);
 use PublicInbox::Spawn qw(run_qx which);
 use File::Path qw(remove_tree);
-use PublicInbox::IO qw(write_file);
-use autodie qw(close mkdir open rename symlink unlink);
+use PublicInbox::IO qw(write_file try_cat);
+use autodie qw(close kill mkdir open read rename symlink unlink);
 require_mods qw(DBD::SQLite Xapian);
 require PublicInbox::SolverGit;
 my $rdr = { 2 => \(my $null) };
@@ -476,8 +476,53 @@ EOF
 	my $env = { PI_CONFIG => $cfgpath, TMPDIR => $tmpdir };
 
 	xsys_e $cp, qw(-Rp), $binfoo, $gone_repo; # for test_httpd $client
+	my $has_log;
+	SKIP: { # some distros may split out this middleware
+		require_mods 'Plack::Middleware::AccessLog::Timed', 1;
+		$has_log = $env->{psgi_file} = 't/psgi_log.psgi';
+	}
 	test_httpd($env, $client, 7, sub {
 	SKIP: {
+		my $td_pid = $PublicInbox::TestCommon::CURRENT_DAEMON->{pid};
+		my $lis = $PublicInbox::TestCommon::CURRENT_LISTENER;
+		kill 'STOP', $td_pid;
+		my @req = ("GET /$name/69df7d5/s/ HTTP/1.0\r\n", "\r\n");
+		my $c0 = tcp_connect($lis);
+		print $c0 $req[0];
+		my @c = map {
+			my $c = tcp_connect($lis);
+			print $c @req;
+			$c;
+		} (1..30);
+		close $_ for @c[(1..29)]; # premature disconnects
+		kill 'CONT', $td_pid;
+		read $c[0], my $buf, 16384;
+		print $c0 $req[1];
+		read $c0, $buf, 16384;
+
+		if ($has_log) {
+			# kick server to ensure all CBs are handled, first
+			$c0 = tcp_connect($lis);
+			print $c0 <<EOM;
+GUH /$name/69df7d5/s/ HTTP/1.1\r
+Connection: close\r
+\r
+EOM
+			read $c0, $buf, 16384;
+			my %counts;
+			my @n = map {
+					my $code = (split ' ', $_)[5];
+					$counts{$code}++;
+					$code;
+				} grep m! HTTP/1\.0\b!,
+					try_cat("$tmpdir/stdout.log");
+			ok $counts{499}, 'got some 499s from disconnects';
+			ok $counts{200} >= 2, 'got at least two 200 codes' or
+				diag explain(\%counts);
+			is $counts{499} + $counts{200}, 31,
+				'all 1.0 connections logged for disconnects';
+		}
+
 		require_cmd('curl', 1) or skip 'no curl', 1;
 		mkdir "$tmpdir/ext";
 		my $rurl = "$ENV{PLACK_TEST_EXTERNALSERVER_URI}/$name";

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/7] qspawn: use limiter->new default
  2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
                   ` (2 preceding siblings ...)
  2025-02-08  3:26 ` [PATCH 3/7] daemon: check connections before solving codeblobs Eric Wong
@ 2025-02-08  3:26 ` Eric Wong
  2025-02-08  3:26 ` [PATCH 5/7] viewvcs: -codeblob limiter w/ depth for solver Eric Wong
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

No need to hardcode `32' here since PublicInbox::Limiter already
uses that value already (and includes a helpful comment about
using the same value as git-daemon).
---
 lib/PublicInbox/Qspawn.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index cde45b42..3ceeff23 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -172,7 +172,7 @@ sub psgi_qx {
 	my ($self, $env, $limiter, @qx_cb_arg) = @_;
 	$self->{psgi_env} = $env;
 	$self->{qx_cb_arg} = \@qx_cb_arg;
-	$limiter ||= $def_limiter ||= PublicInbox::Limiter->new(32);
+	$limiter ||= $def_limiter ||= PublicInbox::Limiter->new;
 	start($self, $limiter, undef);
 }
 
@@ -280,7 +280,7 @@ sub psgi_yield {
 	my ($self, $env, $limiter, @parse_hdr_arg)= @_;
 	$self->{psgi_env} = $env;
 	$self->{yield_parse_hdr} = [ \(my $buf = ''), @parse_hdr_arg ];
-	$limiter ||= $def_limiter ||= PublicInbox::Limiter->new(32);
+	$limiter ||= $def_limiter ||= PublicInbox::Limiter->new;
 
 	# the caller already captured the PSGI write callback from
 	# the PSGI server, so we can call ->start, here:

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 5/7] viewvcs: -codeblob limiter w/ depth for solver
  2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
                   ` (3 preceding siblings ...)
  2025-02-08  3:26 ` [PATCH 4/7] qspawn: use limiter->new default Eric Wong
@ 2025-02-08  3:26 ` Eric Wong
  2025-02-08  3:26 ` [PATCH 6/7] qspawn: drop redundant check against limiter->{max} Eric Wong
  2025-02-08  3:26 ` [PATCH 7/7] config: handle limiter `max' knob in ->setup_limiter Eric Wong
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

By adding the `publicinboxlimiter.<name>.depth' parameter for
all limiters, we can have queue overflow detection and reject
requests with HTTP 503 error codes for all limiter uses, not
just the custom queue in ViewVCS as before.  Then we can just
use a normal limiter in ViewVCS and get rid of the old custom
queue.
---
 Documentation/public-inbox-config.pod |  17 +++-
 MANIFEST                              |   1 +
 lib/PublicInbox/Config.pm             |   6 +-
 lib/PublicInbox/Limiter.pm            |  30 +++++--
 lib/PublicInbox/Qspawn.pm             |  31 ++++---
 lib/PublicInbox/ViewVCS.pm            |  41 ++++++----
 t/solver_git.t                        |  19 +++++
 xt/git-http-backend-parallel.t        | 111 ++++++++++++++++++++++++++
 8 files changed, 215 insertions(+), 41 deletions(-)
 create mode 100644 xt/git-http-backend-parallel.t

diff --git a/Documentation/public-inbox-config.pod b/Documentation/public-inbox-config.pod
index 6f8ec1f3..faf35d67 100644
--- a/Documentation/public-inbox-config.pod
+++ b/Documentation/public-inbox-config.pod
@@ -560,12 +560,14 @@ the default limiter.
 C<RLIMIT_*> keys may be set to enforce resource limits for
 a particular limiter (L<BSD::Resource(3pm)> is required).
 
-Default named-limiters are prefixed with "-".  Currently,
+Default named-limiters are prefixed with C<->.  Currently,
 the C<-cgit> named limiter is reserved for instances spawning
 cgit via C<publicinbox.cgitrc>.  The C<-httpbackend> named
 limiter (in public-inbox 2.0+) governs all L<git-http-backend(1)>
 processes for inboxes and coderepos for a given public-inbox
-config file.
+config file.  The C<-codeblob> limiter (in 2.0+) governs the
+entire series of git commands used for blob reconstruction from
+patches.
 
 =over 8
 
@@ -573,6 +575,17 @@ config file.
 
 The maximum number of parallel processes for the given limiter.
 
+Default: 32 for C<-httpbackend>, 1 for everything else
+
+=item publicinboxlimiter.<name>.depth
+
+The maximum queue depth for the given limiter.  HTTP C<503> errors
+will be returned when the queue depth is exceeded.
+
+Default: 32
+
+New in public-inbox 2.0+ (PENDING)
+
 =item publicinboxlimiter.<name>.rlimitCore
 
 =item publicinboxlimiter.<name>.rlimitCPU
diff --git a/MANIFEST b/MANIFEST
index 1fc2a152..8bddb751 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -657,6 +657,7 @@ xt/cmp-msgstr.t
 xt/create-many-inboxes.t
 xt/eml_check_limits.t
 xt/eml_octet-stream.t
+xt/git-http-backend-parallel.t
 xt/git-http-backend.t
 xt/git_async_cmp.t
 xt/httpd-async-stream.t
diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 2bb1a672..4045d6d3 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -126,9 +126,9 @@ sub limiter {
 	my ($self, $name, $max_default) = @_;
 	$self->{-limiters}->{$name} //= do {
 		require PublicInbox::Limiter;
-		my $max = $self->{"publicinboxlimiter.$name.max"};
-		my $l = PublicInbox::Limiter->new($max || $max_default || 1);
-		$l->setup_rlimit($name, $self);
+		my $n = $self->{"publicinboxlimiter.$name.max"};
+		my $l = PublicInbox::Limiter->new($n || $max_default || 1);
+		$l->setup_limiter($name, $self);
 		$l;
 	};
 }
diff --git a/lib/PublicInbox/Limiter.pm b/lib/PublicInbox/Limiter.pm
index b515de98..49981e21 100644
--- a/lib/PublicInbox/Limiter.pm
+++ b/lib/PublicInbox/Limiter.pm
@@ -8,7 +8,8 @@ use PublicInbox::Spawn;
 sub new {
 	my ($class, $max) = @_;
 	bless {
-		# 32 is same as the git-daemon connection limit
+		# 32 is same as the git-daemon connection limit, but
+		# -cgit and -codeblob internal limiters default to 1
 		max => $max || 32,
 		running => 0,
 		run_queue => [],
@@ -18,16 +19,25 @@ sub new {
 	}, $class;
 }
 
-sub setup_rlimit {
+sub setup_limiter {
 	my ($self, $name, $cfg) = @_;
+	my $k = "publicinboxlimiter.$name.depth";
+	my $v = $cfg->{$k};
+	if (defined $v) {
+		if ($v =~ /\A[1-9][0-9]*\z/) {
+			$self->{depth} = $v + 0;
+		} else {
+			warn "W: `$v' not a positive integer in $cfg->{-f}\n";
+		}
+	}
 	for my $rlim (@PublicInbox::Spawn::RLIMITS) {
-		my $k = lc($rlim);
+		$k = lc($rlim);
 		$k =~ tr/_//d;
 		$k = "publicinboxlimiter.$name.$k";
-		my $v = $cfg->{$k} // next;
+		$v = $cfg->{$k} // next;
 		my @rlimit = split(/\s*,\s*/, $v);
 		if (scalar(@rlimit) == 1) {
-			push @rlimit, $rlimit[0];
+			$rlimit[1] = $rlimit[0];
 		} elsif (scalar(@rlimit) != 2) {
 			warn "W: could not parse $k: $v (ignored)\n";
 			next;
@@ -40,12 +50,16 @@ sub setup_rlimit {
 				warn "BSD::Resource missing for $rlim";
 				next;
 			} : undef;
-		for my $i (0..$#rlimit) {
-			next if $rlimit[$i] ne 'INFINITY';
-			$rlimit[$i] = $inf;
+		for (@rlimit) {
+			$_ = $inf if $_ eq 'INFINITY';
 		}
 		$self->{$rlim} = \@rlimit;
 	}
 }
 
+sub is_too_busy {
+	my ($self) = @_;
+	scalar(@{$self->{run_queue}}) > ($self->{depth} // 32)
+}
+
 1;
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 3ceeff23..9bd225b7 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -78,18 +78,8 @@ sub psgi_status_err { # Qspawn itself is useful w/o PSGI
 	PublicInbox::WwwStatic::r($_[0] // 500);
 }
 
-sub finalize ($) {
+sub _finalize ($) {
 	my ($self) = @_;
-
-	# process is done, spawn whatever's in the queue
-	my $limiter = delete $self->{limiter} or return;
-	my $running = --$limiter->{running};
-
-	if ($running < $limiter->{max}) {
-		if (my $next = shift(@{$limiter->{run_queue}})) {
-			_do_spawn(@$next, $limiter);
-		}
-	}
 	if (my $err = $self->{_err}) { # set by finish or waitpid_err
 		utf8::decode($err);
 		if (my $dst = $self->{qsp_err}) {
@@ -112,6 +102,21 @@ sub finalize ($) {
 	}
 }
 
+sub finalize ($) {
+	my ($self) = @_;
+
+	# process is done, spawn whatever's in the queue
+	my $limiter = delete $self->{limiter} or return;
+	my $running = --$limiter->{running};
+
+	if ($running < $limiter->{max}) {
+		if (my $next = shift(@{$limiter->{run_queue}})) {
+			_do_spawn(@$next, $limiter);
+		}
+	}
+	_finalize $self;
+}
+
 sub waitpid_err { # callback for awaitpid
 	my (undef, $self) = @_; # $_[0]: pid
 	$self->{_err} = ''; # for defined check in ->finish
@@ -159,6 +164,10 @@ sub start ($$$) {
 	my ($self, $limiter, $start_cb) = @_;
 	if ($limiter->{running} < $limiter->{max}) {
 		_do_spawn($self, $start_cb, $limiter);
+	} elsif ($limiter->is_too_busy) {
+		$self->{psgi_env}->{'qspawn.fallback'} //= 503 if
+			$self->{psgi_env};
+		_finalize $self;
 	} else {
 		push @{$limiter->{run_queue}}, [ $self, $start_cb ];
 	}
diff --git a/lib/PublicInbox/ViewVCS.pm b/lib/PublicInbox/ViewVCS.pm
index a6e54b30..e1e129b1 100644
--- a/lib/PublicInbox/ViewVCS.pm
+++ b/lib/PublicInbox/ViewVCS.pm
@@ -50,10 +50,6 @@ my %GIT_MODE = (
 	'160000' => 'g', # commit (gitlink)
 );
 
-# TODO: not fork safe, but we don't fork w/o exec in PublicInbox::WWW
-my (@solver_q, $solver_lim);
-my $solver_nr = 0;
-
 sub html_page ($$;@) {
 	my ($ctx, $code) = @_[0, 1];
 	my $wcb = delete $ctx->{-wcb};
@@ -640,10 +636,10 @@ sub show_blob { # git->cat_async callback
 		'</code></pre></td></tr></table>'.dbg_log($ctx), @def);
 }
 
-sub start_solver ($) {
-	my ($ctx) = @_;
-	$ctx->{-next_solver} = on_destroy \&next_solver;
-	++$solver_nr;
+sub start_solver ($$) {
+	my ($ctx, $limiter) = @_;
+	$ctx->{-next_solver} = on_destroy \&next_solver, $limiter;
+	++$limiter->{running};
 	if (my $ck = $ctx->{env}->{'pi-httpd.ckhup'}) {
 		$ck->($ctx->{env}->{'psgix.io'}->{sock}) and
 			return html_page $ctx, 499, 'client disconnected';
@@ -659,7 +655,7 @@ sub start_solver ($) {
 	$ctx->{lh} or open $ctx->{lh}, '+>>', "$ctx->{-tmp}/solve.log";
 	my $solver = PublicInbox::SolverGit->new($ctx->{ibx},
 						\&solve_result, $ctx);
-	$solver->{limiter} = $solver_lim;
+	$solver->{limiter} = $ctx->{www}->{solver_limiter};
 	$solver->{gits} //= [ $ctx->{git} ];
 	$solver->{tmp} = $ctx->{-tmp}; # share tmpdir
 	# PSGI server will call this immediately and give us a callback (-wcb)
@@ -668,9 +664,10 @@ sub start_solver ($) {
 
 # run the next solver job when done and DESTROY-ed (on_destroy cb)
 sub next_solver {
-	--$solver_nr;
-	my $ctx = shift(@solver_q) // return;
-	eval { start_solver($ctx) };
+	my ($limiter) = @_;
+	--$limiter->{running};
+	my $ctx = shift(@{$limiter->{run_queue}}) // return;
+	eval { start_solver $ctx, $limiter };
 	return unless $@;
 	warn "W: start_solver: $@";
 	html_page($ctx, 500) if $ctx->{-wcb};
@@ -678,12 +675,22 @@ sub next_solver {
 
 sub may_start_solver ($) {
 	my ($ctx) = @_;
-	$solver_lim //= $ctx->{www}->{pi_cfg}->limiter('codeblob');
-	if ($solver_nr >= $solver_lim->{max}) {
-		@solver_q > 128 ? html_page($ctx, 503, 'too busy')
-				: push(@solver_q, $ctx);
+	my $limiter = $ctx->{www}->{pi_cfg}->limiter('-codeblob');
+
+	# {solver_limiter} just inherits rlimits from the configurable
+	# -codeblob limiter.  The parallelism and depth management is
+	# redundant since -codeblob $limiter encompasses {solver_limiter}.
+	$ctx->{www}->{solver_limiter} //= do {
+		my $l = PublicInbox::Limiter->new;
+		$l->{$_} = $limiter->{$_} for grep /^RLIMIT_/, keys %$limiter;
+		$l;
+	};
+	if ($limiter->{running} < $limiter->{max}) {
+		start_solver $ctx, $limiter;
+	} elsif ($limiter->is_too_busy) {
+		html_page $ctx, 503, 'too busy';
 	} else {
-		start_solver($ctx);
+		push @{$limiter->{run_queue}}, $ctx;
 	}
 }
 
diff --git a/t/solver_git.t b/t/solver_git.t
index 76a0d8ab..22d01d93 100644
--- a/t/solver_git.t
+++ b/t/solver_git.t
@@ -522,6 +522,25 @@ EOM
 			is $counts{499} + $counts{200}, 31,
 				'all 1.0 connections logged for disconnects';
 		}
+		undef $c0;
+		kill 'STOP', $td_pid;
+		@c = map {
+			my $c = tcp_connect($lis);
+			print $c $req[0];
+			$c;
+		}  (0..66);
+		print $_ $req[1] for @c;
+		kill 'CONT', $td_pid;
+		my %codes;
+		for (@c) {
+			read $_, $buf, 16384;
+			my $code = $buf =~ m!\AHTTP/1\.0 (\d+) ! ? $1 : '';
+			++$codes{$code};
+		}
+		is $codes{''}, undef, 'all valid '.scalar(@c).' HTTP responses';
+		ok $codes{503}, 'got some 503 too busy errors';
+		is $codes{503} + $codes{200},
+			scalar(@c), 'only got 200 and 503 codes';
 
 		require_cmd('curl', 1) or skip 'no curl', 1;
 		mkdir "$tmpdir/ext";
diff --git a/xt/git-http-backend-parallel.t b/xt/git-http-backend-parallel.t
new file mode 100644
index 00000000..70b5d0ea
--- /dev/null
+++ b/xt/git-http-backend-parallel.t
@@ -0,0 +1,111 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# ensure publicinboxlimiter.-httpbackend.{depth,max} knobs work
+# and returns 503 (too busy) errors on overload.
+use v5.12;
+use autodie;
+use PublicInbox::TestCommon;
+use File::Path qw(remove_tree);
+use PublicInbox::IO qw(write_file try_cat);
+use PublicInbox::Spawn qw(spawn);
+use PublicInbox::Git qw(git_exe);
+require PublicInbox::Sigfd;
+my $git_dir = $ENV{GIANT_GIT_DIR} //
+	plan 'skip_all' => 'GIANT_GIT_DIR not defined';
+require_mods qw(-httpd psgi);
+my $tmpdir = tmpdir;
+my $henv = { TMPDIR => $tmpdir, PI_CONFIG => "$tmpdir/cfg" };
+write_file '>', $henv->{PI_CONFIG}, <<EOM;
+[coderepo "giant.git"]
+	dir = $git_dir
+EOM
+my @tail = split ' ', $PublicInbox::TestCommon::tail_cmd // '';
+my $uri;
+
+my $run_clones = sub {
+	my ($max) = @_;
+	my (%wait, $tpid, %chld_status, @f);
+	if (@tail) {
+		my @f = map {
+			my $f = "$tmpdir/$_.err";
+			open my $fh, '>', $f;
+			$f;
+		} (1..$max);
+		$tpid = spawn([ @tail, @f ], undef, { 1 => 2 });
+	}
+	for my $n (1..$max) {
+		my $rdr;
+		my $err = "$tmpdir/$n.err";
+		open $rdr->{2}, '>', $err;
+		my $pid = spawn([git_exe, qw(clone -q --mirror), $uri,
+				"$tmpdir/$n.git"], undef, $rdr);
+		$wait{$pid} = $n;
+	}
+	while (keys %wait) {
+		my $pid = waitpid(-1, 0);
+		my $n = delete $wait{$pid} // next;
+		push @{$chld_status{$?}}, $n;
+		remove_tree "$tmpdir/$n.git";
+	}
+	if ($tpid) {
+		kill 'TERM', $tpid;
+		waitpid($tpid, 0);
+	}
+	\%chld_status;
+};
+
+my $sigfd = PublicInbox::Sigfd->new;
+my $reload_cfg = sub {
+	write_file '>>', $henv->{PI_CONFIG}, @_;
+	kill 'HUP', $PublicInbox::TestCommon::CURRENT_DAEMON->{pid};
+	# signalfd/EVFILT_SIGNAL platforms should handle signals more
+	# predictably and not need tick
+	tick(1) unless $sigfd;
+};
+
+my $ck_503 = sub {
+	my ($chld_status) = @_;
+	ok scalar(keys %$chld_status), 'got non-zero statuses from git clone';
+	for my $s (keys %$chld_status) {
+		my @unexpected;
+		for my $n (@{$chld_status->{$s}}) {
+			my $msg = try_cat "$tmpdir/$n.err";
+			push @unexpected, $msg if $msg !~ /\b503\b/s;
+		}
+		ok !@unexpected, 'no unexpected errors' or
+			diag explain([ "status $s", \@unexpected ]);
+		diag "chld_status=$s: ".scalar(@{$chld_status->{$s}})
+			.' instances'
+	}
+};
+
+my $client = sub {
+	$uri = "$ENV{PLACK_TEST_EXTERNALSERVER_URI}/giant.git/";
+	my $chld_status = $run_clones->(64);
+	my $ok = delete $chld_status->{0} // [];
+	is scalar(@$ok), 64, 'got all successes';
+	is keys(%$chld_status), 0, 'no other statuses' or
+		diag explain($chld_status);
+
+	$reload_cfg->(<<EOM);
+[publicinboxlimiter "-httpbackend"]
+	depth = 1
+EOM
+
+	$chld_status = $run_clones->(40);
+	$ok = delete $chld_status->{0} // [];
+	ok scalar(@$ok) >= 33, 'got at least 33 successes' or
+		diag 'got '.scalar(@$ok).' successes';
+	$ck_503->($chld_status);
+
+	$reload_cfg->("\tmax = 1\n");
+
+	$chld_status = $run_clones->(10);
+	$ok = delete $chld_status->{0} // [];
+	ok scalar(@$ok) >= 2, 'got at least 2 successes' or
+		diag 'got '.scalar(@$ok).' successes';
+	$ck_503->($chld_status);
+};
+test_httpd $henv, $client;
+done_testing;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 6/7] qspawn: drop redundant check against limiter->{max}
  2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
                   ` (4 preceding siblings ...)
  2025-02-08  3:26 ` [PATCH 5/7] viewvcs: -codeblob limiter w/ depth for solver Eric Wong
@ 2025-02-08  3:26 ` Eric Wong
  2025-02-08  3:26 ` [PATCH 7/7] config: handle limiter `max' knob in ->setup_limiter Eric Wong
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

We only need to guard against excessive parallelism on entry
in one place during Qspawn->start.  Redundant guards are
confusing and make bugs harder-to-spot.
---
 lib/PublicInbox/Qspawn.pm | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 9bd225b7..34e9eff6 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -107,13 +107,9 @@ sub finalize ($) {
 
 	# process is done, spawn whatever's in the queue
 	my $limiter = delete $self->{limiter} or return;
-	my $running = --$limiter->{running};
-
-	if ($running < $limiter->{max}) {
-		if (my $next = shift(@{$limiter->{run_queue}})) {
-			_do_spawn(@$next, $limiter);
-		}
-	}
+	--$limiter->{running};
+	my $next = shift @{$limiter->{run_queue}};
+	_do_spawn(@$next, $limiter) if $next;
 	_finalize $self;
 }
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 7/7] config: handle limiter `max' knob in ->setup_limiter
  2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
                   ` (5 preceding siblings ...)
  2025-02-08  3:26 ` [PATCH 6/7] qspawn: drop redundant check against limiter->{max} Eric Wong
@ 2025-02-08  3:26 ` Eric Wong
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2025-02-08  3:26 UTC (permalink / raw)
  To: meta

Handling all config knobs in the same sub is more logical and
reduces the size of the frequently-loaded PublicInbox::Config
package.
---
 lib/PublicInbox/Config.pm  |  3 +--
 lib/PublicInbox/Limiter.pm | 20 ++++++++++++--------
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 4045d6d3..6ff16783 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -126,8 +126,7 @@ sub limiter {
 	my ($self, $name, $max_default) = @_;
 	$self->{-limiters}->{$name} //= do {
 		require PublicInbox::Limiter;
-		my $n = $self->{"publicinboxlimiter.$name.max"};
-		my $l = PublicInbox::Limiter->new($n || $max_default || 1);
+		my $l = PublicInbox::Limiter->new($max_default || 1);
 		$l->setup_limiter($name, $self);
 		$l;
 	};
diff --git a/lib/PublicInbox/Limiter.pm b/lib/PublicInbox/Limiter.pm
index 49981e21..fc62d0d4 100644
--- a/lib/PublicInbox/Limiter.pm
+++ b/lib/PublicInbox/Limiter.pm
@@ -21,25 +21,29 @@ sub new {
 
 sub setup_limiter {
 	my ($self, $name, $cfg) = @_;
-	my $k = "publicinboxlimiter.$name.depth";
-	my $v = $cfg->{$k};
-	if (defined $v) {
+	for my $f (qw(max depth)) {
+		my $k = "publicinboxlimiter.$name.$f";
+		my $v = $cfg->{$k} // next;
 		if ($v =~ /\A[1-9][0-9]*\z/) {
-			$self->{depth} = $v + 0;
+			$self->{$f} = $v + 0;
 		} else {
-			warn "W: `$v' not a positive integer in $cfg->{-f}\n";
+			warn <<EOM
+W: `$k=$v' is not a positive integer in $cfg->{-f}
+EOM
 		}
 	}
 	for my $rlim (@PublicInbox::Spawn::RLIMITS) {
-		$k = lc($rlim);
+		my $k = lc($rlim);
 		$k =~ tr/_//d;
 		$k = "publicinboxlimiter.$name.$k";
-		$v = $cfg->{$k} // next;
+		my $v = $cfg->{$k} // next;
 		my @rlimit = split(/\s*,\s*/, $v);
 		if (scalar(@rlimit) == 1) {
 			$rlimit[1] = $rlimit[0];
 		} elsif (scalar(@rlimit) != 2) {
-			warn "W: could not parse $k: $v (ignored)\n";
+			warn <<EOM;
+W: could not parse `$k=$v' in $cfg->{-f} (ignored)
+EOM
 			next;
 		}
 		my $inf = $v =~ /\binfinity\b/i ?

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-02-08  3:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-08  3:26 [PATCH 0/7] limiter fixes and updates Eric Wong
2025-02-08  3:26 ` [PATCH 1/7] syscall: `use bytes' throughout Eric Wong
2025-02-08  3:26 ` [PATCH 2/7] git_http_backend: fix 32 default connection limit Eric Wong
2025-02-08  3:26 ` [PATCH 3/7] daemon: check connections before solving codeblobs Eric Wong
2025-02-08  3:26 ` [PATCH 4/7] qspawn: use limiter->new default Eric Wong
2025-02-08  3:26 ` [PATCH 5/7] viewvcs: -codeblob limiter w/ depth for solver Eric Wong
2025-02-08  3:26 ` [PATCH 6/7] qspawn: drop redundant check against limiter->{max} Eric Wong
2025-02-08  3:26 ` [PATCH 7/7] config: handle limiter `max' knob in ->setup_limiter Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).