user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* [PATCH 00/26] process management simplifications
@ 2023-10-25  0:29  7% Eric Wong
  2023-10-25  0:29  5% ` [PATCH 02/26] spawn: support synchronous run_qx Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2023-10-25  0:29 UTC (permalink / raw)
  To: meta

The convoluted HTTPD/Async code and is now gone and replaced
with the simpler InputPipe (originally developed for lei).
Fortunately, the s/psgi_return/psgi_yield/ change went more
smoothly than I expected.

cindex gets some simplifications, too; which will be helpful
since more work is required on that code.  I didn't get a
chance to use the split out Limiter+Qspawn, though...

3/26 is a fairly large philosophy change to use temporary files
over pipes, but it's probably fine as it's simpler and more
straightforward data flow and only used for small outputs which
can fit in memory, so unlikely to hit disk.

9/26 is the major change which was enabled by a change made
4 years ago.


Eric Wong (26):
  limiter: split out from qspawn
  spawn: support synchronous run_qx
  psgi_qx: use a temporary file rather than pipe
  www_coderepo: capture uses a flattened list
  qspawn: psgi_return allows list for callback args
  qspawn: drop unused err arg for ->event_step
  httpd/async: require IO arg
  xt/check-run: call DS->Reset after all tests
  qspawn: introduce new psgi_yield API
  repo_atom: switch to psgi_yield
  repo_snapshot: psgi_yield
  viewvcs: psgi_yield
  www_altid: switch to psgi_yield
  cgit: switch to psgi_yield
  www_coderepo: psgi_yield
  drop psgi_return, httpd/async and GetlineBody
  qspawn: use WwwStatic for fallbacks and error code
  qspawn: simplify internal argument passing
  cidx_log_p: don't bother with F_SETPIPE_SZ
  cindex: avoid awaitpid for popen
  cindex: use timer for inits
  cindex: start using run_await to simplify code
  cindex: use run_await to read extensions.objectFormat
  cindex: drop XH_PID global
  cindex: use run_await wrapper for git commands
  cindex: use sysread for generating fingerprint

 MANIFEST                           |   5 +-
 lib/PublicInbox/Aspawn.pm          |  34 +++
 lib/PublicInbox/Cgit.pm            |   2 +-
 lib/PublicInbox/CidxLogP.pm        |   3 +-
 lib/PublicInbox/CodeSearchIdx.pm   | 236 +++++++++----------
 lib/PublicInbox/Config.pm          |   4 +-
 lib/PublicInbox/GetlineBody.pm     |  46 ----
 lib/PublicInbox/GetlineResponse.pm |  40 ++++
 lib/PublicInbox/Git.pm             |   6 +
 lib/PublicInbox/GitHTTPBackend.pm  |  17 +-
 lib/PublicInbox/GzipFilter.pm      |   5 +-
 lib/PublicInbox/HTTP.pm            |   8 +-
 lib/PublicInbox/HTTPD.pm           |   5 +-
 lib/PublicInbox/HTTPD/Async.pm     | 109 ---------
 lib/PublicInbox/Inbox.pm           |   4 +-
 lib/PublicInbox/InputPipe.pm       |  12 +-
 lib/PublicInbox/LEI.pm             |   2 +-
 lib/PublicInbox/Limiter.pm         |  47 ++++
 lib/PublicInbox/MailDiff.pm        |   8 +-
 lib/PublicInbox/Qspawn.pm          | 349 +++++++++++------------------
 lib/PublicInbox/RepoAtom.pm        |   6 +-
 lib/PublicInbox/RepoSnapshot.pm    |   2 +-
 lib/PublicInbox/SearchIdx.pm       |  19 +-
 lib/PublicInbox/SolverGit.pm       |  10 +-
 lib/PublicInbox/Spawn.pm           |  69 ++++--
 lib/PublicInbox/ViewVCS.pm         |   7 +-
 lib/PublicInbox/WwwAltId.pm        |   6 +-
 lib/PublicInbox/WwwCoderepo.pm     |  12 +-
 t/dir_idle.t                       |   1 -
 t/fake_inotify.t                   |   2 -
 t/httpd-corner.psgi                |  14 +-
 t/httpd-corner.t                   |  12 +-
 t/qspawn.t                         |   3 +-
 t/spawn.t                          |  13 +-
 xt/check-run.t                     |   2 +
 35 files changed, 515 insertions(+), 605 deletions(-)
 create mode 100644 lib/PublicInbox/Aspawn.pm
 delete mode 100644 lib/PublicInbox/GetlineBody.pm
 create mode 100644 lib/PublicInbox/GetlineResponse.pm
 delete mode 100644 lib/PublicInbox/HTTPD/Async.pm
 create mode 100644 lib/PublicInbox/Limiter.pm

^ permalink raw reply	[relevance 7%]

* [PATCH 02/26] spawn: support synchronous run_qx
  2023-10-25  0:29  7% [PATCH 00/26] process management simplifications Eric Wong
@ 2023-10-25  0:29  5% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2023-10-25  0:29 UTC (permalink / raw)
  To: meta

This is similar to `backtick` but supports all our existing spawn
functionality (chdir, env, rlimit, redirects, etc.).  It also
supports SCALAR ref redirects like run_script in our test suite
for std{in,out,err}.

We can probably use :utf8 by default for these redirects, even.
---
 lib/PublicInbox/Git.pm       |  6 ++++
 lib/PublicInbox/SearchIdx.pm | 19 ++++------
 lib/PublicInbox/Spawn.pm     | 69 ++++++++++++++++++++++++++----------
 t/spawn.t                    | 13 ++++++-
 4 files changed, 76 insertions(+), 31 deletions(-)

diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index a460d155..476dcf30 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -69,6 +69,7 @@ sub check_git_exe () {
 		$GIT_VER = eval("v$1") // die "BUG: bad vstring: $1 ($v)";
 		$EXE_ST = $st;
 	}
+	$GIT_EXE;
 }
 
 sub git_version {
@@ -422,6 +423,11 @@ sub async_err ($$$$$) {
 	$async_warn ? carp($msg) : $self->fail($msg);
 }
 
+sub cmd {
+	my $self = shift;
+	[ $GIT_EXE // check_git_exe(), "--git-dir=$self->{git_dir}", @_ ]
+}
+
 # $git->popen(qw(show f00)); # or
 # $git->popen(qw(show f00), { GIT_CONFIG => ... }, { 2 => ... });
 sub popen {
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 8a571cfb..3c64c715 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -22,7 +22,7 @@ use POSIX qw(strftime);
 use Fcntl qw(SEEK_SET);
 use Time::Local qw(timegm);
 use PublicInbox::OverIdx;
-use PublicInbox::Spawn qw(run_wait);
+use PublicInbox::Spawn qw(run_wait run_qx);
 use PublicInbox::Git qw(git_unquote);
 use PublicInbox::MsgTime qw(msg_timestamp msg_datestamp);
 use PublicInbox::Address;
@@ -351,23 +351,18 @@ sub index_diff ($$$) {
 }
 
 sub patch_id {
-	my ($self) = @_; # $_[1] is the diff (may be huge)
-	open(my $fh, '+>:utf8', undef) or die "open: $!";
-	open(my $eh, '+>', undef) or die "open: $!";
-	$fh->autoflush(1);
-	print $fh $_[1] or die "print: $!";
-	sysseek($fh, 0, SEEK_SET) or die "sysseek: $!";
-	my $id = ($self->{ibx} // $self->{eidx} // $self)->git->qx(
-			[qw(patch-id --stable)], {}, { 0 => $fh, 2 => $eh });
-	seek($eh, 0, SEEK_SET) or die "seek: $!";
-	while (<$eh>) { warn $_ }
+	my ($self, $sref) = @_;
+	my $git = ($self->{ibx} // $self->{eidx} // $self)->git;
+	my $opt = { 0 => $sref, 2 => \(my $err) };
+	my $id = run_qx($git->cmd(qw(patch-id --stable)), undef, $opt);
+	warn $err if $err;
 	$id =~ /\A([a-f0-9]{40,})/ ? $1 : undef;
 }
 
 sub index_body_text {
 	my ($self, $doc, $sref) = @_;
 	if ($$sref =~ /^(?:diff|---|\+\+\+) /ms) {
-		my $id = patch_id($self, $$sref);
+		my $id = patch_id($self, $sref);
 		$doc->add_term('XDFID'.$id) if defined($id);
 	}
 
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index 106f5e01..1fa7a41f 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -22,8 +22,9 @@ use Fcntl qw(SEEK_SET);
 use IO::Handle ();
 use Carp qw(croak);
 use PublicInbox::ProcessIO;
-our @EXPORT_OK = qw(which spawn popen_rd popen_wr run_die run_wait);
+our @EXPORT_OK = qw(which spawn popen_rd popen_wr run_die run_wait run_qx);
 our @RLIMITS = qw(RLIMIT_CPU RLIMIT_CORE RLIMIT_DATA);
+use autodie qw(open pipe read seek sysseek truncate);
 
 BEGIN {
 	my $all_libc = <<'ALL_LIBC'; # all *nix systems we support
@@ -290,7 +291,6 @@ ALL_LIBC
 	undef $all_libc unless -d $inline_dir;
 	if (defined $all_libc) {
 		local $ENV{PERL_INLINE_DIRECTORY} = $inline_dir;
-		use autodie;
 		# CentOS 7.x ships Inline 0.53, 0.64+ has built-in locking
 		my $lk = PublicInbox::Lock->new($inline_dir.
 						'/.public-inbox.lock');
@@ -301,7 +301,7 @@ ALL_LIBC
 		open STDERR, '>&', $fh;
 		STDERR->autoflush(1);
 		STDOUT->autoflush(1);
-		CORE::eval 'use Inline C => $all_libc, BUILD_NOISY => 1';
+		eval 'use Inline C => $all_libc, BUILD_NOISY => 1';
 		my $err = $@;
 		open(STDERR, '>&', $olderr);
 		open(STDOUT, '>&', $oldout);
@@ -332,26 +332,34 @@ sub which ($) {
 }
 
 sub spawn ($;$$) {
-	my ($cmd, $env, $opts) = @_;
+	my ($cmd, $env, $opt) = @_;
 	my $f = which($cmd->[0]) // die "$cmd->[0]: command not found\n";
-	my @env;
+	my (@env, @rdr);
 	my %env = (%ENV, $env ? %$env : ());
 	while (my ($k, $v) = each %env) {
 		push @env, "$k=$v" if defined($v);
 	}
-	my $redir = [];
 	for my $child_fd (0..2) {
-		my $parent_fd = $opts->{$child_fd};
-		if (defined($parent_fd) && $parent_fd !~ /\A[0-9]+\z/) {
-			my $fd = fileno($parent_fd) //
-					die "$parent_fd not an IO GLOB? $!";
-			$parent_fd = $fd;
+		my $pfd = $opt->{$child_fd};
+		if ('SCALAR' eq ref($pfd)) {
+			open my $fh, '+>:utf8', undef;
+			$opt->{"fh.$child_fd"} = $fh;
+			if ($child_fd == 0) {
+				print $fh $$pfd;
+				$fh->flush or die "flush: $!";
+				sysseek($fh, 0, SEEK_SET);
+			}
+			$pfd = fileno($fh);
+		} elsif (defined($pfd) && $pfd !~ /\A[0-9]+\z/) {
+			my $fd = fileno($pfd) //
+					die "$pfd not an IO GLOB? $!";
+			$pfd = $fd;
 		}
-		$redir->[$child_fd] = $parent_fd // $child_fd;
+		$rdr[$child_fd] = $pfd // $child_fd;
 	}
 	my $rlim = [];
 	foreach my $l (@RLIMITS) {
-		my $v = $opts->{$l} // next;
+		my $v = $opt->{$l} // next;
 		my $r = eval "require BSD::Resource; BSD::Resource::$l();";
 		unless (defined $r) {
 			warn "$l undefined by BSD::Resource: $@\n";
@@ -359,31 +367,41 @@ sub spawn ($;$$) {
 		}
 		push @$rlim, $r, @$v;
 	}
-	my $cd = $opts->{'-C'} // ''; # undef => NULL mapping doesn't work?
-	my $pgid = $opts->{pgid} // -1;
-	my $pid = pi_fork_exec($redir, $f, $cmd, \@env, $rlim, $cd, $pgid);
+	my $cd = $opt->{'-C'} // ''; # undef => NULL mapping doesn't work?
+	my $pgid = $opt->{pgid} // -1;
+	my $pid = pi_fork_exec(\@rdr, $f, $cmd, \@env, $rlim, $cd, $pgid);
 	die "fork_exec @$cmd failed: $!\n" unless $pid > 0;
 	$pid;
 }
 
 sub popen_rd {
 	my ($cmd, $env, $opt, @cb_arg) = @_;
-	pipe(my $r, local $opt->{1}) or die "pipe: $!\n";
+	pipe(my $r, local $opt->{1});
 	my $pid = spawn($cmd, $env, $opt);
 	PublicInbox::ProcessIO->maybe_new($pid, $r, @cb_arg);
 }
 
 sub popen_wr {
 	my ($cmd, $env, $opt, @cb_arg) = @_;
-	pipe(local $opt->{0}, my $w) or die "pipe: $!\n";
+	pipe(local $opt->{0}, my $w);
 	$w->autoflush(1);
 	my $pid = spawn($cmd, $env, $opt);
 	PublicInbox::ProcessIO->maybe_new($pid, $w, @cb_arg)
 }
 
+sub read_out_err ($) {
+	my ($opt) = @_;
+	for my $fd (1, 2) { # read stdout/stderr
+		my $fh = delete($opt->{"fh.$fd"}) // next;
+		seek($fh, 0, SEEK_SET);
+		read($fh, ${$opt->{$fd}}, -s $fh, length(${$opt->{$fd}} // ''));
+	}
+}
+
 sub run_wait ($;$$) {
 	my ($cmd, $env, $opt) = @_;
 	waitpid(spawn($cmd, $env, $opt), 0);
+	read_out_err($opt);
 	$?
 }
 
@@ -392,4 +410,19 @@ sub run_die ($;$$) {
 	run_wait($cmd, $env, $rdr) and croak "E: @$cmd failed: \$?=$?";
 }
 
+sub run_qx {
+	my ($cmd, $env, $opt) = @_;
+	my $fh = popen_rd($cmd, $env, $opt);
+	my @ret;
+	if (wantarray) {
+		@ret = <$fh>;
+	} else {
+		local $/;
+		$ret[0] = <$fh>;
+	}
+	close $fh; # caller should check $?
+	read_out_err($opt);
+	wantarray ? @ret : $ret[0];
+}
+
 1;
diff --git a/t/spawn.t b/t/spawn.t
index 1af66bda..4b3baae4 100644
--- a/t/spawn.t
+++ b/t/spawn.t
@@ -3,7 +3,7 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use v5.12;
 use Test::More;
-use PublicInbox::Spawn qw(which spawn popen_rd);
+use PublicInbox::Spawn qw(which spawn popen_rd run_qx);
 require PublicInbox::Sigfd;
 require PublicInbox::DS;
 
@@ -19,6 +19,17 @@ require PublicInbox::DS;
 	is($?, 0, 'true exited successfully');
 }
 
+{
+	my $opt = { 0 => \'in', 2 => \(my $e) };
+	my $out = run_qx(['sh', '-c', 'echo e >&2; cat'], undef, $opt);
+	is($e, "e\n", 'captured stderr');
+	is($out, 'in', 'stdin read and stdout captured');
+	$opt->{0} = \"IN\n3\nLINES";
+	my @out = run_qx(['sh', '-c', 'echo E >&2; cat'], undef, $opt);
+	is($e, "e\nE\n", 'captured stderr appended to string');
+	is_deeply(\@out, [ "IN\n", "3\n", 'LINES' ], 'stdout array');
+}
+
 SKIP: {
 	my $pid = spawn(['true'], undef, { pgid => 0 });
 	ok($pid, 'spawned process with new pgid');

^ permalink raw reply related	[relevance 5%]

Results 1-2 of 2 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2023-10-25  0:29  7% [PATCH 00/26] process management simplifications Eric Wong
2023-10-25  0:29  5% ` [PATCH 02/26] spawn: support synchronous run_qx Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).