user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
* [PATCH 00/19] lei import Maildir, remote mboxrd fixes
@ 2021-02-07  8:51 Eric Wong
  2021-02-07  8:51 ` [PATCH 01/19] spawn: pi_fork_exec: restore parent sigmask in child Eric Wong
                   ` (19 more replies)
  0 siblings, 20 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

"lei q" with remote mboxrd + early MUA spawning is
nicer, too.  Several risky constructs eliminated,

Interrupting "add-external --mirror" is less bad, now;
though it could probably support indexlevel=none in
case somebody wants to run index themselves.

Eric Wong (19):
  spawn: pi_fork_exec: restore parent sigmask in child
  spawn: pi_fork_exec: support "pgid"
  lei add-external: handle interrupts with --mirror
  spawn_pp: die more consistently in child
  ipc: do not die inside wq_worker child process
  ipc: trim down the Storable checks
  Makefile.PL: depend on IO::Uncompress::Gunzip
  xapcmd: avoid potential die surprise in children
  tests: guard setup_public_inboxes for SQLite and Xapian
  Revert "ipc: add support for asynchronous callbacks"
  ipc: wq_do => wq_io_do
  lei: more consistent IPC exit and error handling
  lei: remove --mua-cmd alias for --mua
  lei: replace --thread with --threads
  lei q: improve remote mboxrd UX
  lei q: SIGWINCH process group with the terminal
  lei import: support Maildirs
  imap: avoid unnecessary delete on stack
  httpd/async: avoid unnecessary on-stack delete

 Documentation/lei-q.pod        |   4 +-
 MANIFEST                       |   1 +
 Makefile.PL                    |   1 +
 lib/PublicInbox/HTTPD/Async.pm |   2 +-
 lib/PublicInbox/IMAP.pm        |   6 +-
 lib/PublicInbox/IPC.pm         | 105 +++++++-----------------
 lib/PublicInbox/LEI.pm         |  49 +++++++----
 lib/PublicInbox/LeiCurl.pm     |  11 ++-
 lib/PublicInbox/LeiHelp.pm     |   6 +-
 lib/PublicInbox/LeiImport.pm   |  38 ++++++---
 lib/PublicInbox/LeiMirror.pm   |  75 ++++++++++-------
 lib/PublicInbox/LeiOverview.pm |   7 +-
 lib/PublicInbox/LeiQuery.pm    |   4 +-
 lib/PublicInbox/LeiStore.pm    |   8 +-
 lib/PublicInbox/LeiToMail.pm   |  37 ++++-----
 lib/PublicInbox/LeiXSearch.pm  | 143 ++++++++++++++++++++-------------
 lib/PublicInbox/Mbox.pm        |   2 +-
 lib/PublicInbox/OnDestroy.pm   |   2 +-
 lib/PublicInbox/Search.pm      |   2 +-
 lib/PublicInbox/SearchView.pm  |   2 +-
 lib/PublicInbox/Spawn.pm       |  63 +++++++++------
 lib/PublicInbox/SpawnPP.pm     |  44 +++++-----
 lib/PublicInbox/Xapcmd.pm      |  11 +--
 script/lei                     |   8 +-
 t/ipc.t                        |  39 ++-------
 t/lei-externals.t              |   2 +
 t/lei-import-maildir.t         |  33 ++++++++
 t/lei-mirror.t                 |  14 ++++
 t/lei.t                        |   2 +-
 t/lei_to_mail.t                |   6 +-
 t/spawn.t                      |  18 +++++
 xt/stress-sharedkv.t           |   6 +-
 32 files changed, 433 insertions(+), 318 deletions(-)
 create mode 100644 t/lei-import-maildir.t


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 01/19] spawn: pi_fork_exec: restore parent sigmask in child
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 02/19] spawn: pi_fork_exec: support "pgid" Eric Wong
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

We continue to unblock SIGCHLD unconditionally, but also
any signals not blocked by the parent (wq_worker).

This will allow Ctrl-C (SIGINT) to stop "git clone" and allow
git-clone cleanup to be performed and other long-running
processes when pi_fork_exec supports setpgid(2).  This won't
affect existing daemons on systems with signalfd(2) or
EVFILT_SIGNAL at all, since those run with signals blocked
anyways.
---
 lib/PublicInbox/Spawn.pm   | 26 ++++++++++++--------------
 lib/PublicInbox/SpawnPP.pm |  6 ++----
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index f7dcb024..bac24dd1 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -82,20 +82,20 @@ int pi_fork_exec(SV *redirref, SV *file, SV *cmdref, SV *envref, SV *rlimref,
 	const char *filename = SvPV_nolen(file);
 	pid_t pid;
 	char **argv, **envp;
-	sigset_t set, old, cset;
+	sigset_t set, old;
 	int ret, perrnum, cerrnum = 0;
+	int chld_is_member;
 
 	AV2C_COPY(argv, cmd);
 	AV2C_COPY(envp, env);
 
-	ret = sigfillset(&set);
-	assert(ret == 0 && "BUG calling sigfillset");
-	ret = sigprocmask(SIG_SETMASK, &set, &old);
-	assert(ret == 0 && "BUG calling sigprocmask to block");
-	ret = sigemptyset(&cset);
-	assert(ret == 0 && "BUG calling sigemptyset");
-	ret = sigaddset(&cset, SIGCHLD);
-	assert(ret == 0 && "BUG calling sigaddset for SIGCHLD");
+	if (sigfillset(&set)) return -1;
+	if (sigprocmask(SIG_SETMASK, &set, &old)) return -1;
+	chld_is_member = sigismember(&old, SIGCHLD);
+	if (chld_is_member < 0) return -1;
+	if (chld_is_member > 0)
+		sigdelset(&old, SIGCHLD);
+
 	pid = vfork();
 	if (pid == 0) {
 		int sig;
@@ -127,15 +127,13 @@ int pi_fork_exec(SV *redirref, SV *file, SV *cmdref, SV *envref, SV *rlimref,
 				exit_err(&cerrnum);
 		}
 
-		/*
-		 * don't bother unblocking other signals for now, just SIGCHLD.
-		 * we don't want signals to the group taking out a subprocess
-		 */
-		(void)sigprocmask(SIG_UNBLOCK, &cset, NULL);
+		(void)sigprocmask(SIG_SETMASK, &old, NULL);
 		execve(filename, argv, envp);
 		exit_err(&cerrnum);
 	}
 	perrnum = errno;
+	if (chld_is_member > 0)
+		sigaddset(&old, SIGCHLD);
 	ret = sigprocmask(SIG_SETMASK, &old, NULL);
 	assert(ret == 0 && "BUG calling sigprocmask to restore");
 	if (cerrnum) {
diff --git a/lib/PublicInbox/SpawnPP.pm b/lib/PublicInbox/SpawnPP.pm
index b0ad4da5..f64b95dc 100644
--- a/lib/PublicInbox/SpawnPP.pm
+++ b/lib/PublicInbox/SpawnPP.pm
@@ -37,10 +37,8 @@ sub pi_fork_exec ($$$$$$) {
 			chdir $cd or die "chdir $cd: $!";
 		}
 		$SIG{$_} = 'DEFAULT' for keys %SIG;
-		my $cset = POSIX::SigSet->new();
-		$cset->addset(POSIX::SIGCHLD) or die "can't add SIGCHLD: $!";
-		sigprocmask(SIG_UNBLOCK, $cset) or
-					die "can't unblock SIGCHLD: $!";
+		$old->delset(POSIX::SIGCHLD) or die "delset SIGCHLD: $!";
+		sigprocmask(SIG_SETMASK, $old) or die "SETMASK: ~SIGCHLD: $!";
 		if ($ENV{MOD_PERL}) {
 			exec which('env'), '-i', @$env, @$cmd;
 			die "exec env -i ... $cmd->[0] failed: $!\n";

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 02/19] spawn: pi_fork_exec: support "pgid"
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
  2021-02-07  8:51 ` [PATCH 01/19] spawn: pi_fork_exec: restore parent sigmask in child Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07 23:10   ` dprintf(3) portability? [was [02/19] spawn: pi_fork_exec: support "pgid"] Eric Wong
  2021-02-07  8:51 ` [PATCH 03/19] lei add-external: handle interrupts with --mirror Eric Wong
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

We'll be using this to allow the "git clone" process hierarchy
to be killed via Ctrl-C.  This also fixes a long-standing bug
in error reporting for the Inline::C version, because we're
actually testing for errors, now!

n.b. strlen(3) is officially async-signal-safe as of
POSIX.1-2016, but I can't think of a reason any previous
implementation prior to that wouldn't be.
---
 lib/PublicInbox/Spawn.pm   | 37 +++++++++++++++++++++++++------------
 lib/PublicInbox/SpawnPP.pm | 24 ++++++++++++++----------
 t/spawn.t                  | 18 ++++++++++++++++++
 3 files changed, 57 insertions(+), 22 deletions(-)

diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index bac24dd1..00e6829e 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -61,9 +61,10 @@ my $all_libc = <<'ALL_LIBC'; # all *nix systems we support
 } while (0)
 
 /* needs to be safe inside a vfork'ed process */
-static void exit_err(int *cerrnum)
+static void exit_err(const char *fn, volatile int *cerrnum)
 {
 	*cerrnum = errno;
+	write(2, fn, strlen(fn));
 	_exit(1);
 }
 
@@ -73,7 +74,7 @@ static void exit_err(int *cerrnum)
  * Be sure to update PublicInbox::SpawnPP if this changes
  */
 int pi_fork_exec(SV *redirref, SV *file, SV *cmdref, SV *envref, SV *rlimref,
-		 const char *cd)
+		 const char *cd, int pgid)
 {
 	AV *redir = (AV *)SvRV(redirref);
 	AV *cmd = (AV *)SvRV(cmdref);
@@ -83,8 +84,10 @@ int pi_fork_exec(SV *redirref, SV *file, SV *cmdref, SV *envref, SV *rlimref,
 	pid_t pid;
 	char **argv, **envp;
 	sigset_t set, old;
-	int ret, perrnum, cerrnum = 0;
+	int ret, perrnum;
+	volatile int cerrnum = 0; /* shared due to vfork */
 	int chld_is_member;
+	I32 max_fd = av_len(redir);
 
 	AV2C_COPY(argv, cmd);
 	AV2C_COPY(envp, env);
@@ -99,23 +102,25 @@ int pi_fork_exec(SV *redirref, SV *file, SV *cmdref, SV *envref, SV *rlimref,
 	pid = vfork();
 	if (pid == 0) {
 		int sig;
-		I32 i, child_fd, max = av_len(redir);
+		I32 i, child_fd, max_rlim;
 
-		for (child_fd = 0; child_fd <= max; child_fd++) {
+		for (child_fd = 0; child_fd <= max_fd; child_fd++) {
 			SV **parent = av_fetch(redir, child_fd, 0);
 			int parent_fd = SvIV(*parent);
 			if (parent_fd == child_fd)
 				continue;
 			if (dup2(parent_fd, child_fd) < 0)
-				exit_err(&cerrnum);
+				exit_err("dup2", &cerrnum);
 		}
+		if (pgid >= 0 && setpgid(0, pgid) < 0)
+			exit_err("setpgid", &cerrnum);
 		for (sig = 1; sig < NSIG; sig++)
 			signal(sig, SIG_DFL); /* ignore errors on signals */
 		if (*cd && chdir(cd) < 0)
-			exit_err(&cerrnum);
+			exit_err("chdir", &cerrnum);
 
-		max = av_len(rlim);
-		for (i = 0; i < max; i += 3) {
+		max_rlim = av_len(rlim);
+		for (i = 0; i < max_rlim; i += 3) {
 			struct rlimit rl;
 			SV **res = av_fetch(rlim, i, 0);
 			SV **soft = av_fetch(rlim, i + 1, 0);
@@ -124,12 +129,12 @@ int pi_fork_exec(SV *redirref, SV *file, SV *cmdref, SV *envref, SV *rlimref,
 			rl.rlim_cur = SvIV(*soft);
 			rl.rlim_max = SvIV(*hard);
 			if (setrlimit(SvIV(*res), &rl) < 0)
-				exit_err(&cerrnum);
+				exit_err("setrlimit", &cerrnum);
 		}
 
 		(void)sigprocmask(SIG_SETMASK, &old, NULL);
 		execve(filename, argv, envp);
-		exit_err(&cerrnum);
+		exit_err("execve", &cerrnum);
 	}
 	perrnum = errno;
 	if (chld_is_member > 0)
@@ -137,9 +142,16 @@ int pi_fork_exec(SV *redirref, SV *file, SV *cmdref, SV *envref, SV *rlimref,
 	ret = sigprocmask(SIG_SETMASK, &old, NULL);
 	assert(ret == 0 && "BUG calling sigprocmask to restore");
 	if (cerrnum) {
+		int err_fd = STDERR_FILENO;
+		if (err_fd <= max_fd) {
+			SV **parent = av_fetch(redir, err_fd, 0);
+			err_fd = SvIV(*parent);
+		}
 		if (pid > 0)
 			waitpid(pid, NULL, 0);
 		pid = -1;
+		/* continue message started by exit_err in child */
+		dprintf(err_fd, ": %s\n", strerror(cerrnum));
 		errno = cerrnum;
 	} else if (perrnum) {
 		errno = perrnum;
@@ -373,7 +385,8 @@ sub spawn ($;$$) {
 		push @$rlim, $r, @$v;
 	}
 	my $cd = $opts->{'-C'} // ''; # undef => NULL mapping doesn't work?
-	my $pid = pi_fork_exec($redir, $f, $cmd, \@env, $rlim, $cd);
+	my $pgid = $opts->{pgid} // -1;
+	my $pid = pi_fork_exec($redir, $f, $cmd, \@env, $rlim, $cd, $pgid);
 	die "fork_exec @$cmd failed: $!\n" unless $pid > 0;
 	$pid;
 }
diff --git a/lib/PublicInbox/SpawnPP.pm b/lib/PublicInbox/SpawnPP.pm
index f64b95dc..401cb78d 100644
--- a/lib/PublicInbox/SpawnPP.pm
+++ b/lib/PublicInbox/SpawnPP.pm
@@ -5,12 +5,12 @@
 # of vfork, so no speedups under Linux for spawning from large processes.
 package PublicInbox::SpawnPP;
 use strict;
-use warnings;
-use POSIX qw(dup2 :signal_h);
+use v5.10.1;
+use POSIX qw(dup2 _exit setpgid :signal_h);
 
 # Pure Perl implementation for folks that do not use Inline::C
-sub pi_fork_exec ($$$$$$) {
-	my ($redir, $f, $cmd, $env, $rlim, $cd) = @_;
+sub pi_fork_exec ($$$$$$$) {
+	my ($redir, $f, $cmd, $env, $rlim, $cd, $pgid) = @_;
 	my $old = POSIX::SigSet->new();
 	my $set = POSIX::SigSet->new();
 	$set->fillset or die "fillset failed: $!";
@@ -22,21 +22,25 @@ sub pi_fork_exec ($$$$$$) {
 		$pid = -1;
 	}
 	if ($pid == 0) {
-		while (@$rlim) {
-			my ($r, $soft, $hard) = splice(@$rlim, 0, 3);
-			BSD::Resource::setrlimit($r, $soft, $hard) or
-			  warn "failed to set $r=[$soft,$hard]\n";
-		}
 		for my $child_fd (0..$#$redir) {
 			my $parent_fd = $redir->[$child_fd];
 			next if $parent_fd == $child_fd;
 			dup2($parent_fd, $child_fd) or
 				die "dup2($parent_fd, $child_fd): $!\n";
 		}
+		if ($pgid >= 0 && !defined(setpgid(0, $pgid))) {
+			warn "setpgid: $!";
+			_exit(1);
+		}
+		$SIG{$_} = 'DEFAULT' for keys %SIG;
 		if ($cd ne '') {
 			chdir $cd or die "chdir $cd: $!";
 		}
-		$SIG{$_} = 'DEFAULT' for keys %SIG;
+		while (@$rlim) {
+			my ($r, $soft, $hard) = splice(@$rlim, 0, 3);
+			BSD::Resource::setrlimit($r, $soft, $hard) or
+			  warn "failed to set $r=[$soft,$hard]\n";
+		}
 		$old->delset(POSIX::SIGCHLD) or die "delset SIGCHLD: $!";
 		sigprocmask(SIG_SETMASK, $old) or die "SETMASK: ~SIGCHLD: $!";
 		if ($ENV{MOD_PERL}) {
diff --git a/t/spawn.t b/t/spawn.t
index 6f811ec1..a17b72d9 100644
--- a/t/spawn.t
+++ b/t/spawn.t
@@ -18,6 +18,24 @@ use PublicInbox::Sigfd;
 	is($?, 0, 'true exited successfully');
 }
 
+SKIP: {
+	my $pid = spawn(['true'], undef, { pgid => 0 });
+	ok($pid, 'spawned process with new pgid');
+	is(waitpid($pid, 0), $pid, 'waitpid succeeds on spawned process');
+	is($?, 0, 'true exited successfully');
+	pipe(my ($r, $w)) or BAIL_OUT;
+	$pid = eval { spawn(['true'], undef, { pgid => 1, 2 => $w }) };
+	close $w;
+	my $err = do { local $/; <$r> };
+	# diag "$err ($@)";
+	if (defined $pid) {
+		waitpid($pid, 0) if defined $pid;
+		isnt($?, 0, 'child error (pure-Perl)');
+	} else {
+		ok($@, 'exception raised');
+	}
+}
+
 { # ensure waitpid(-1, 0) and SIGCHLD works in spawned process
 	my $script = <<'EOF';
 $| = 1; # unbuffer stdout

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 03/19] lei add-external: handle interrupts with --mirror
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
  2021-02-07  8:51 ` [PATCH 01/19] spawn: pi_fork_exec: restore parent sigmask in child Eric Wong
  2021-02-07  8:51 ` [PATCH 02/19] spawn: pi_fork_exec: support "pgid" Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 04/19] spawn_pp: die more consistently in child Eric Wong
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

This also updates lei_xsearch to follow the same pattern for
stopping curl(1) and tail(1) processes it spawns.
---
 lib/PublicInbox/IPC.pm        |  5 +--
 lib/PublicInbox/LEI.pm        |  6 ++++
 lib/PublicInbox/LeiMirror.pm  | 66 +++++++++++++++++++++++------------
 lib/PublicInbox/LeiXSearch.pm | 21 +++++------
 lib/PublicInbox/OnDestroy.pm  |  2 +-
 t/lei-mirror.t                | 12 +++++++
 6 files changed, 74 insertions(+), 38 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 0dee2a92..b936c27a 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -150,9 +150,10 @@ sub ipc_worker_reap { # dwaitpid callback
 }
 
 sub wq_wait_old {
-	my ($self, $args) = @_;
+	my ($self, @args) = @_;
+	my $cb = ref($args[0]) eq 'CODE' ? shift(@args) : \&ipc_worker_reap;
 	my $pids = delete $self->{"-wq_old_pids.$$"} or return;
-	dwaitpid($_, \&ipc_worker_reap, [$self, $args]) for @$pids;
+	dwaitpid($_, $cb, [$self, @args]) for @$pids;
 }
 
 # for base class, override in sub classes
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 3098ade7..515bc2a3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -370,6 +370,12 @@ sub sigpipe_handler { # handles SIGPIPE from @WQ_KEYS workers
 	fail_handler($_[0], 13, delete $_[0]->{1});
 }
 
+# PublicInbox::OnDestroy callback for SIGINT to take out the entire pgid
+sub sigint_reap {
+	my ($pgid) = @_;
+	dwaitpid($pgid) if kill('-INT', $pgid);
+}
+
 sub fail ($$;$) {
 	my ($self, $buf, $exit_code) = @_;
 	err($self, $buf) if defined $buf;
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index bb172e6a..13795a58 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -10,13 +10,19 @@ use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
 use PublicInbox::Spawn qw(popen_rd spawn);
 use PublicInbox::PktOp;
 
+sub do_finish_mirror { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($mrr, $lei) = @$arg;
+	if ($? == 0 && unlink("$mrr->{dst}/mirror.done")) {
+		$lei->add_external_finish($mrr->{dst});
+	}
+	$lei->dclose;
+}
+
 sub mirror_done { # EOF callback for main daemon
 	my ($lei) = @_;
-	my $mrr = delete $lei->{mrr};
-	$mrr->wq_wait_old($lei) if $mrr;
-	# FIXME: check $? before finish
-	$lei->add_external_finish($mrr->{dst});
-	$lei->dclose;
+	my $mrr = delete $lei->{mrr} or return;
+	$mrr->wq_wait_old(\&do_finish_mirror, $lei);
 }
 
 # for old installations without manifest.js.gz
@@ -59,8 +65,9 @@ E: confused by scraping <$uri>, got ambiguous results:
 }
 
 sub clone_cmd {
-	my ($lei) = @_;
+	my ($lei, $opt) = @_;
 	my @cmd = qw(git);
+	$opt->{$_} = $lei->{$_} for (0..2);
 	# we support "-c $key=$val" for arbitrary git config options
 	# e.g.: git -c http.proxy=socks5h://127.0.0.1:9050
 	push(@cmd, '-c', $_) for @{$lei->{opt}->{c} // []};
@@ -92,14 +99,12 @@ sub _try_config {
 	my $f = "$ce-$$.tmp";
 	open(my $fh, '+>', $f) or return $lei->err("open $f: $! (non-fatal)");
 	my $opt = { 0 => $lei->{0}, 1 => $fh, 2 => $lei->{2} };
-	$lei->qerr("# @$cmd");
-	my $pid = spawn($cmd, $lei->{env}, $opt);
-	waitpid($pid, 0) == $pid or return $lei->err("waitpid @$cmd: $!");
-	if (($? >> 8) == 22) { # 404 missing
+	my $cerr = run_reap($lei, $cmd, $opt) // return;
+	if (($cerr >> 8) == 22) { # 404 missing
 		unlink($f) if -s $fh == 0;
 		return;
 	}
-	return $lei->err("# @$cmd failed (non-fatal)") if $?;
+	return $lei->err("# @$cmd failed (non-fatal)") if $cerr;
 	rename($f, $ce) or return $lei->err("link($f, $ce): $! (non-fatal)");
 	my $cfg = PublicInbox::Config::git_config_dump($f);
 	my $ibx = $self->{ibx} = {};
@@ -132,6 +137,18 @@ sub index_cloned_inbox {
 	local %ENV = (%ENV, %$env) if $env;
 	PublicInbox::Admin::progress_prepare($opt, $lei->{2});
 	PublicInbox::Admin::index_inbox($ibx, undef, $opt);
+	open my $x, '>', "$self->{dst}/mirror.done"; # for do_finish_mirror
+}
+
+sub run_reap {
+	my ($lei, $cmd, $opt) = @_;
+	$lei->qerr("# @$cmd");
+	$opt->{pgid} = 0;
+	my $pid = spawn($cmd, $lei->{env}, $opt);
+	my $reap = PublicInbox::OnDestroy->new($lei->can('sigint_reap'), $pid);
+	my $err = waitpid($pid, 0) == $pid ? undef : "waitpid @$cmd: $!";
+	@$reap = (); # cancel reap
+	$err ? $lei->err($err) : $?
 }
 
 sub clone_v1 {
@@ -140,11 +157,10 @@ sub clone_v1 {
 	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
 	my $uri = URI->new($self->{src});
 	my $pfx = $curl->torsocks($lei, $uri) or return;
-	my $cmd = [ @$pfx, clone_cmd($lei), $uri->as_string, $self->{dst} ];
-	$lei->qerr("# @$cmd");
-	my $pid = spawn($cmd, $lei->{env}, $lei);
-	waitpid($pid, 0) == $pid or die "BUG: waitpid @$cmd: $!";
-	$? == 0 or return $lei->child_error($?, "@$cmd failed");
+	my $cmd = [ @$pfx, clone_cmd($lei, my $opt = {}),
+			$uri->as_string, $self->{dst} ];
+	my $cerr = run_reap($lei, $cmd, $opt) // return;
+	return $lei->child_error($cerr, "@$cmd failed") if $cerr;
 	_try_config($self);
 	index_cloned_inbox($self, 1);
 }
@@ -170,13 +186,11 @@ failed to extract epoch number from $src
 	my $lk = bless { lock_path => "$dst/inbox.lock" }, 'PublicInbox::Lock';
 	_try_config($self);
 	my $on_destroy = $lk->lock_for_scope($$);
-	my @cmd = clone_cmd($lei);
+	my @cmd = clone_cmd($lei, my $opt = {});
 	while (my $pair = shift(@src_edst)) {
 		my $cmd = [ @$pfx, @cmd, @$pair ];
-		$lei->qerr("# @$cmd");
-		my $pid = spawn($cmd, $lei->{env}, $lei);
-		waitpid($pid, 0) == $pid or die "BUG: waitpid @$cmd: $!";
-		$? == 0 or return $lei->child_error($?, "@$cmd failed");
+		my $cerr = run_reap($lei, $cmd, $opt) // return;
+		return $lei->child_error($cerr, "@$cmd failed") if $cerr;
 	}
 	undef $on_destroy; # unlock
 	index_cloned_inbox($self, 2);
@@ -193,9 +207,14 @@ sub try_manifest {
 	my $cmd = $curl->for_uri($lei, $uri);
 	$lei->qerr("# @$cmd");
 	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
-	my $fh = popen_rd($cmd, $lei->{env}, $opt);
+	my ($fh, $pid) = popen_rd($cmd, $lei->{env}, $opt);
+	my $reap = PublicInbox::OnDestroy->new($lei->can('sigint_reap'), $pid);
 	my $gz = do { local $/; <$fh> } // die "read(curl $uri): $!";
-	unless (close $fh) {
+	close $fh;
+	my $err = waitpid($pid, 0) == $pid ? undef : "waitpid @$cmd: $!";
+	@$reap = ();
+	return $lei->err($err) if $err;
+	if ($?) {
 		return try_scrape($self) if ($? >> 8) == 22; # 404 missing
 		return $lei->child_error($?, "@$cmd failed");
 	}
@@ -282,6 +301,7 @@ sub start {
 sub ipc_atfork_child {
 	my ($self) = @_;
 	$self->{lei}->lei_atfork_child;
+	$SIG{TERM} = sub { exit(128 + 15) }; # trigger OnDestroy $reap
 	$self->SUPER::ipc_atfork_child;
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1e5d7ca6..6a1b107b 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -197,13 +197,6 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$each_smsg->($smsg, undef, $eml);
 }
 
-# PublicInbox::OnDestroy callback
-sub kill_reap {
-	my ($pid) = @_;
-	kill('KILL', $pid); # spawn() blocks other signals
-	waitpid($pid, 0);
-}
-
 sub query_remote_mboxrd {
 	my ($self, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
@@ -213,18 +206,19 @@ sub query_remote_mboxrd {
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{thread};
 	my $verbose = $opt->{verbose};
-	my $reap;
+	my ($reap_tail, $reap_curl);
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
 	fcntl($cerr, F_SETFL, O_APPEND|O_RDWR) or warn "set O_APPEND: $!";
-	my $rdr = { 2 => $cerr };
+	my $rdr = { 2 => $cerr, pgid => 0 };
 	my $coff = 0;
+	my $sigint_reap = $lei->can('sigint_reap');
 	if ($verbose) {
 		# spawn a process to force line-buffering, otherwise curl
 		# will write 1 character at-a-time and parallel outputs
 		# mmmaaayyy llloookkk llliiikkkeee ttthhhiiisss
-		my $o = { 1 => $lei->{2}, 2 => $lei->{2} };
+		my $o = { 1 => $lei->{2}, 2 => $lei->{2}, pgid => 0 };
 		my $pid = spawn(['tail', '-f', $cerr->filename], undef, $o);
-		$reap = PublicInbox::OnDestroy->new(\&kill_reap, $pid);
+		$reap_tail = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 	}
 	my $curl = PublicInbox::LeiCurl->new($lei, $self->{curl}) or return;
 	push @$curl, '-s', '-d', '';
@@ -236,10 +230,13 @@ sub query_remote_mboxrd {
 		my $cmd = $curl->for_uri($lei, $uri);
 		$lei->err("# @$cmd") if $verbose;
 		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
+		$reap_curl = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 		$fh = IO::Uncompress::Gunzip->new($fh);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
 						$lei, $each_smsg);
-		waitpid($pid, 0) == $pid or die "BUG: waitpid (curl): $!";
+		my $err = waitpid($pid, 0) == $pid ? undef : "BUG: waitpid: $!";
+		@$reap_curl = (); # cancel OnDestroy
+		die $err if $err;
 		if ($? == 0) {
 			my $nr = $lei->{-nr_remote_eml};
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
diff --git a/lib/PublicInbox/OnDestroy.pm b/lib/PublicInbox/OnDestroy.pm
index 0ae4c4c9..615bc450 100644
--- a/lib/PublicInbox/OnDestroy.pm
+++ b/lib/PublicInbox/OnDestroy.pm
@@ -10,7 +10,7 @@ sub new {
 
 sub DESTROY {
 	my ($cb, @args) = @{$_[0]};
-	if (!ref($cb)) {
+	if (!ref($cb) && $cb) {
 		my $pid = $cb;
 		return if $pid != $$;
 		$cb = shift @args;
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index 6af49678..2373b370 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -13,15 +13,27 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	my $t1 = "$home/t1-mirror";
 	ok($lei->('add-external', $t1, '--mirror', "$http/t1/"), '--mirror v1');
 	ok(-f "$t1/public-inbox/msgmap.sqlite3", 't1-mirror indexed');
+
+	ok($lei->('ls-external'), 'ls-external');
+	like($lei_out, qr!\Q$t1\E!, 't1 added to ls-externals');
+
 	my $t2 = "$home/t2-mirror";
 	ok($lei->('add-external', $t2, '--mirror', "$http/t2/"), '--mirror v2');
 	ok(-f "$t2/msgmap.sqlite3", 't2-mirror indexed');
 
+	ok($lei->('ls-external'), 'ls-external');
+	like($lei_out, qr!\Q$t2\E!, 't2 added to ls-externals');
+
 	ok(!$lei->('add-external', $t2, '--mirror', "$http/t2/"),
 		'--mirror fails if reused');
 
+	ok($lei->('ls-external'), 'ls-external');
+	like($lei_out, qr!\Q$t2\E!, 'still in ls-externals');
+
 	ok(!$lei->('add-external', "$t2-fail", '-Lmedium'), '--mirror v2');
 	ok(!-d "$t2-fail", 'destination not created on failure');
+	ok($lei->('ls-external'), 'ls-external');
+	unlike($lei_out, qr!\Q$t2-fail\E!, 'not added to ls-external');
 });
 
 ok($td->kill, 'killed -httpd');

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 04/19] spawn_pp: die more consistently in child
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 03/19] lei add-external: handle interrupts with --mirror Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 05/19] ipc: do not die inside wq_worker child process Eric Wong
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

The default $SIG{__DIE__} inside a forked child doesn't actually
do what we want it to do.  We don't want it to zip up the stack
the parent used, but instead want to exit the child process
after warning.
---
 lib/PublicInbox/SpawnPP.pm | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/SpawnPP.pm b/lib/PublicInbox/SpawnPP.pm
index 401cb78d..2c5edef6 100644
--- a/lib/PublicInbox/SpawnPP.pm
+++ b/lib/PublicInbox/SpawnPP.pm
@@ -22,15 +22,15 @@ sub pi_fork_exec ($$$$$$$) {
 		$pid = -1;
 	}
 	if ($pid == 0) {
+		$SIG{__DIE__} = sub { warn @_; _exit 1 };
 		for my $child_fd (0..$#$redir) {
 			my $parent_fd = $redir->[$child_fd];
 			next if $parent_fd == $child_fd;
 			dup2($parent_fd, $child_fd) or
-				die "dup2($parent_fd, $child_fd): $!\n";
+				die "dup2($parent_fd, $child_fd): $!";
 		}
 		if ($pgid >= 0 && !defined(setpgid(0, $pgid))) {
-			warn "setpgid: $!";
-			_exit(1);
+			die "setpgid(0, $pgid): $!";
 		}
 		$SIG{$_} = 'DEFAULT' for keys %SIG;
 		if ($cd ne '') {
@@ -39,20 +39,18 @@ sub pi_fork_exec ($$$$$$$) {
 		while (@$rlim) {
 			my ($r, $soft, $hard) = splice(@$rlim, 0, 3);
 			BSD::Resource::setrlimit($r, $soft, $hard) or
-			  warn "failed to set $r=[$soft,$hard]\n";
+				die "setrlimit($r=[$soft,$hard]: $!)";
 		}
 		$old->delset(POSIX::SIGCHLD) or die "delset SIGCHLD: $!";
 		sigprocmask(SIG_SETMASK, $old) or die "SETMASK: ~SIGCHLD: $!";
+		$cmd->[0] = $f;
 		if ($ENV{MOD_PERL}) {
-			exec which('env'), '-i', @$env, @$cmd;
-			die "exec env -i ... $cmd->[0] failed: $!\n";
+			@$cmd = (which('env'), '-i', @$env, @$cmd);
 		} else {
-			local %ENV = map { split(/=/, $_, 2) } @$env;
-			my @cmd = @$cmd;
-			$cmd[0] = $f;
-			exec @cmd;
-			die "exec $cmd->[0] failed: $!\n";
+			%ENV = map { split(/=/, $_, 2) } @$env;
 		}
+		exec @$cmd;
+		die "exec @$cmd failed: $!";
 	}
 	sigprocmask(SIG_SETMASK, $old) or die "can't unblock signals: $!";
 	$! = $syserr;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 05/19] ipc: do not die inside wq_worker child process
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 04/19] spawn_pp: die more consistently in child Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 06/19] ipc: trim down the Storable checks Eric Wong
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

die() in a child zips up the stack into the parent, which is
undesirable behavior.  We're going to exit anyways, just warn
and let exit(1) happen due to $@ being set.
---
 lib/PublicInbox/IPC.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index b936c27a..8f6f4ded 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -129,7 +129,7 @@ sub ipc_worker_spawn {
 			local %SIG = %SIG;
 			ipc_worker_loop($self, $r_req, $w_res);
 		};
-		die "worker $ident PID:$$ died: $@\n" if $@;
+		warn "worker $ident PID:$$ died: $@\n" if $@;
 		undef $end; # trigger exit
 	}
 	PublicInbox::DS::sig_setmask($sigset) unless $oldset;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 06/19] ipc: trim down the Storable checks
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 05/19] ipc: do not die inside wq_worker child process Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 07/19] Makefile.PL: depend on IO::Uncompress::Gunzip Eric Wong
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

It's distributed with Perl and our Makefile.PL even declares a
dependency on it, just like Encode and all the Compress::*
stuff.
---
 lib/PublicInbox/IPC.pm | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 8f6f4ded..3713b56b 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -38,12 +38,9 @@ if ($enc && $dec) { # should be custom ops
 	*ipc_freeze = sub ($) { sereal_encode_with_object $enc, $_[0] };
 	*ipc_thaw = sub ($) { sereal_decode_with_object $dec, $_[0], my $ret };
 } else {
-	eval { # some distros have Storable as a separate package from Perl
-		require Storable;
-		*ipc_freeze = \&Storable::freeze;
-		*ipc_thaw = \&Storable::thaw;
-		$enc = 1;
-	} // warn("Storable (part of Perl) missing: $@\n");
+	require Storable;
+	*ipc_freeze = \&Storable::freeze;
+	*ipc_thaw = \&Storable::thaw;
 }
 
 my $recv_cmd = PublicInbox::Spawn->can('recv_cmd4');
@@ -102,7 +99,6 @@ sub ipc_worker_loop ($$$) {
 # starts a worker if Sereal or Storable is installed
 sub ipc_worker_spawn {
 	my ($self, $ident, $oldset, $fields) = @_;
-	return unless $enc; # no Sereal or Storable
 	return if ($self->{-ipc_ppid} // -1) == $$; # idempotent
 	delete(@$self{qw(-ipc_req -ipc_res -ipc_ppid -ipc_pid)});
 	pipe(my ($r_req, $w_req)) or die "pipe: $!";
@@ -364,7 +360,7 @@ sub _wq_worker_start ($$$) {
 # starts workqueue workers if Sereal or Storable is installed
 sub wq_workers_start {
 	my ($self, $ident, $nr_workers, $oldset, $fields) = @_;
-	($enc && $send_cmd && $recv_cmd && defined($SEQPACKET)) or return;
+	($send_cmd && $recv_cmd && defined($SEQPACKET)) or return;
 	return if $self->{-wq_s1}; # idempotent
 	$self->{-wq_s1} = $self->{-wq_s2} = undef;
 	socketpair($self->{-wq_s1}, $self->{-wq_s2}, AF_UNIX, $SEQPACKET, 0) or

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 07/19] Makefile.PL: depend on IO::Uncompress::Gunzip
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 06/19] ipc: trim down the Storable checks Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 08/19] xapcmd: avoid potential die surprise in children Eric Wong
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

It's another part of the Perl standard library and rarely
split out from Perl (though we can't depend on that fact).
---
 Makefile.PL | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile.PL b/Makefile.PL
index 68545573..ca8e45cf 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -153,6 +153,7 @@ WriteMakefile(
 		'Digest::SHA' => 0, # rpm: perl-Digest-SHA
 		'Encode' => 2.35, # 2.35 shipped with 5.10.1
 		'IO::Compress::Gzip' => 0,
+		'IO::Uncompress::Gunzip' => 0,
 		'Storable' => 0, # rpm: perl-Storable
 
 		# Plack is needed for public-inbox-httpd and PublicInbox::WWW

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 08/19] xapcmd: avoid potential die surprise in children
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (6 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 07/19] Makefile.PL: depend on IO::Uncompress::Gunzip Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 09/19] tests: guard setup_public_inboxes for SQLite and Xapian Eric Wong
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

Make some notes about sub usage, this may be converted
to use workqueues once the cmsg dependency is dropped.
---
 lib/PublicInbox/Xapcmd.pm | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/Xapcmd.pm b/lib/PublicInbox/Xapcmd.pm
index 269aa99a..e2d67f6a 100644
--- a/lib/PublicInbox/Xapcmd.pm
+++ b/lib/PublicInbox/Xapcmd.pm
@@ -9,7 +9,7 @@ use PublicInbox::SearchIdx;
 use File::Temp 0.19 (); # ->newdir
 use File::Path qw(remove_tree);
 use File::Basename qw(dirname);
-use POSIX qw(WNOHANG);
+use POSIX qw(WNOHANG _exit);
 
 # support testing with dev versions of Xapian which installs
 # commands with a version number suffix (e.g. "xapian-compact-1.5")
@@ -93,8 +93,9 @@ sub cb_spawn {
 	my $pid = fork // die "fork: $!";
 	return $pid if $pid > 0;
 	srand($seed);
+	$SIG{__DIE__} = sub { warn @_; _exit(1) }; # don't jump up stack
 	$cb->($args, $opt);
-	POSIX::_exit(0);
+	_exit(0);
 }
 
 sub runnable_or_die ($) {
@@ -237,7 +238,7 @@ sub prepare_run {
 
 sub check_compact () { runnable_or_die($XAPIAN_COMPACT) }
 
-sub _run {
+sub _run { # with_umask callback
 	my ($ibx, $cb, $opt) = @_;
 	my $im = $ibx->importer(0);
 	$im->lock_acquire;
@@ -303,7 +304,7 @@ sub kill_compact { # setup_signals callback
 }
 
 # xapian-compact wrapper
-sub compact ($$) {
+sub compact ($$) { # cb_spawn callback
 	my ($args, $opt) = @_;
 	my ($src, $newdir) = @$args;
 	my $dst = ref($newdir) ? $newdir->dirname : $newdir;
@@ -384,7 +385,7 @@ sub cpdb_loop ($$$;$$) {
 
 # Like copydatabase(1), this is horribly slow; and it doesn't seem due
 # to the overhead of Perl.
-sub cpdb ($$) {
+sub cpdb ($$) { # cb_spawn callback
 	my ($args, $opt) = @_;
 	my ($old, $newdir) = @$args;
 	my $new = $newdir->dirname;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 09/19] tests: guard setup_public_inboxes for SQLite and Xapian
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (7 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 08/19] xapcmd: avoid potential die surprise in children Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 10/19] Revert "ipc: add support for asynchronous callbacks" Eric Wong
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

This will need some work to before it's generally applicable
to the rest of our code base.
---
 t/lei-externals.t | 2 ++
 t/lei-mirror.t    | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/t/lei-externals.t b/t/lei-externals.t
index f2cb09b4..28c01174 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -4,6 +4,8 @@
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 use Fcntl qw(SEEK_SET);
 use PublicInbox::Spawn qw(which);
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
 
 my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
 	http://czquwvybam4bgbro.onion/meta/
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index 2373b370..667284fd 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -2,6 +2,8 @@
 # Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
 my $sock = tcp_server();
 my ($tmpdir, $for_destroy) = tmpdir();
 my $http = 'http://'.$sock->sockhost.':'.$sock->sockport.'/';

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 10/19] Revert "ipc: add support for asynchronous callbacks"
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (8 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 09/19] tests: guard setup_public_inboxes for SQLite and Xapian Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 11/19] ipc: wq_do => wq_io_do Eric Wong
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

This reverts commit a7e6a8cd68fb6d700337d8dbc7ee2c65ff3d2fc1.

It turns out to be unworkable in the face of multiple producer
processes, since the lock we make has no effect when calculating
pipe capacity.
---
 lib/PublicInbox/IPC.pm | 52 +++---------------------------------------
 t/ipc.t                | 25 --------------------
 2 files changed, 3 insertions(+), 74 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 3713b56b..7e5a0b16 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -18,7 +18,6 @@ use PublicInbox::OnDestroy;
 use PublicInbox::WQWorker;
 use Socket qw(AF_UNIX MSG_EOR SOCK_STREAM);
 my $SEQPACKET = eval { Socket::SOCK_SEQPACKET() }; # portable enough?
-use constant PIPE_BUF => $^O eq 'linux' ? 4096 : POSIX::_POSIX_PIPE_BUF();
 our @EXPORT_OK = qw(ipc_freeze ipc_thaw);
 my $WQ_MAX_WORKERS = 4096;
 my ($enc, $dec);
@@ -59,15 +58,10 @@ sub _get_rec ($) {
 	ipc_thaw($buf);
 }
 
-sub _pack_rec ($) {
-	my ($ref) = @_;
-	my $buf = ipc_freeze($ref);
-	length($buf) . "\n" . $buf;
-}
-
 sub _send_rec ($$) {
 	my ($w, $ref) = @_;
-	print $w _pack_rec($ref) or croak "print: $!";
+	my $buf = ipc_freeze($ref);
+	print $w length($buf), "\n", $buf or croak "print: $!";
 }
 
 sub ipc_return ($$$) {
@@ -188,21 +182,6 @@ sub ipc_lock_init {
 	$self->{-ipc_lock} //= bless { lock_path => $f }, 'PublicInbox::Lock'
 }
 
-sub ipc_async_wait ($$) {
-	my ($self, $max) = @_; # max == -1 to wait for all
-	my $aif = $self->{-async_inflight} or return;
-	my $r_res = $self->{-ipc_res} or die 'BUG: no ipc_res';
-	while (my ($sub, $bytes, $cb, $cb_arg) = splice(@$aif, 0, 4)) {
-		my $ret = _get_rec($r_res) //
-			die "no response on $sub (req.size=$bytes)";
-		$self->{-async_inflight_bytes} -= $bytes;
-
-		eval { $cb->($cb_arg, $ret) };
-		warn "E: $sub callback error: $@\n" if $@;
-		return if --$max == 0;
-	}
-}
-
 # call $self->$sub(@args), on a worker if ipc_worker_spawn was used
 sub ipc_do {
 	my ($self, $sub, @args) = @_;
@@ -210,8 +189,7 @@ sub ipc_do {
 		my $ipc_lock = $self->{-ipc_lock};
 		my $lock = $ipc_lock ? $ipc_lock->lock_for_scope : undef;
 		if (defined(wantarray)) {
-			my $r_res = $self->{-ipc_res} or die 'BUG: no ipc_res';
-			ipc_async_wait($self, -1);
+			my $r_res = $self->{-ipc_res} or die 'no ipc_res';
 			_send_rec($w_req, [ wantarray, $sub, @args ]);
 			my $ret = _get_rec($r_res) // die "no response on $sub";
 			die $$ret if ref($ret) eq 'PublicInbox::IPC::Die';
@@ -224,30 +202,6 @@ sub ipc_do {
 	}
 }
 
-sub ipc_async {
-	my ($self, $sub, $sub_args, $cb, $cb_arg) = @_;
-	if (my $w_req = $self->{-ipc_req}) { # run in worker
-		my $rec = _pack_rec([ 1, $sub, @$sub_args ]);
-		my $cur_bytes = \($self->{-async_inflight_bytes} //= 0);
-		while (($$cur_bytes + length($rec)) > PIPE_BUF) {
-			ipc_async_wait($self, 1);
-		}
-		my $ipc_lock = $self->{-ipc_lock};
-		my $lock = $ipc_lock ? $ipc_lock->lock_for_scope : undef;
-		print $w_req $rec or croak "print: $!";
-		$$cur_bytes += length($rec);
-		push @{$self->{-async_inflight}},
-				$sub, length($rec), $cb, $cb_arg;
-	} else {
-		my $ret = [ eval { $self->$sub(@$sub_args) } ];
-		if (my $exc = $@) {
-			$ret = ( bless(\$exc, 'PublicInbox::IPC::Die') );
-		}
-		eval { $cb->($cb_arg, $ret) };
-		warn "E: $sub callback error: $@\n" if $@;
-	}
-}
-
 # needed when there's multiple IPC workers and the parent forking
 # causes newer siblings to inherit older siblings sockets
 sub ipc_sibling_atfork_child {
diff --git a/t/ipc.t b/t/ipc.t
index 5801c760..face5726 100644
--- a/t/ipc.t
+++ b/t/ipc.t
@@ -37,7 +37,6 @@ my $ipc = bless {}, 'PublicInbox::IPC';
 my @t = qw(array scalar scalarref undef);
 my $test = sub {
 	my $x = shift;
-	my @res;
 	for my $type (@t) {
 		my $m = "test_$type";
 		my @ret = $ipc->ipc_do($m);
@@ -46,34 +45,10 @@ my $test = sub {
 
 		$ipc->ipc_do($m);
 
-		$ipc->ipc_async($m, [], sub { push @res, \@_ }, \$m);
-
 		my $ret = $ipc->ipc_do($m);
 		my $exp = $ipc->$m;
 		is_deeply($ret, $exp, "!wantarray $m $x");
-
-		is_deeply(\@res, [ [ \$m, \@exp ] ], "async $m $x");
-		@res = ();
 	}
-	$ipc->ipc_async_wait(-1);
-	is_deeply(\@res, [], 'no leftover results');
-	$ipc->ipc_async('test_die', ['die test'],
-			sub { push @res, \@_  }, 'die arg');
-	$ipc->ipc_async_wait(1);
-	is(scalar(@res), 1, 'only one result');
-	is(scalar(@{$res[0]}), 2, 'result has 2-element array');
-	is($res[0]->[0], 'die arg', 'got async die arg '.$x);
-	is(ref($res[0]->[1]), 'PublicInbox::IPC::Die',
-		"exception type $x");
-	{
-		my $nr = PublicInbox::IPC::PIPE_BUF();
-		my $count = 0;
-		my $cb = sub { ++$count };
-		$ipc->ipc_async('test_undef', [], $cb) for (1..$nr);
-		$ipc->ipc_async_wait(-1);
-		is($count, $nr, "$x async runs w/o deadlock");
-	}
-
 	my $ret = eval { $ipc->test_die('phail') };
 	my $exp = $@;
 	$ret = eval { $ipc->ipc_do('test_die', 'phail') };

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 11/19] ipc: wq_do => wq_io_do
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (9 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 10/19] Revert "ipc: add support for asynchronous callbacks" Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 12/19] lei: more consistent IPC exit and error handling Eric Wong
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

We will have a ->wq_do that doesn't pass FDs for I/O.
---
 lib/PublicInbox/IPC.pm         | 12 ++++++------
 lib/PublicInbox/LeiImport.pm   |  4 ++--
 lib/PublicInbox/LeiMirror.pm   |  4 ++--
 lib/PublicInbox/LeiOverview.pm |  4 ++--
 lib/PublicInbox/LeiToMail.pm   |  2 +-
 lib/PublicInbox/LeiXSearch.pm  | 10 +++++-----
 t/ipc.t                        | 14 +++++++-------
 xt/stress-sharedkv.t           |  6 +++---
 8 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 7e5a0b16..728f726c 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -3,10 +3,10 @@
 
 # base class for remote IPC calls and workqueues, requires Storable or Sereal
 # - ipc_do and ipc_worker_* is for a single worker/producer and uses pipes
-# - wq_do and wq_worker* is for a single producer and multiple workers,
+# - wq_io_do and wq_worker* is for a single producer and multiple workers,
 #   using SOCK_SEQPACKET for work distribution
 # use ipc_do when you need work done on a certain process
-# use wq_do when your work can be done on any idle worker
+# use wq_io_do when your work can be done on any idle worker
 package PublicInbox::IPC;
 use strict;
 use v5.10.1;
@@ -248,12 +248,12 @@ sub wq_worker_loop ($) {
 	PublicInbox::DS->Reset;
 }
 
-sub do_sock_stream { # via wq_do, for big requests
+sub do_sock_stream { # via wq_io_do, for big requests
 	my ($self, $len) = @_;
 	recv_and_run($self, delete $self->{0}, $len, 1);
 }
 
-sub wq_do { # always async
+sub wq_io_do { # always async
 	my ($self, $sub, $ios, @args) = @_;
 	if (my $s1 = $self->{-wq_s1}) { # run in worker
 		my $fds = [ map { fileno($_) } @$ios ];
@@ -278,7 +278,7 @@ sub wq_do { # always async
 	} else {
 		@$self{0..$#$ios} = @$ios;
 		eval { $self->$sub(@args) };
-		warn "wq_do: $@" if $@;
+		warn "wq_io_do: $@" if $@;
 		delete @$self{0..$#$ios}; # don't close
 	}
 }
@@ -349,7 +349,7 @@ sub wq_worker_decr { # SIGTTOU handler, kills first idle worker
 	my ($self) = @_;
 	return unless wq_workers($self);
 	my $s2 = $self->{-wq_s2} // die 'BUG: no wq_s2';
-	$self->wq_do('wq_exit', [ $s2, $s2, $s2 ]);
+	$self->wq_io_do('wq_exit', [ $s2, $s2, $s2 ]);
 	# caller must call wq_worker_decr_wait in main loop
 }
 
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 2c7cbf2b..3a99570e 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -44,9 +44,9 @@ sub call { # the main "lei import" method
 	$self->wq_workers_start('lei_import', $j, $lei->oldset, {lei => $lei});
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
-	$self->wq_do('import_stdin', []) if $self->{0};
+	$self->wq_io_do('import_stdin', []) if $self->{0};
 	for my $x (@argv) {
-		$self->wq_do('import_path_url', [], $x);
+		$self->wq_io_do('import_path_url', [], $x);
 	}
 	$self->wq_close(1);
 	$lei->event_step_init; # wait for shutdowns
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 13795a58..5ba69287 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -251,7 +251,7 @@ sub start_clone_url {
 	die "TODO: non-HTTP/HTTPS clone of $self->{src} not supported, yet";
 }
 
-sub do_mirror { # via wq_do
+sub do_mirror { # via wq_io_do
 	my ($self) = @_;
 	my $lei = $self->{lei};
 	eval {
@@ -290,7 +290,7 @@ sub start {
 	$self->wq_workers_start('lei_mirror', 1, $lei->oldset, {lei => $lei});
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
-	$self->wq_do('do_mirror', []);
+	$self->wq_io_do('do_mirror', []);
 	$self->wq_close(1);
 	$lei->event_step_init; # wait for shutdowns
 	if ($lei->{oneshot}) {
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 24e4c190..dcfb9cc7 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -23,7 +23,7 @@ my $JSONL = 'ldjson|ndjson|jsonl'; # 3 names for the same thing
 
 sub _iso8601 ($) { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) }
 
-# we open this in the parent process before ->wq_do handoff
+# we open this in the parent process before ->wq_io_do handoff
 sub ovv_out_lk_init ($) {
 	my ($self) = @_;
 	my $tmp = File::Temp->new("lei-ovv.dst.$$.lock-XXXXXX",
@@ -205,7 +205,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		sub {
 			my ($smsg, $mitem) = @_;
 			$smsg->{pct} = get_pct($mitem) if $mitem;
-			$l2m->wq_do('write_mail', [], $git_dir, $smsg);
+			$l2m->wq_io_do('write_mail', [], $git_dir, $smsg);
 		}
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 4f847221..3f65e9e9 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -488,7 +488,7 @@ sub poke_dst {
 	}
 }
 
-sub write_mail { # via ->wq_do
+sub write_mail { # via ->wq_io_do
 	my ($self, $git_dir, $smsg) = @_;
 	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
 	git_async_cat($git, $smsg->{blob}, \&git_to_mail,
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 6a1b107b..1ba767c1 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -333,10 +333,10 @@ sub start_query { # always runs in main (lei-daemon) process
 	}
 	if ($lei->{opt}->{thread}) {
 		for my $ibxish (locals($self)) {
-			$self->wq_do('query_thread_mset', [], $ibxish);
+			$self->wq_io_do('query_thread_mset', [], $ibxish);
 		}
 	} elsif (locals($self)) {
-		$self->wq_do('query_mset', []);
+		$self->wq_io_do('query_mset', []);
 	}
 	my $i = 0;
 	my $q = [];
@@ -344,7 +344,7 @@ sub start_query { # always runs in main (lei-daemon) process
 		push @{$q->[$i++ % $MAX_PER_HOST]}, $uri;
 	}
 	for my $uris (@$q) {
-		$self->wq_do('query_remote_mboxrd', [], $uris);
+		$self->wq_io_do('query_remote_mboxrd', [], $uris);
 	}
 }
 
@@ -354,7 +354,7 @@ sub ipc_atfork_child {
 	$self->SUPER::ipc_atfork_child;
 }
 
-sub query_prepare { # called by wq_do
+sub query_prepare { # called by wq_io_do
 	my ($self) = @_;
 	local $0 = "$0 query_prepare";
 	my $lei = $self->{lei};
@@ -398,7 +398,7 @@ sub do_query {
 	delete $lei->{pkt_op_p};
 	$l2m->wq_close(1) if $l2m;
 	$lei->event_step_init; # wait for shutdowns
-	$self->wq_do('query_prepare', []) if $l2m;
+	$self->wq_io_do('query_prepare', []) if $l2m;
 	start_query($self, $lei);
 	$self->wq_close(1); # lei_xsearch workers stop when done
 	if ($lei->{oneshot}) {
diff --git a/t/ipc.t b/t/ipc.t
index face5726..345024bd 100644
--- a/t/ipc.t
+++ b/t/ipc.t
@@ -106,7 +106,7 @@ my $big = do { local $/; <$agpl> } // BAIL_OUT "read: $!";
 close $agpl or BAIL_OUT "close: $!";
 
 for my $t ('local', 'worker', 'worker again') {
-	$ipc->wq_do('test_write_each_fd', [ $wa, $wb, $wc ], 'hello world');
+	$ipc->wq_io_do('test_write_each_fd', [ $wa, $wb, $wc ], 'hello world');
 	my $i = 0;
 	for my $fh ($ra, $rb, $rc) {
 		my $buf = readline($fh);
@@ -114,12 +114,12 @@ for my $t ('local', 'worker', 'worker again') {
 		like($buf, qr/\Ai=$i \d+ hello world\z/, "got expected ($t)");
 		$i++;
 	}
-	$ipc->wq_do('test_die', [ $wa, $wb, $wc ]);
-	$ipc->wq_do('test_sha', [ $wa, $wb ], 'hello world');
+	$ipc->wq_io_do('test_die', [ $wa, $wb, $wc ]);
+	$ipc->wq_io_do('test_sha', [ $wa, $wb ], 'hello world');
 	is(readline($rb), sha1_hex('hello world')."\n", "SHA small ($t)");
 	{
 		my $bigger = $big x 10;
-		$ipc->wq_do('test_sha', [ $wa, $wb ], $bigger);
+		$ipc->wq_io_do('test_sha', [ $wa, $wb ], $bigger);
 		my $exp = sha1_hex($bigger)."\n";
 		undef $bigger;
 		is(readline($rb), $exp, "SHA big ($t)");
@@ -128,7 +128,7 @@ for my $t ('local', 'worker', 'worker again') {
 	push(@ppids, $ppid);
 }
 
-# wq_do works across fork (siblings can feed)
+# wq_io_do works across fork (siblings can feed)
 SKIP: {
 	skip 'Socket::MsgHdr or Inline::C missing', 3 if !$ppids[0];
 	is_deeply(\@ppids, [$$, undef, undef],
@@ -136,7 +136,7 @@ SKIP: {
 	my $pid = fork // BAIL_OUT $!;
 	if ($pid == 0) {
 		use POSIX qw(_exit);
-		$ipc->wq_do('test_write_each_fd', [ $wa, $wb, $wc ], $$);
+		$ipc->wq_io_do('test_write_each_fd', [ $wa, $wb, $wc ], $$);
 		_exit(0);
 	} else {
 		my $i = 0;
@@ -160,7 +160,7 @@ SKIP: {
 	seek($warn, 0, SEEK_SET) or BAIL_OUT;
 	my @warn = <$warn>;
 	is(scalar(@warn), 3, 'warned 3 times');
-	like($warn[0], qr/ wq_do: /, '1st warned from wq_do');
+	like($warn[0], qr/ wq_io_do: /, '1st warned from wq_do');
 	like($warn[1], qr/ wq_worker: /, '2nd warned from wq_worker');
 	is($warn[2], $warn[1], 'worker did not die');
 
diff --git a/xt/stress-sharedkv.t b/xt/stress-sharedkv.t
index 70de9ffc..1773d4bc 100644
--- a/xt/stress-sharedkv.t
+++ b/xt/stress-sharedkv.t
@@ -15,14 +15,14 @@ my $nr = $ENV{TEST_STRESS_NR} // 100_000;
 my $ios = [];
 my $t = timeit(1, sub {
 	for my $i (1..$nr) {
-		$ipc->wq_do('test_set_maybe', $ios, $skv, $i);
-		$ipc->wq_do('test_set_maybe', $ios, $skv, $i);
+		$ipc->wq_io_do('test_set_maybe', $ios, $skv, $i);
+		$ipc->wq_io_do('test_set_maybe', $ios, $skv, $i);
 	}
 });
 diag "$nr sets done ".timestr($t);
 
 for my $w ($ipc->wq_workers) {
-	$ipc->wq_do('test_skv_done', $ios);
+	$ipc->wq_io_do('test_skv_done', $ios);
 }
 diag "done requested";
 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 12/19] lei: more consistent IPC exit and error handling
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (10 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 11/19] ipc: wq_do => wq_io_do Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 13/19] lei: remove --mua-cmd alias for --mua Eric Wong
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

We're able to propagate $? from wq_workers in a consistent
manner, now.
---
 lib/PublicInbox/IPC.pm        | 22 +++++++++++-----------
 lib/PublicInbox/LEI.pm        |  6 +++---
 lib/PublicInbox/LeiImport.pm  | 14 ++++++++++----
 lib/PublicInbox/LeiXSearch.pm | 12 +++++++++---
 4 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 728f726c..c8673e26 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -140,10 +140,9 @@ sub ipc_worker_reap { # dwaitpid callback
 }
 
 sub wq_wait_old {
-	my ($self, @args) = @_;
-	my $cb = ref($args[0]) eq 'CODE' ? shift(@args) : \&ipc_worker_reap;
+	my ($self, $cb, @args) = @_;
 	my $pids = delete $self->{"-wq_old_pids.$$"} or return;
-	dwaitpid($_, $cb, [$self, @args]) for @$pids;
+	dwaitpid($_, $cb // \&ipc_worker_reap, [$self, @args]) for @$pids;
 }
 
 # for base class, override in sub classes
@@ -348,13 +347,12 @@ sub wq_exit { # wakes up wq_worker_decr_wait
 sub wq_worker_decr { # SIGTTOU handler, kills first idle worker
 	my ($self) = @_;
 	return unless wq_workers($self);
-	my $s2 = $self->{-wq_s2} // die 'BUG: no wq_s2';
-	$self->wq_io_do('wq_exit', [ $s2, $s2, $s2 ]);
+	$self->wq_io_do('wq_exit');
 	# caller must call wq_worker_decr_wait in main loop
 }
 
 sub wq_worker_decr_wait {
-	my ($self, $timeout) = @_;
+	my ($self, $timeout, $cb, @args) = @_;
 	return if $self->{-wq_ppid} != $$; # can't reap siblings or parents
 	my $s1 = $self->{-wq_s1} // croak 'BUG: no wq_s1';
 	vec(my $rin = '', fileno($s1), 1) = 1;
@@ -363,17 +361,17 @@ sub wq_worker_decr_wait {
 	recv($s1, my $pid, 64, 0) // croak "recv: $!";
 	my $workers = $self->{-wq_workers} // croak 'BUG: no wq_workers';
 	delete $workers->{$pid} // croak "BUG: PID:$pid invalid";
-	dwaitpid($pid, \&ipc_worker_reap, $self);
+	dwaitpid($pid, $cb // \&ipc_worker_reap, [ $self, @args ]);
 }
 
 # set or retrieve number of workers
 sub wq_workers {
-	my ($self, $nr) = @_;
+	my ($self, $nr, $cb, @args) = @_;
 	my $cur = $self->{-wq_workers} or return;
 	if (defined $nr) {
 		while (scalar(keys(%$cur)) > $nr) {
 			$self->wq_worker_decr;
-			$self->wq_worker_decr_wait;
+			$self->wq_worker_decr_wait(undef, $cb, @args);
 		}
 		$self->wq_worker_incr while scalar(keys(%$cur)) < $nr;
 	}
@@ -381,7 +379,7 @@ sub wq_workers {
 }
 
 sub wq_close {
-	my ($self, $nohang) = @_;
+	my ($self, $nohang, $cb, @args) = @_;
 	delete @$self{qw(-wq_s1 -wq_s2)} or return;
 	my $ppid = delete $self->{-wq_ppid} or return;
 	my $workers = delete $self->{-wq_workers} // die 'BUG: no wq_workers';
@@ -390,7 +388,9 @@ sub wq_close {
 	if ($nohang) {
 		push @{$self->{"-wq_old_pids.$$"}}, @pids;
 	} else {
-		dwaitpid($_, \&ipc_worker_reap, $self) for @pids;
+		$cb //= \&ipc_worker_reap;
+		unshift @args, $self;
+		dwaitpid($_, $cb, \@args) for @pids;
 	}
 }
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 515bc2a3..21862488 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -360,7 +360,7 @@ sub fail_handler ($;$$) {
 	my ($lei, $code, $io) = @_;
 	for my $f (@WQ_KEYS) {
 		my $wq = delete $lei->{$f} or next;
-		$wq->wq_wait_old($lei) if $wq->wq_kill_old; # lei-daemon
+		$wq->wq_wait_old(undef, $lei) if $wq->wq_kill_old; # lei-daemon
 	}
 	close($io) if $io; # needed to avoid warnings on SIGPIPE
 	$lei->x_it($code // (1 >> 8));
@@ -827,9 +827,9 @@ sub dclose {
 	for my $f (@WQ_KEYS) {
 		my $wq = delete $self->{$f} or next;
 		if ($wq->wq_kill) {
-			$wq->wq_close
+			$wq->wq_close(0, undef, $self);
 		} elsif ($wq->wq_kill_old) {
-			$wq->wq_wait_old($self);
+			$wq->wq_wait_old(undef, $self);
 		}
 	}
 	close(delete $self->{1}) if $self->{1}; # may reap_compress
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 3a99570e..2b2dc2f7 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -14,12 +14,18 @@ sub _import_eml { # MboxReader callback
 	$sto->ipc_do('set_eml', $eml, $set_kw ? $sto->mbox_keywords($eml) : ());
 }
 
+sub import_done_wait { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($imp, $lei) = @$arg;
+	$lei->child_error($?, 'non-fatal errors during import') if $?;
+	my $ign = $lei->{sto}->ipc_do('done'); # PublicInbox::LeiStore::done
+	$lei->dclose;
+}
+
 sub import_done { # EOF callback for main daemon
 	my ($lei) = @_;
-	my $imp = delete $lei->{imp};
-	$imp->wq_wait_old($lei) if $imp;
-	my $wait = $lei->{sto}->ipc_do('done');
-	$lei->dclose;
+	my $imp = delete $lei->{imp} or return;
+	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
 sub call { # the main "lei import" method
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1ba767c1..1024b020 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -279,12 +279,18 @@ sub git_tmp ($) {
 	$git;
 }
 
+sub xsearch_done_wait { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($wq, $lei) = @$arg;
+	$lei->child_error($?, 'non-fatal error from '.ref($wq)) if $?;
+}
+
 sub query_done { # EOF callback for main daemon
 	my ($lei) = @_;
 	my $l2m = delete $lei->{l2m};
-	$l2m->wq_wait_old($lei) if $l2m;
+	$l2m->wq_wait_old(\&xsearch_done_wait, $lei) if $l2m;
 	if (my $lxs = delete $lei->{lxs}) {
-		$lxs->wq_wait_old($lei);
+		$lxs->wq_wait_old(\&xsearch_done_wait, $lei);
 	}
 	$lei->{ovv}->ovv_end($lei);
 	if ($l2m) { # close() calls LeiToMail reap_compress
@@ -309,7 +315,7 @@ sub do_post_augment {
 	if (my $err = $@) {
 		if (my $lxs = delete $lei->{lxs}) {
 			$lxs->wq_kill;
-			$lxs->wq_close;
+			$lxs->wq_close(0, undef, $lei);
 		}
 		$lei->fail("$err");
 	}

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 13/19] lei: remove --mua-cmd alias for --mua
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (11 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 12/19] lei: more consistent IPC exit and error handling Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 14/19] lei: replace --thread with --threads Eric Wong
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

While "mua-cmd" may be more accurate, nobody is expected
to type 4 extra characters.  It's a needless ambiguity
with no precedence or prior art to follow.

Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
---
 Documentation/lei-q.pod    | 2 +-
 lib/PublicInbox/LEI.pm     | 6 +++---
 lib/PublicInbox/LeiHelp.pm | 2 +-
 t/lei.t                    | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 5c0ca843..07c742d2 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -36,7 +36,7 @@ Pretty print C<json> or C<concatjson> output.  If stdout is opened to
 a tty and used as the C<--output> destination, C<--pretty> is enabled
 by default.
 
-=item --mua-cmd=COMMAND, --mua=COMMAND
+=item --mua=COMMAND
 
 A command to run on C<--output> Maildir or mbox (e.g., C<mutt -f %f>).
 For a subset of MUAs known to accept a mailbox via C<-f>, COMMAND can
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 21862488..818f2cfb 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -112,7 +112,7 @@ our %CMD = ( # sorted in order of importance/use:
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
-	mua-cmd|mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
+	mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
@@ -232,7 +232,7 @@ my %OPTDESC = (
 'output|mfolder|o=s' => [ 'MFOLDER',
 	"destination (e.g.\xa0`/path/to/Maildir', ".
 	"or\xa0`-'\x{a0}for\x{a0}stdout)" ],
-'mua-cmd|mua=s' => [ 'CMD',
+'mua=s' => [ 'CMD',
 	"MUA to run on --output Maildir or mbox (e.g.\xa0`mutt\xa0-f\xa0%f')" ],
 
 'show	format|f=s' => [ 'OUT|plain|raw|html|mboxrd|mboxcl2|mboxcl',
@@ -723,7 +723,7 @@ sub exec_buf ($$) {
 
 sub start_mua {
 	my ($self) = @_;
-	my $mua = $self->{opt}->{'mua-cmd'} // return;
+	my $mua = $self->{opt}->{mua} // return;
 	my $mfolder = $self->{ovv}->{dst};
 	my (@cmd, $replaced);
 	if ($mua =~ /\A(?:mutt|mailx|mail|neomutt)\z/) {
diff --git a/lib/PublicInbox/LeiHelp.pm b/lib/PublicInbox/LeiHelp.pm
index 43414ab4..e62298f7 100644
--- a/lib/PublicInbox/LeiHelp.pm
+++ b/lib/PublicInbox/LeiHelp.pm
@@ -7,7 +7,7 @@ use strict;
 use v5.10.1;
 use Text::Wrap qw(wrap);
 
-my %NOHELP = map { $_ => 1 } qw(mua-cmd mfolder);
+my %NOHELP = map { $_ => 1 } qw(mfolder);
 
 sub call {
 	my ($self, $errmsg, $CMD, $OPTDESC) = @_;
diff --git a/t/lei.t b/t/lei.t
index f789f63a..8e771eb5 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -104,7 +104,7 @@ my $test_completion = sub {
 	ok($lei->(qw(_complete lei q)), 'complete q (no args)');
 	%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	for my $sw (qw(-f --format -o --output --mfolder --augment -a
-			--mua --mua-cmd --no-local --local --verbose -v
+			--mua --no-local --local --verbose -v
 			--save-as --no-remote --remote --torsocks
 			--reverse -r )) {
 		ok($out{$sw}, "$sw offered as `lei q' completion");

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 14/19] lei: replace --thread with --threads
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (12 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 13/19] lei: remove --mua-cmd alias for --mua Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 15/19] lei q: improve remote mboxrd UX Eric Wong
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

Nobody is expected to use long options, but for consistency
with mairix(1), we'll use the pluralized option throughout
(including existing PublicInbox::{Search,SearchView}).

Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
---
 Documentation/lei-q.pod       |  2 +-
 lib/PublicInbox/LEI.pm        | 16 ++++++++--------
 lib/PublicInbox/LeiHelp.pm    |  4 ++--
 lib/PublicInbox/LeiQuery.pm   |  4 ++--
 lib/PublicInbox/LeiXSearch.pm | 12 ++++++------
 lib/PublicInbox/Mbox.pm       |  2 +-
 lib/PublicInbox/Search.pm     |  2 +-
 lib/PublicInbox/SearchView.pm |  2 +-
 8 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 07c742d2..8f053a55 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -47,7 +47,7 @@ or C<neomutt>.
 
 Augment output destination instead of clobbering it.
 
-=item -t, --thread
+=item -t, --threads
 
 Return all messages in the same thread as the actual match(es).
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 818f2cfb..31e6b4a8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -109,14 +109,14 @@ sub index_opt {
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms', qw(
-	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
+	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
 	mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
-	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
+	qw(type=s solve! format|f=s dedupe|d=s threads|t remote local!),
 	pass_through('git show') ],
 
 'add-external' => [ 'LOCATION',
@@ -135,9 +135,9 @@ our %CMD = ( # sorted in order of importance/use:
 'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
 'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search' ],
 
-'plonk' => [ '--thread|--from=IDENT',
-	'exclude mail matching From: or thread from non-Message-ID searches',
-	qw(stdin| thread|t from|f=s mid=s oid=s) ],
+'plonk' => [ '--threads|--from=IDENT',
+	'exclude mail matching From: or threads from non-Message-ID searches',
+	qw(stdin| threads|t from|f=s mid=s oid=s) ],
 'mark' => [ 'MESSAGE_FLAGS...',
 	'set/unset keywords on message(s) from stdin',
 	qw(stdin| oid=s exact by-mid|mid:s) ],
@@ -224,9 +224,9 @@ my %OPTDESC = (
 
 'dedupe|d=s' => ['STRATEGY|content|oid|mid|none',
 		'deduplication strategy'],
-'show	thread|t' => 'display entire thread a message belongs to',
-'q	thread|t' =>
-	'return all messages in the same thread as the actual match(es)',
+'show	threads|t' => 'display entire thread a message belongs to',
+'q	threads|t' =>
+	'return all messages in the same threads as the actual match(es)',
 'augment|a' => 'augment --output destination instead of clobbering',
 
 'output|mfolder|o=s' => [ 'MFOLDER',
diff --git a/lib/PublicInbox/LeiHelp.pm b/lib/PublicInbox/LeiHelp.pm
index e62298f7..a654e1c2 100644
--- a/lib/PublicInbox/LeiHelp.pm
+++ b/lib/PublicInbox/LeiHelp.pm
@@ -40,7 +40,7 @@ sub call {
 			@vals = (' [', undef, ']');
 		} elsif ($x =~ s/=.+//) { # required arg: $x = "type=s"
 			@vals = (' ', undef);
-		} # else: no args $x = 'thread|t'
+		} # else: no args $x = 'threads|t'
 
 		# we support underscore options from public-inbox-* commands;
 		# but they've never been documented and will likely go away.
@@ -48,7 +48,7 @@ sub call {
 		for (grep { !/_/ && !$NOHELP{$_} } split(/\|/, $x)) {
 			length($_) > 1 ? push(@l, "--$_") : push(@s, "-$_");
 		}
-		if (!scalar(@vals)) { # no args 'thread|t'
+		if (!scalar(@vals)) { # no args 'threads|t'
 		} elsif ($arg_vals =~ s/\A([A-Z_]+)\b//) { # "NAME"
 			$vals[1] = $1;
 		} else {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 0346498f..9a6fa718 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -81,7 +81,7 @@ sub lei_q {
 	$self->{l2m}->{jobs} = ($mj // $nproc) if $self->{l2m};
 	PublicInbox::LeiOverview->new($self) or return;
 
-	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
+	my %mset_opt = map { $_ => $opt->{$_} } qw(threads limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
 	$mset_opt{limit} //= 10000;
 	if (defined(my $sort = $opt->{'sort'})) {
@@ -96,7 +96,7 @@ sub lei_q {
 		}
 	}
 	# descending docid order
-	$mset_opt{relevance} //= -2 if $opt->{thread};
+	$mset_opt{relevance} //= -2 if $opt->{threads};
 	$self->{mset_opt} = \%mset_opt;
 
 	if ($opt->{stdin}) {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1024b020..2794140a 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -118,7 +118,7 @@ sub mset_progress {
 	}
 }
 
-sub query_thread_mset { # for --thread
+sub query_thread_mset { # for --threads
 	my ($self, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
 	my $lei = $self->{lei};
@@ -151,7 +151,7 @@ sub query_thread_mset { # for --thread
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
-sub query_mset { # non-parallel for non-"--thread" users
+sub query_mset { # non-parallel for non-"--threads" users
 	my ($self) = @_;
 	local $0 = "$0 query_mset";
 	my $lei = $self->{lei};
@@ -204,7 +204,7 @@ sub query_remote_mboxrd {
 	my $lei = $self->{lei};
 	my ($opt, $env) = @$lei{qw(opt env)};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
-	push(@qform, t => 1) if $opt->{thread};
+	push(@qform, t => 1) if $opt->{threads};
 	my $verbose = $opt->{verbose};
 	my ($reap_tail, $reap_curl);
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
@@ -326,7 +326,7 @@ my $MAX_PER_HOST = 4;
 
 sub concurrency {
 	my ($self, $opt) = @_;
-	my $nl = $opt->{thread} ? locals($self) : 1;
+	my $nl = $opt->{threads} ? locals($self) : 1;
 	my $nr = remotes($self);
 	$nr = $MAX_PER_HOST if $nr > $MAX_PER_HOST;
 	$nl + $nr;
@@ -337,7 +337,7 @@ sub start_query { # always runs in main (lei-daemon) process
 	if (my $l2m = $lei->{l2m}) {
 		$lei->start_mua if $l2m->lock_free;
 	}
-	if ($lei->{opt}->{thread}) {
+	if ($lei->{opt}->{threads}) {
 		for my $ibxish (locals($self)) {
 			$self->wq_io_do('query_thread_mset', [], $ibxish);
 		}
@@ -393,7 +393,7 @@ sub do_query {
 		# 1031: F_SETPIPE_SZ
 		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
 	}
-	if (!$lei->{opt}->{thread} && locals($self)) { # for query_mset
+	if (!$lei->{opt}->{threads} && locals($self)) { # for query_mset
 		# lei->{git_tmp} is set for wq_wait_old so we don't
 		# delete until all lei2mail + lei_xsearch workers are reaped
 		$lei->{git_tmp} = $self->{git_tmp} = git_tmp($self);
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 964147fa..1fca356b 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -236,7 +236,7 @@ sub mbox_all {
 		return PublicInbox::WWW::need($ctx, 'Overview');
 
 	my $qopts = $ctx->{qopts} = { relevance => -1 }; # ORDER BY docid ASC
-	$qopts->{thread} = 1 if $q->{t};
+	$qopts->{threads} = 1 if $q->{t};
 	my $mset = $srch->mset($q_string, $qopts);
 	$qopts->{offset} = $mset->size or
 			return [404, [qw(Content-Type text/plain)],
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 7c6a16be..dbae3bc5 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -336,7 +336,7 @@ sub _enquire_once { # retry_reopen callback
 	}
 
 	# `mairix -t / --threads' or JMAP collapseThreads
-	if ($opts->{thread} && has_threadid($self)) {
+	if ($opts->{threads} && has_threadid($self)) {
 		$enquire->set_collapse_key(THREADID);
 	}
 	$enquire->get_mset($opts->{offset} || 0, $opts->{limit} || 50);
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index d50d3cf6..08c77f35 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -48,7 +48,7 @@ sub sres_top_html {
 		limit => $q->{l},
 		offset => $o,
 		relevance => $q->{r},
-		thread => $q->{t},
+		threads => $q->{t},
 		asc => $asc,
 	};
 	my ($mset, $total, $err, $html);

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 15/19] lei q: improve remote mboxrd UX
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (13 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 14/19] lei: replace --thread with --threads Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  9:32   ` [PATCH 20/19] lei_xsearch: allow quieting regular mset progress, too Eric Wong
  2021-02-07  8:51 ` [PATCH 16/19] lei q: SIGWINCH process group with the terminal Eric Wong
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

For early MUA spawners using lock-free outputs, we we need to
on the startq pipe to silence progress reporting.  For
--augment users, we can start the MUA even earlier by
creating Maildirs in the pre-augment phase.

To improve progress reporting for non-MUA (or late-MUA)
spawners, we'll no longer blindly append "--compressed" to the
curl(1) command when POST-ing for the gzipped mboxrd.
Furthermore, we'll overload stringify ('""') in LeiCurl to
ensure the empty -d '' string shows up properly.
---
 lib/PublicInbox/IPC.pm         |  8 ++--
 lib/PublicInbox/LEI.pm         |  4 +-
 lib/PublicInbox/LeiCurl.pm     | 11 +++--
 lib/PublicInbox/LeiMirror.pm   |  5 +-
 lib/PublicInbox/LeiOverview.pm |  3 +-
 lib/PublicInbox/LeiToMail.pm   | 24 +++++-----
 lib/PublicInbox/LeiXSearch.pm  | 87 ++++++++++++++++++++++------------
 7 files changed, 88 insertions(+), 54 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index c8673e26..9331233a 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -109,7 +109,6 @@ sub ipc_worker_spawn {
 		$w_res->autoflush(1);
 		$SIG{$_} = 'IGNORE' for (qw(TERM INT QUIT));
 		local $0 = $ident;
-		PublicInbox::DS::sig_setmask($sigset);
 		# ensure we properly exit even if warn() dies:
 		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		eval {
@@ -117,6 +116,7 @@ sub ipc_worker_spawn {
 			local @$self{keys %$fields} = values(%$fields);
 			my $on_destroy = $self->ipc_atfork_child;
 			local %SIG = %SIG;
+			PublicInbox::DS::sig_setmask($sigset);
 			ipc_worker_loop($self, $r_req, $w_res);
 		};
 		warn "worker $ident PID:$$ died: $@\n" if $@;
@@ -293,7 +293,6 @@ sub _wq_worker_start ($$$) {
 		$SIG{$_} = 'IGNORE' for (qw(PIPE));
 		$SIG{$_} = 'DEFAULT' for (qw(TTOU TTIN TERM QUIT INT CHLD));
 		local $0 = $self->{-wq_ident};
-		PublicInbox::DS::sig_setmask($oldset);
 		# ensure we properly exit even if warn() dies:
 		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		eval {
@@ -301,6 +300,7 @@ sub _wq_worker_start ($$$) {
 			local @$self{keys %$fields} = values(%$fields);
 			my $on_destroy = $self->ipc_atfork_child;
 			local %SIG = %SIG;
+			PublicInbox::DS::sig_setmask($oldset);
 			wq_worker_loop($self);
 		};
 		warn "worker $self->{-wq_ident} PID:$$ died: $@" if $@;
@@ -395,9 +395,9 @@ sub wq_close {
 }
 
 sub wq_kill_old {
-	my ($self) = @_;
+	my ($self, $sig) = @_;
 	my $pids = $self->{"-wq_old_pids.$$"} or return;
-	kill 'TERM', @$pids;
+	kill($sig // 'TERM', @$pids);
 }
 
 sub wq_kill {
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 31e6b4a8..e52154e5 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -741,7 +741,9 @@ sub start_mua {
 	} elsif ($self->{oneshot}) {
 		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
 	}
-	delete $self->{-progress};
+	if ($self->{lxs} && $self->{au_done}) { # kick wait_startq
+		syswrite($self->{au_done}, 'q' x ($self->{lxs}->{jobs} // 0));
+	}
 }
 
 # caller needs to "-t $self->{1}" to check if tty
diff --git a/lib/PublicInbox/LeiCurl.pm b/lib/PublicInbox/LeiCurl.pm
index 38b17c78..f346a1b4 100644
--- a/lib/PublicInbox/LeiCurl.pm
+++ b/lib/PublicInbox/LeiCurl.pm
@@ -8,6 +8,12 @@ use v5.10.1;
 use PublicInbox::Spawn qw(which);
 use PublicInbox::Config;
 
+# Ensures empty strings are quoted, we don't need more
+# sophisticated quoting than for empty strings: curl -d ''
+use overload '""' => sub {
+	join(' ', map { $_ eq '' ?  "''" : $_ } @{$_[0]});
+};
+
 my %lei2curl = (
 	'curl-config=s@' => 'config|K=s@',
 );
@@ -63,10 +69,9 @@ EOM
 
 # completes the result of cmd() for $uri
 sub for_uri {
-	my ($self, $lei, $uri) = @_;
+	my ($self, $lei, $uri, @opt) = @_;
 	my $pfx = torsocks($self, $lei, $uri) or return; # error
-	[ @$pfx, @$self, substr($uri->path, -3) eq '.gz' ? () : '--compressed',
-		$uri->as_string ]
+	bless [ @$pfx, @$self, @opt, $uri->as_string ], ref($self);
 }
 
 1;
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 5ba69287..c5153148 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -31,7 +31,7 @@ sub try_scrape {
 	my $uri = URI->new($self->{src});
 	my $lei = $self->{lei};
 	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
-	my $cmd = $curl->for_uri($lei, $uri);
+	my $cmd = $curl->for_uri($lei, $uri, '--compressed');
 	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
 	my $fh = popen_rd($cmd, $lei->{env}, $opt);
 	my $html = do { local $/; <$fh> } // die "read(curl $uri): $!";
@@ -93,8 +93,7 @@ sub _try_config {
 	my $path = $uri->path;
 	chop($path) eq '/' or die "BUG: $uri not canonicalized";
 	$uri->path($path . '/_/text/config/raw');
-	my $cmd = $self->{curl}->for_uri($lei, $uri);
-	push @$cmd, '--compressed'; # curl decompresses for us
+	my $cmd = $self->{curl}->for_uri($lei, $uri, '--compressed');
 	my $ce = "$dst/inbox.config.example";
 	my $f = "$ce-$$.tmp";
 	open(my $fh, '+>', $f) or return $lei->err("open $f: $! (non-fatal)");
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index dcfb9cc7..f0ac4684 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -95,9 +95,10 @@ sub new {
 		$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
 	} else {
 		# default to the cheapest sort since MUA usually resorts
-		$lei->{opt}->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
+		$opt->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
 		$lei->{l2m} = eval { PublicInbox::LeiToMail->new($lei) };
 		return $lei->fail($@) if $@;
+		$lei->{early_mua} = 1 if $opt->{mua} && $lei->{l2m}->lock_free;
 	}
 	$self;
 }
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 3f65e9e9..857aeb63 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -370,7 +370,17 @@ sub new {
 	$self;
 }
 
-sub _pre_augment_maildir {} # noop
+sub _pre_augment_maildir {
+	my ($self, $lei) = @_;
+	my $dst = $lei->{ovv}->{dst};
+	for my $x (qw(tmp new cur)) {
+		my $d = $dst.$x;
+		next if -d $d;
+		require File::Path;
+		File::Path::mkpath($d);
+		-d $d or die "$d is not a directory";
+	}
+}
 
 sub _do_augment_maildir {
 	my ($self, $lei) = @_;
@@ -387,17 +397,7 @@ sub _do_augment_maildir {
 	}
 }
 
-sub _post_augment_maildir {
-	my ($self, $lei) = @_;
-	my $dst = $lei->{ovv}->{dst};
-	for my $x (qw(tmp new cur)) {
-		my $d = $dst.$x;
-		next if -d $d;
-		require File::Path;
-		File::Path::mkpath($d);
-		-d $d or die "$d is not a directory";
-	}
-}
+sub _post_augment_maildir {} # noop
 
 sub _pre_augment_mbox {
 	my ($self, $lei) = @_;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 2794140a..0e99e4b4 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -101,9 +101,23 @@ sub _mset_more ($$) {
 # $startq will EOF when query_prepare is done augmenting and allow
 # query_mset and query_thread_mset to proceed.
 sub wait_startq ($) {
-	my ($startq) = @_;
-	$_[0] = undef;
-	read($startq, my $query_prepare_done, 1);
+	my ($lei) = @_;
+	my $startq = delete $lei->{startq} or return;
+	while (1) {
+		my $n = sysread($startq, my $query_prepare_done, 1);
+		if (defined $n) {
+			return if $n == 0; # no MUA
+			if ($query_prepare_done eq 'q') {
+				$lei->{opt}->{quiet} = 1;
+				delete $lei->{opt}->{verbose};
+				delete $lei->{-progress};
+			} else {
+				$lei->fail("$$ WTF `$query_prepare_done'");
+			}
+			return;
+		}
+		return $lei->fail("$$ wait_startq: $!") unless $!{EINTR};
+	}
 }
 
 sub mset_progress {
@@ -140,7 +154,7 @@ sub query_thread_mset { # for --threads
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
-				wait_startq($startq) if $startq;
+				wait_startq($lei);
 				my $mitem = delete $n2item{$smsg->{num}};
 				$each_smsg->($smsg, $mitem);
 			}
@@ -155,7 +169,6 @@ sub query_mset { # non-parallel for non-"--threads" users
 	my ($self) = @_;
 	local $0 = "$0 query_mset";
 	my $lei = $self->{lei};
-	my $startq = delete $lei->{startq};
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	for my $loc (locals($self)) {
@@ -168,7 +181,7 @@ sub query_mset { # non-parallel for non-"--threads" users
 				$mset->size, $mset->get_matches_estimated);
 		for my $mitem ($mset->items) {
 			my $smsg = smsg_for($self, $mitem) or next;
-			wait_startq($startq) if $startq;
+			wait_startq($lei);
 			$each_smsg->($smsg, $mitem);
 		}
 	} while (_mset_more($mset, $mo));
@@ -183,7 +196,7 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$smsg->parse_references($eml, mids($eml));
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
 	delete @$smsg{qw(From Subject -ds -ts)};
-	if (my $startq = delete($lei->{startq})) { wait_startq($startq) }
+	wait_startq($lei);
 	if ($lei->{-progress}) {
 		++$lei->{-nr_remote_eml};
 		my $now = now();
@@ -200,6 +213,10 @@ sub each_eml { # callback for MboxReader->mboxrd
 sub query_remote_mboxrd {
 	my ($self, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
+open my $dbg, '>>', '/tmp/dbg'; $dbg->autoflush(1); use Data::Dumper;
+	local $SIG{__WARN__} = sub {
+		print $dbg "$$ @_";
+	};
 	local $SIG{TERM} = sub { exit(0) }; # for DESTROY (File::Temp, $reap)
 	my $lei = $self->{lei};
 	my ($opt, $env) = @$lei{qw(opt env)};
@@ -210,7 +227,6 @@ sub query_remote_mboxrd {
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
 	fcntl($cerr, F_SETFL, O_APPEND|O_RDWR) or warn "set O_APPEND: $!";
 	my $rdr = { 2 => $cerr, pgid => 0 };
-	my $coff = 0;
 	my $sigint_reap = $lei->can('sigint_reap');
 	if ($verbose) {
 		# spawn a process to force line-buffering, otherwise curl
@@ -228,13 +244,14 @@ sub query_remote_mboxrd {
 		$lei->{-nr_remote_eml} = 0;
 		$uri->query_form(@qform);
 		my $cmd = $curl->for_uri($lei, $uri);
-		$lei->err("# @$cmd") if $verbose;
+		$lei->qerr("# $cmd");
 		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
 		$reap_curl = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 		$fh = IO::Uncompress::Gunzip->new($fh);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
 						$lei, $each_smsg);
-		my $err = waitpid($pid, 0) == $pid ? undef : "BUG: waitpid: $!";
+		my $err = waitpid($pid, 0) == $pid ? undef
+						: "BUG: waitpid($cmd): $!";
 		@$reap_curl = (); # cancel OnDestroy
 		die $err if $err;
 		if ($? == 0) {
@@ -242,16 +259,18 @@ sub query_remote_mboxrd {
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
 			next;
 		}
-		seek($cerr, $coff, SEEK_SET) or warn "seek(curl stderr): $!\n";
-		my $e = do { local $/; <$cerr> } //
-				die "read(curl stderr): $!\n";
-		$coff += length($e);
-		truncate($cerr, 0);
-		next if (($? >> 8) == 22 && $e =~ /\b404\b/);
-		$lei->child_error($?);
+		$err = '';
+		if (-s $cerr) {
+			seek($cerr, 0, SEEK_SET) or
+					$lei->err("seek($cmd stderr): $!");
+			$err = do { local $/; <$cerr> } //
+					"read($cmd stderr): $!";
+			truncate($cerr, 0) or
+					$lei->err("truncate($cmd stderr): $!");
+		}
+		next if (($? >> 8) == 22 && $err =~ /\b404\b/);
 		$uri->query_form(q => $lei->{mset_opt}->{qstr});
-		# --verbose already showed the error via tail(1)
-		$lei->err("E: $uri \$?=$?\n", $verbose ? () : $e);
+		$lei->child_error($?, "E: <$uri> $err");
 	}
 	undef $each_smsg;
 	$lei->{ovv}->ovv_atexit_child($lei);
@@ -311,15 +330,23 @@ Error closing $lei->{ovv}->{dst}: $!
 
 sub do_post_augment {
 	my ($lei) = @_;
-	eval { $lei->{l2m}->post_augment($lei) };
-	if (my $err = $@) {
-		if (my $lxs = delete $lei->{lxs}) {
-			$lxs->wq_kill;
-			$lxs->wq_close(0, undef, $lei);
+	my $l2m = $lei->{l2m};
+	my $err;
+	if ($l2m) {
+		eval { $l2m->post_augment($lei) };
+		$err = $@;
+		if ($err) {
+			if (my $lxs = delete $lei->{lxs}) {
+				$lxs->wq_kill;
+				$lxs->wq_close(0, undef, $lei);
+			}
+			$lei->fail("$err");
 		}
-		$lei->fail("$err");
 	}
-	close(delete $lei->{au_done}); # triggers wait_startq
+	if (!$err && delete $lei->{early_mua}) { # non-augment case
+		$lei->start_mua;
+	}
+	close(delete $lei->{au_done}); # triggers wait_startq in lei_xsearch
 }
 
 my $MAX_PER_HOST = 4;
@@ -334,9 +361,6 @@ sub concurrency {
 
 sub start_query { # always runs in main (lei-daemon) process
 	my ($self, $lei) = @_;
-	if (my $l2m = $lei->{l2m}) {
-		$lei->start_mua if $l2m->lock_free;
-	}
 	if ($lei->{opt}->{threads}) {
 		for my $ibxish (locals($self)) {
 			$self->wq_io_do('query_thread_mset', [], $ibxish);
@@ -387,6 +411,9 @@ sub do_query {
 	my $l2m = $lei->{l2m};
 	if ($l2m) {
 		$l2m->pre_augment($lei);
+		if ($lei->{opt}->{augment} && delete $lei->{early_mua}) {
+			$lei->start_mua;
+		}
 		$l2m->wq_workers_start('lei2mail', $l2m->{jobs},
 					$lei->oldset, { lei => $lei });
 		pipe($lei->{startq}, $lei->{au_done}) or die "pipe: $!";
@@ -404,7 +431,7 @@ sub do_query {
 	delete $lei->{pkt_op_p};
 	$l2m->wq_close(1) if $l2m;
 	$lei->event_step_init; # wait for shutdowns
-	$self->wq_io_do('query_prepare', []) if $l2m;
+	$self->wq_io_do('query_prepare', []) if $l2m; # for augment/dedupe
 	start_query($self, $lei);
 	$self->wq_close(1); # lei_xsearch workers stop when done
 	if ($lei->{oneshot}) {

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 16/19] lei q: SIGWINCH process group with the terminal
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (14 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 15/19] lei q: improve remote mboxrd UX Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:51 ` [PATCH 17/19] lei import: support Maildirs Eric Wong
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

While using utime on the destination Maildir is enough for mutt
to eventually notice new mail, "eventually" isn't good enough.

Send a SIGWINCH to wake mutt (and likely other MUAs)
immediately.  This is more portable than relying on MUAs to
support inotify or EVFILT_VNODE.
---
 lib/PublicInbox/LEI.pm        | 11 +++++++++++
 lib/PublicInbox/LeiXSearch.pm |  7 ++++++-
 script/lei                    |  8 +++++---
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e52154e5..00affe82 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -746,6 +746,17 @@ sub start_mua {
 	}
 }
 
+sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
+	my ($self) = @_;
+	return unless $self->{opt}->{mua} && -t $self->{1};
+	# hit the process group that started the MUA
+	if (my $s = $self->{sock}) {
+		send($s, '-WINCH', MSG_EOR);
+	} elsif ($self->{oneshot}) {
+		kill('-WINCH', $$);
+	}
+}
+
 # caller needs to "-t $self->{1}" to check if tty
 sub start_pager {
 	my ($self) = @_;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 0e99e4b4..a7668a17 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -321,7 +321,12 @@ Error closing $lei->{ovv}->{dst}: $!
 			}
 			$lei->{1} = $out;
 		}
-		$l2m->lock_free ? $l2m->poke_dst : $lei->start_mua;
+		if ($l2m->lock_free) {
+			$l2m->poke_dst;
+			$lei->poke_mua;
+		} else { # mbox users
+			$lei->start_mua;
+		}
 	}
 	$lei->{-progress} and
 		$lei->err('# ', $lei->{-mset_total} // 0, " matches");
diff --git a/script/lei b/script/lei
index b7f21f14..0b0e2976 100755
--- a/script/lei
+++ b/script/lei
@@ -105,13 +105,15 @@ Falling back to (slow) one-shot mode
 			die "recvmsg: $!";
 		}
 		last if $buf eq '';
-		if ($buf =~ /\Ax_it ([0-9]+)\z/) {
+		if ($buf =~ /\Aexec (.+)\z/) {
+			$exec_cmd->(\@fds, split(/\0/, $1));
+		} elsif ($buf eq '-WINCH') {
+			kill($buf, $$); # for MUA
+		} elsif ($buf =~ /\Ax_it ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
 			last;
 		} elsif ($buf =~ /\Achild_error ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
-		} elsif ($buf =~ /\Aexec (.+)\z/) {
-			$exec_cmd->(\@fds, split(/\0/, $1));
 		} else {
 			$sigchld->();
 			die $buf;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 17/19] lei import: support Maildirs
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (15 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 16/19] lei q: SIGWINCH process group with the terminal Eric Wong
@ 2021-02-07  8:51 ` Eric Wong
  2021-02-07  8:52 ` [PATCH 18/19] imap: avoid unnecessary on-stack delete Eric Wong
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

It seems to be working trivially, though I'm probably
going to split out Maildir reading into a separate
package rather than using LeiToMail.
---
 MANIFEST                     |  1 +
 lib/PublicInbox/LeiImport.pm | 20 +++++++++++++++++---
 lib/PublicInbox/LeiStore.pm  |  8 +++++++-
 lib/PublicInbox/LeiToMail.pm | 11 ++++++-----
 t/lei-import-maildir.t       | 33 +++++++++++++++++++++++++++++++++
 t/lei_to_mail.t              |  6 +++---
 6 files changed, 67 insertions(+), 12 deletions(-)
 create mode 100644 t/lei-import-maildir.t

diff --git a/MANIFEST b/MANIFEST
index 521f1f68..7f417743 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -359,6 +359,7 @@ t/iso-2202-jp.eml
 t/kqnotify.t
 t/lei-daemon.t
 t/lei-externals.t
+t/lei-import-maildir.t
 t/lei-import.t
 t/lei-mirror.t
 t/lei.t
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 2b2dc2f7..a63bfdfd 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -8,6 +8,8 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::MboxReader;
 use PublicInbox::Eml;
+use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::PktOp;
 
 sub _import_eml { # MboxReader callback
 	my ($eml, $sto, $set_kw) = @_;
@@ -35,7 +37,9 @@ sub call { # the main "lei import" method
 	$lei->{opt}->{kw} //= 1;
 	my $fmt = $lei->{opt}->{'format'};
 	my $self = $lei->{imp} = bless {}, $cls;
-	return $lei->fail('--format unspecified') if !$fmt;
+	if (my @f = grep { -f } @argv && !$fmt) {
+		return $lei->fail("--format unset for regular files:\n@f");
+	}
 	$self->{0} = $lei->{0} if $lei->{opt}->{stdin};
 	my $ops = {
 		'!' => [ $lei->can('fail_handler'), $lei ],
@@ -75,14 +79,14 @@ sub _import_fh {
 		if ($fmt eq 'eml') {
 			my $buf = do { local $/; <$fh> } //
 				return $lei->child_error(1 >> 8, <<"");
-		error reading $x: $!
+error reading $x: $!
 
 			my $eml = PublicInbox::Eml->new(\$buf);
 			_import_eml($eml, $lei->{sto}, $set_kw);
 		} else { # some mbox
 			my $cb = PublicInbox::MboxReader->can($fmt);
 			$cb or return $lei->child_error(1 >> 8, <<"");
-	--format $fmt unsupported for $x
+--format $fmt unsupported for $x
 
 			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
 		}
@@ -90,6 +94,11 @@ sub _import_fh {
 	$lei->child_error(1 >> 8, "<stdin>: $@") if $@;
 }
 
+sub _import_maildir { # maildir_each_file cb
+	my ($f, $sto, $set_kw) = @_;
+	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
+}
+
 sub import_path_url {
 	my ($self, $x) = @_;
 	my $lei = $self->{lei};
@@ -99,6 +108,11 @@ sub import_path_url {
 unable to open $x: $!
 
 		_import_fh($lei, $fh, $x);
+	} elsif (-d _ && (-d "$x/cur" || -d "$x/new")) {
+		require PublicInbox::LeiToMail;
+		PublicInbox::LeiToMail::maildir_each_file($x,
+					\&_import_maildir,
+					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
 		$lei->fail("$x unsupported (TODO)");
 	}
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 3a215973..546d500b 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -12,7 +12,7 @@ use v5.10.1;
 use parent qw(PublicInbox::Lock PublicInbox::IPC);
 use PublicInbox::ExtSearchIdx;
 use PublicInbox::Import;
-use PublicInbox::InboxWritable;
+use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::V2Writable;
 use PublicInbox::ContentHash qw(content_hash content_digest);
 use PublicInbox::MID qw(mids mids_in);
@@ -224,6 +224,12 @@ sub set_eml {
 	add_eml($self, $eml, @kw) // set_eml_keywords($self, $eml, @kw);
 }
 
+sub set_eml_from_maildir {
+	my ($self, $f, $set_kw) = @_;
+	my $eml = eml_from_path($f) or return;
+	set_eml($self, $eml, $set_kw ? maildir_keywords($f) : ());
+}
+
 sub done {
 	my ($self) = @_;
 	my $err = '';
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 857aeb63..a5a196db 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -266,8 +266,9 @@ sub _mbox_write_cb ($$) {
 	}
 }
 
-sub _maildir_each_file ($$;@) {
+sub maildir_each_file ($$;@) {
 	my ($dir, $cb, @arg) = @_;
+	$dir .= '/' unless substr($dir, -1) eq '/';
 	for my $d (qw(new/ cur/)) {
 		my $pfx = $dir.$d;
 		opendir my $dh, $pfx or next;
@@ -277,13 +278,13 @@ sub _maildir_each_file ($$;@) {
 	}
 }
 
-sub _augment_file { # _maildir_each_file cb
+sub _augment_file { # maildir_each_file cb
 	my ($f, $lei) = @_;
 	my $eml = PublicInbox::InboxWritable::eml_from_path($f) or return;
 	_augment($eml, $lei);
 }
 
-# _maildir_each_file callback, \&CORE::unlink doesn't work with it
+# maildir_each_file callback, \&CORE::unlink doesn't work with it
 sub _unlink { unlink($_[0]) }
 
 sub _rand () {
@@ -389,11 +390,11 @@ sub _do_augment_maildir {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
 			require PublicInbox::InboxWritable; # eml_from_path
-			_maildir_each_file($dst, \&_augment_file, $lei);
+			maildir_each_file($dst, \&_augment_file, $lei);
 			$dedupe->pause_dedupe;
 		}
 	} else { # clobber existing Maildir
-		_maildir_each_file($dst, \&_unlink);
+		maildir_each_file($dst, \&_unlink);
 	}
 }
 
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
new file mode 100644
index 00000000..5842e19e
--- /dev/null
+++ b/t/lei-import-maildir.t
@@ -0,0 +1,33 @@
+#!perl -w
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use Cwd qw(abs_path);
+test_lei(sub {
+	my $md = "$ENV{HOME}/md";
+	for ($md, "$md/new", "$md/cur", "$md/tmp") {
+		mkdir($_) or BAIL_OUT("mkdir $_: $!");
+	}
+	symlink(abs_path('t/data/0001.patch'), "$md/cur/x:2,S") or
+		BAIL_OUT "symlink $md $!";
+	ok($lei->(qw(import), $md), 'import Maildir');
+	ok($lei->(qw(q s:boolean)), 'lei q');
+	my $res = json_utf8->decode($lei_out);
+	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
+	is_deeply($res->[0]->{kw}, ['seen'], 'keyword set');
+	is($res->[1], undef, 'only got one result');
+
+	ok($lei->(qw(import), $md), 'import Maildir again');
+	ok($lei->(qw(q -d none s:boolean)), 'lei q w/o dedupe');
+	my $r2 = json_utf8->decode($lei_out);
+	is_deeply($r2, $res, 'idempotent import');
+
+	rename("$md/cur/x:2,S", "$md/cur/x:2,SR") or BAIL_OUT "rename: $!";
+	ok($lei->(qw(import), $md), 'import Maildir after +answered');
+	ok($lei->(qw(q -d none s:boolean)), 'lei q after +answered');
+	$res = json_utf8->decode($lei_out);
+	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
+	is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
+	is($res->[1], undef, 'only got one result');
+});
+done_testing;
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index f7535687..a25795ca 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -237,7 +237,7 @@ SKIP: { # FIFO support
 	$wcb->(\(my $x = $buf), $b4dc0ffee);
 
 	my @f;
-	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @f, shift });
+	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @f, shift });
 	open my $fh, $f[0] or BAIL_OUT $!;
 	is(do { local $/; <$fh> }, $buf, 'wrote to Maildir');
 
@@ -246,7 +246,7 @@ SKIP: { # FIFO support
 	$wcb->(\($x = $buf."\nx\n"), $deadcafe);
 
 	my @x = ();
-	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @x, shift });
+	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @x, shift });
 	is(scalar(@x), 1, 'wrote one new file');
 	ok(!-f $f[0], 'old file clobbered');
 	open $fh, $x[0] or BAIL_OUT $!;
@@ -257,7 +257,7 @@ SKIP: { # FIFO support
 	$wcb->(\($x = $buf."\ny\n"), $deadcafe);
 	$wcb->(\($x = $buf."\ny\n"), $b4dc0ffee); # skipped by dedupe
 	@f = ();
-	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @f, shift });
+	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @f, shift });
 	is(scalar grep(/\A\Q$x[0]\E\z/, @f), 1, 'old file still there');
 	my @new = grep(!/\A\Q$x[0]\E\z/, @f);
 	is(scalar @new, 1, '1 new file written (b4dc0ffee skipped)');

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 18/19] imap: avoid unnecessary on-stack delete
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (16 preceding siblings ...)
  2021-02-07  8:51 ` [PATCH 17/19] lei import: support Maildirs Eric Wong
@ 2021-02-07  8:52 ` Eric Wong
  2021-02-07  8:52 ` [PATCH 19/19] httpd/async: " Eric Wong
  2021-02-07 10:40 ` [PATCH 21/19] lei q: fix arbitrary --mua command handling Eric Wong
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:52 UTC (permalink / raw)
  To: meta

None of the Content-Type attributes are long-lived
(and unlikely to be memory intensive).  While these
callsites won't trigger $DB::args segfaults via
confess or longmess, it'll make future code audits
easier.

cf. commit 0795b0906cc81f40
    ("ds: guard against stack-not-refcounted quirk of Perl 5")
---
 lib/PublicInbox/HTTPD/Async.pm | 2 +-
---
 lib/PublicInbox/IMAP.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index 226e98a2..af8ce72b 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -499,7 +499,7 @@ sub body_disposition ($) {
 	my $cd = $eml->header_raw('Content-Disposition') or return 'NIL';
 	$cd = parse_content_disposition($cd);
 	my $buf = '('._esc($cd->{type});
-	$buf .= ' ' . _esc_hash(delete $cd->{attributes});
+	$buf .= ' ' . _esc_hash($cd->{attributes});
 	$buf .= ')';
 }
 
@@ -511,7 +511,7 @@ sub body_leaf ($$;$) {
 	my $ct = $eml->ct;
 	$buf .= '('._esc($ct->{type}).' ';
 	$buf .= _esc($ct->{subtype});
-	$buf .= ' ' . _esc_hash(delete $ct->{attributes});
+	$buf .= ' ' . _esc_hash($ct->{attributes});
 	$buf .= ' ' . _esc($eml->header_raw('Content-ID'));
 	$buf .= ' ' . _esc($eml->header_raw('Content-Description'));
 	my $cte = $eml->header_raw('Content-Transfer-Encoding') // '7bit';
@@ -540,7 +540,7 @@ sub body_parent ($$$) {
 		$buf .= @$hold ? join('', @$hold) : 'NIL';
 		$buf .= ' '._esc($ct->{subtype});
 		if ($structure) {
-			$buf .= ' '._esc_hash(delete $ct->{attributes});
+			$buf .= ' '._esc_hash($ct->{attributes});
 			$buf .= ' '.body_disposition($eml);
 			$buf .= ' '._esc($eml->header_raw('Content-Language'));
 			$buf .= ' '._esc($eml->header_raw('Content-Location'));

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 19/19] httpd/async: avoid unnecessary on-stack delete
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (17 preceding siblings ...)
  2021-02-07  8:52 ` [PATCH 18/19] imap: avoid unnecessary on-stack delete Eric Wong
@ 2021-02-07  8:52 ` Eric Wong
  2021-02-07 10:40 ` [PATCH 21/19] lei q: fix arbitrary --mua command handling Eric Wong
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  8:52 UTC (permalink / raw)
  To: meta

While this doesn't fix a known problem, this was a risky
construct in case somebody uses confess/longmess inside
the user-supplied callback.

cf. commit 0795b0906cc81f40
    ("ds: guard against stack-not-refcounted quirk of Perl 5")
---
 lib/PublicInbox/HTTPD/Async.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/HTTPD/Async.pm b/lib/PublicInbox/HTTPD/Async.pm
index 1de9501d..7238650a 100644
--- a/lib/PublicInbox/HTTPD/Async.pm
+++ b/lib/PublicInbox/HTTPD/Async.pm
@@ -46,7 +46,7 @@ sub event_step {
 	my ($self) = @_;
 	if (my $cb = delete $self->{cb}) {
 		# this may call async_pass when headers are done
-		$cb->(delete $self->{arg});
+		$cb->(my $refcnt_guard = delete $self->{arg});
 	} elsif (my $sock = $self->{sock}) {
 		my $http = $self->{http};
 		# $self->{sock} is a read pipe for git-http-backend or cgit

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 20/19] lei_xsearch: allow quieting regular mset progress, too
  2021-02-07  8:51 ` [PATCH 15/19] lei q: improve remote mboxrd UX Eric Wong
@ 2021-02-07  9:32   ` Eric Wong
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07  9:32 UTC (permalink / raw)
  To: meta

And actually wait for startq when using --threads/-t.
Some of this is tricky to test in an automated way,
unfortunately :/
---
 lib/PublicInbox/LeiXSearch.pm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index a7668a17..8a863cb6 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -128,7 +128,7 @@ sub mset_progress {
 	} else { # single lei-daemon consumer
 		my ($desc, $mset_size, $mset_total_est) = @_;
 		$lei->{-mset_total} += $mset_size;
-		$lei->err("# $desc $mset_size/$mset_total_est");
+		$lei->qerr("# $desc $mset_size/$mset_total_est");
 	}
 }
 
@@ -136,7 +136,6 @@ sub query_thread_mset { # for --threads
 	my ($self, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
 	my $lei = $self->{lei};
-	my $startq = delete $lei->{startq};
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
 	my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
 	return warn("$desc not indexed by Xapian\n") unless ($srch && $over);

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 21/19] lei q: fix arbitrary --mua command handling
  2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (18 preceding siblings ...)
  2021-02-07  8:52 ` [PATCH 19/19] httpd/async: " Eric Wong
@ 2021-02-07 10:40 ` Eric Wong
  19 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07 10:40 UTC (permalink / raw)
  To: meta

Perl doesn't seem to warn for shadowed variables, here :x
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 00affe82..e95a674b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -731,7 +731,7 @@ sub start_mua {
 	# TODO: help wanted: other common FOSS MUAs
 	} else {
 		require Text::ParseWords;
-		my @cmd = Text::ParseWords::shellwords($mua);
+		@cmd = Text::ParseWords::shellwords($mua);
 		# mutt uses '%f' for open-hook with compressed mbox, we follow
 		@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
 	}

^ permalink raw reply	[flat|nested] 23+ messages in thread

* dprintf(3) portability? [was [02/19] spawn: pi_fork_exec: support "pgid"]
  2021-02-07  8:51 ` [PATCH 02/19] spawn: pi_fork_exec: support "pgid" Eric Wong
@ 2021-02-07 23:10   ` Eric Wong
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Wong @ 2021-02-07 23:10 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +++ b/lib/PublicInbox/Spawn.pm

> +		/* continue message started by exit_err in child */
> +		dprintf(err_fd, ": %s\n", strerror(cerrnum));

dprintf(3) is POSIX.1-2008 and we already depend on Perl 5.10.1
from 2009, so no concerns there, right?

I remember it being a portability problem around 2004, but it's
probably not an issue today.  This usage is trivial enough to
replace with writev(2), though.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-02-07 23:10 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-07  8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
2021-02-07  8:51 ` [PATCH 01/19] spawn: pi_fork_exec: restore parent sigmask in child Eric Wong
2021-02-07  8:51 ` [PATCH 02/19] spawn: pi_fork_exec: support "pgid" Eric Wong
2021-02-07 23:10   ` dprintf(3) portability? [was [02/19] spawn: pi_fork_exec: support "pgid"] Eric Wong
2021-02-07  8:51 ` [PATCH 03/19] lei add-external: handle interrupts with --mirror Eric Wong
2021-02-07  8:51 ` [PATCH 04/19] spawn_pp: die more consistently in child Eric Wong
2021-02-07  8:51 ` [PATCH 05/19] ipc: do not die inside wq_worker child process Eric Wong
2021-02-07  8:51 ` [PATCH 06/19] ipc: trim down the Storable checks Eric Wong
2021-02-07  8:51 ` [PATCH 07/19] Makefile.PL: depend on IO::Uncompress::Gunzip Eric Wong
2021-02-07  8:51 ` [PATCH 08/19] xapcmd: avoid potential die surprise in children Eric Wong
2021-02-07  8:51 ` [PATCH 09/19] tests: guard setup_public_inboxes for SQLite and Xapian Eric Wong
2021-02-07  8:51 ` [PATCH 10/19] Revert "ipc: add support for asynchronous callbacks" Eric Wong
2021-02-07  8:51 ` [PATCH 11/19] ipc: wq_do => wq_io_do Eric Wong
2021-02-07  8:51 ` [PATCH 12/19] lei: more consistent IPC exit and error handling Eric Wong
2021-02-07  8:51 ` [PATCH 13/19] lei: remove --mua-cmd alias for --mua Eric Wong
2021-02-07  8:51 ` [PATCH 14/19] lei: replace --thread with --threads Eric Wong
2021-02-07  8:51 ` [PATCH 15/19] lei q: improve remote mboxrd UX Eric Wong
2021-02-07  9:32   ` [PATCH 20/19] lei_xsearch: allow quieting regular mset progress, too Eric Wong
2021-02-07  8:51 ` [PATCH 16/19] lei q: SIGWINCH process group with the terminal Eric Wong
2021-02-07  8:51 ` [PATCH 17/19] lei import: support Maildirs Eric Wong
2021-02-07  8:52 ` [PATCH 18/19] imap: avoid unnecessary on-stack delete Eric Wong
2021-02-07  8:52 ` [PATCH 19/19] httpd/async: " Eric Wong
2021-02-07 10:40 ` [PATCH 21/19] lei q: fix arbitrary --mua command handling Eric Wong

user/dev discussion of public-inbox itself

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 meta meta/ https://public-inbox.org/meta \
		meta@public-inbox.org
	public-inbox-index meta

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/inbox.comp.mail.public-inbox.meta
	nntp://ie5yzdi7fg72h7s4sdcztq5evakq23rdt33mfyfcddc5u3ndnw24ogqd.onion/inbox.comp.mail.public-inbox.meta
	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git