user/dev discussion of public-inbox itself
* [PATCH 0/5] lei: more input + worker-related stuff
@ 2021-03-23 11:48 Eric Wong
  2021-03-23 11:48 ` [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef' Eric Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC
  To: meta

Drop a bunch of redundant code, yay!

Eric Wong (5):
  net_reader: nntp_each: pass keywords as `undef'
  test_common: check lei/errors.log
  lei: persistent workers (lei_store) run in /
  lei_input: more common code between <mark|convert|import>
  lei: improve management around short-lived workers

 lib/PublicInbox/LEI.pm         |  2 +-
 lib/PublicInbox/LeiConvert.pm  | 50 +++++-------------
 lib/PublicInbox/LeiExternal.pm |  3 +-
 lib/PublicInbox/LeiImport.pm   | 94 +++++++++-------------------------
 lib/PublicInbox/LeiInput.pm    | 45 ++++++++++++++--
 lib/PublicInbox/LeiMark.pm     | 59 ++++-----------------
 lib/PublicInbox/LeiMirror.pm   |  2 +-
 lib/PublicInbox/LeiP2q.pm      |  5 +-
 lib/PublicInbox/LeiQuery.pm    |  2 +-
 lib/PublicInbox/NetReader.pm   |  5 +-
 lib/PublicInbox/TestCommon.pm  | 13 +++--
 11 files changed, 109 insertions(+), 171 deletions(-)


* [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef'
  2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
  2021-03-23 11:48 ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC
  To: meta

We'll use `undef' to denote keywords are unknown/unsupported,
instead of an empty arrayref.

This will let callers use the same callback and args for
imap_each.  Passing an empty arrayref to set_eml in LeiStore
causes keywords to be cleared completely, which is not desired
behavior when "lei import" is importing already-seen messages
from NNTP.
---
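
A minimal sketch of the undef-versus-empty-arrayref distinction described
above. The %kw_for hash and set_kw() are hypothetical stand-ins invented
for illustration and are not the LeiStore API; the only assumption is that
a defined arrayref replaces the stored keywords while undef leaves them
alone.

```perl
use strict;
use warnings;

# Hypothetical keyword store, keyed by Message-ID (not LeiStore itself)
my %kw_for = ('msg@example' => [ 'seen', 'flagged' ]);

sub set_kw {
	my ($mid, $kw) = @_;
	return unless defined $kw; # undef: keywords unknown, keep what we have
	$kw_for{$mid} = [ @$kw ]; # arrayref: replace; [] clears everything
}

set_kw('msg@example', undef); # NNTP-like source, no keyword support
my $after_undef = join(',', @{$kw_for{'msg@example'}});

set_kw('msg@example', []); # empty arrayref wipes the keywords
my $after_empty = scalar @{$kw_for{'msg@example'}};
```

With a convention like this, a single callback can serve both IMAP (which
supplies keywords) and NNTP (which does not) without clobbering keywords
on already-seen messages.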
 lib/PublicInbox/NetReader.pm | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index bc211029..6a52b479 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -554,11 +554,10 @@ sub _nntp_fetch_all ($$$) {
 		return if $l_art >= $end; # nothing to do
 		$beg = $l_art + 1;
 	}
-	my ($err, $art);
+	my ($err, $art, $last_art, $kw); # kw stays undef, no keywords in NNTP
 	unless ($self->{quiet}) {
 		warn "# $uri fetching ARTICLE $beg..$end\n";
 	}
-	my $last_art;
 	my $n = $self->{max_batch};
 	for ($beg..$end) {
 		last if $self->{quit};
@@ -582,7 +581,7 @@ sub _nntp_fetch_all ($$$) {
 		$raw = join('', @$raw);
 		$raw =~ s/\r\n/\n/sg;
 		my ($eml_cb, @args) = @{$self->{eml_each}};
-		$eml_cb->($uri, $art, [], PublicInbox::Eml->new(\$raw), @args);
+		$eml_cb->($uri, $art, $kw, PublicInbox::Eml->new(\$raw), @args);
 		$last_art = $art;
 	}
 	run_commit_cb($self);


* [PATCH 2/5] test_common: check lei/errors.log
  2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
  2021-03-23 11:48 ` [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef' Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
  2021-03-23 11:48 ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC
  To: meta

Checking that lei/errors.log is empty after each test will make
it easier to diagnose problems during some large internal rewrites.
---
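
A rough, standalone sketch of the check this patch adds: slurp the log
into a list and require that the list be empty. The temporary directory
stands in for XDG_RUNTIME_DIR and log_lines() is a hypothetical helper;
the real test uses is_deeply from the test suite instead.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $xrd = tempdir(CLEANUP => 1); # stand-in for XDG_RUNTIME_DIR
mkdir("$xrd/lei", 0700) or die "mkdir: $!";
my $f = "$xrd/lei/errors.log";

# assumption: the daemon creates the log even when nothing goes wrong
open my $wfh, '>', $f or die "$f: $!";
close $wfh or die "close: $!";

sub log_lines { # hypothetical helper: returns an arrayref of log lines
	my ($path) = @_;
	open my $fh, '<', $path or die "$path: $!";
	my @l = <$fh>;
	\@l;
}

my $lines = log_lines($f); # a clean run leaves this empty
```

Comparing the whole list against an empty list (rather than testing the
file size) means any unexpected lines show up in the failure output.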
 lib/PublicInbox/TestCommon.pm | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index e67e94ea..d4117b6c 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -507,7 +507,7 @@ SKIP: {
 Socket::MsgHdr missing or Inline::C is unconfigured/missing
 EOM
 	$lei_opt = { 1 => \$lei_out, 2 => \$lei_err };
-	my ($daemon_pid, $for_destroy);
+	my ($daemon_pid, $for_destroy, $daemon_xrd);
 	my $tmpdir = $test_opt->{tmpdir};
 	($tmpdir, $for_destroy) = tmpdir unless $tmpdir;
 	SKIP: {
@@ -515,9 +515,9 @@ EOM
 		my $home = "$tmpdir/lei-daemon";
 		mkdir($home, 0700) or BAIL_OUT "mkdir: $!";
 		local $ENV{HOME} = $home;
-		my $xrd = "$home/xdg_run";
-		mkdir($xrd, 0700) or BAIL_OUT "mkdir: $!";
-		local $ENV{XDG_RUNTIME_DIR} = $xrd;
+		$daemon_xrd = "$home/xdg_run";
+		mkdir($daemon_xrd, 0700) or BAIL_OUT "mkdir: $!";
+		local $ENV{XDG_RUNTIME_DIR} = $daemon_xrd;
 		$cb->();
 		lei_ok(qw(daemon-pid), \"daemon-pid after $t");
 		chomp($daemon_pid = $lei_out);
@@ -547,6 +547,11 @@ EOM
 			tick;
 		}
 		ok(!kill(0, $daemon_pid), "$t daemon stopped after oneshot");
+		my $f = "$daemon_xrd/lei/errors.log";
+		open my $fh, '<', $f or BAIL_OUT "$f: $!";
+		my @l = <$fh>;
+		is_deeply(\@l, [],
+			"$t daemon XDG_RUNTIME_DIR/lei/errors.log empty");
 	}
 }; # SKIP if missing git 2.6+ || Xapian || SQLite || json
 } # /test_lei


* [PATCH 3/5] lei: persistent workers (lei_store) run in /
  2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
  2021-03-23 11:48 ` [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef' Eric Wong
  2021-03-23 11:48 ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
  2021-03-23 11:48 ` [PATCH 4/5] lei_input: more common code between <mark|convert|import> Eric Wong
  2021-03-23 11:48 ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC
  To: meta

Since each lei->event_step can change the directory of
lei-daemon, we need to ensure the lei_store runs in a
directory that is stable.
---
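
The idea can be sketched outside of lei as follows: a forked child moves
to "/" right away, so later directory changes in the parent (or removal of
the parent's directory) cannot affect it. This is a generic fork/chdir
illustration on a POSIX-like system, not the lei-daemon code path.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

pipe(my $r, my $w) or die "pipe: $!";
my $pid = fork;
defined($pid) or die "fork: $!";
if ($pid == 0) { # child: stand-in for a persistent worker
	close $r;
	chdir '/' or die "chdir(/): $!";
	print $w getcwd(), "\n"; # report the directory we ended up in
	close $w;
	exit 0;
}
close $w;
chomp(my $child_cwd = <$r>); # what the child reported
waitpid($pid, 0);
```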
 lib/PublicInbox/LEI.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17ca637e..d3ac19b2 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -453,6 +453,7 @@ sub _lei_atfork_child {
 	my ($self, $persist) = @_;
 	# we need to explicitly close things which are on stack
 	if ($persist) {
+		chdir '/' or die "chdir(/): $!";
 		my @io = delete @$self{qw(0 1 2 sock)};
 		unless ($self->{oneshot}) {
 			close($_) for @io;


* [PATCH 4/5] lei_input: more common code between <mark|convert|import>
  2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
                   ` (2 preceding siblings ...)
  2021-03-23 11:48 ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
  2021-03-23 11:48 ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC
  To: meta

"lei convert" is a bit of the odd one out, since
it uses lei2mail for auth, unlike the others.
---
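
The sharing pattern in this patch can be sketched in isolation: a common
base class looks up a subclass-provided callback with ->can and invokes
it as a plain sub, passing $self as an ordinary argument. The package
names My::Input and My::Upper and the input_item_cb callback are
hypothetical, chosen only to mirror the /^input_/ naming convention.

```perl
use strict;
use warnings;

package My::Input; # hypothetical base class, loosely modeled on LeiInput
sub input_each {
	my ($self, @items) = @_;
	my $cb = $self->can('input_item_cb') // die 'BUG: no input_item_cb';
	$cb->($_, $self) for @items; # plain sub call, $self is a normal arg
}

package My::Upper; # hypothetical subclass supplying the callback
our @ISA = ('My::Input');
sub input_item_cb {
	my ($item, $self) = @_;
	push @{$self->{seen}}, uc $item;
}

package main;
my $obj = bless { seen => [] }, 'My::Upper';
$obj->input_each(qw(a b));
my $seen = join(',', @{$obj->{seen}});
```

Because ->can returns a code reference, the same reference can be handed
to readers that expect a bare callback plus trailing arguments.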
 lib/PublicInbox/LeiConvert.pm | 47 ++++++----------------
 lib/PublicInbox/LeiImport.pm  | 74 +++++++++--------------------------
 lib/PublicInbox/LeiInput.pm   | 45 +++++++++++++++++++--
 lib/PublicInbox/LeiMark.pm    | 57 +++++----------------------
 4 files changed, 80 insertions(+), 143 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 49e2b7af..bc86fe25 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -6,64 +6,39 @@ package PublicInbox::LeiConvert;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::Eml;
-use PublicInbox::LeiStore;
 use PublicInbox::LeiOverview;
 
-sub mbox_cb { # MboxReader callback used by PublicInbox::LeiInput::input_fh
+# /^input_/ subs are used by PublicInbox::LeiInput
+
+sub input_mbox_cb { # MboxReader callback
 	my ($eml, $self) = @_;
 	my $kw = PublicInbox::MboxReader::mbox_keywords($eml);
 	$eml->header_set($_) for qw(Status X-Status);
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
-sub eml_cb { # used by PublicInbox::LeiInput::input_fh
+sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 	my ($self, $eml) = @_;
-	$self->{wcb}->(undef, { kw => [] }, $eml);
+	$self->{wcb}->(undef, {}, $eml);
 }
 
-sub net_cb { # callback for ->imap_each, ->nntp_each
+sub input_net_cb { # callback for ->imap_each, ->nntp_each
 	my (undef, undef, $kw, $eml, $self) = @_; # @_[0,1]: url + uid ignored
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
-sub mdir_cb {
-	my ($f, $kw, $eml, $self) = @_;
+sub input_maildir_cb {
+	my (undef, $kw, $eml, $self) = @_; # $_[0] $filename ignored
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
 sub do_convert { # via wq_do
 	my ($self) = @_;
-	my $lei = $self->{lei};
-	my $ifmt = $lei->{opt}->{'in-format'};
-	if (my $stdin = delete $self->{0}) {
-		$self->input_fh($ifmt, $stdin, '<stdin>');
-	}
+	$self->input_stdin;
 	for my $input (@{$self->{inputs}}) {
-		my $ifmt = lc($ifmt // '');
-		if ($input =~ m!\Aimaps?://!) {
-			$lei->{net}->imap_each($input, \&net_cb, $self);
-			next;
-		} elsif ($input =~ m!\A(?:nntps?|s?news)://!) {
-			$lei->{net}->nntp_each($input, \&net_cb, $self);
-			next;
-		} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
-			$ifmt = lc $1;
-		}
-		if (-f $input) {
-			my $m = $lei->{opt}->{'lock'} //
-					($ifmt eq 'eml' ? ['none'] :
-					PublicInbox::MboxLock->defaults);
-			my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
-			$self->input_fh($ifmt, $mbl->{fh}, $input);
-		} elsif (-d _) {
-			PublicInbox::MdirReader::maildir_each_eml($input,
-							\&mdir_cb, $self);
-		} else {
-			die "BUG: $input unhandled"; # should've failed earlier
-		}
+		$self->input_path_url($input);
 	}
-	delete $lei->{1};
+	delete $self->{lei}->{1};
 	delete $self->{wcb}; # commit
 }
 
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 21af28a3..991c84f2 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -6,23 +6,33 @@ package PublicInbox::LeiImport;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::Eml;
-use PublicInbox::PktOp qw(pkt_do);
 
-sub eml_cb { # used by PublicInbox::LeiInput::input_fh
+# /^input_/ subs are used by (or override) PublicInbox::LeiInput superclass
+
+sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 	my ($self, $eml, $vmd) = @_;
 	my $xoids = $self->{lei}->{ale}->xoids_for($eml);
 	$self->{lei}->{sto}->ipc_do('set_eml', $eml, $vmd, $xoids);
 }
 
-sub mbox_cb { # MboxReader callback used by PublicInbox::LeiInput::input_fh
+sub input_mbox_cb { # MboxReader callback
 	my ($eml, $self) = @_;
 	my $vmd;
 	if ($self->{-import_kw}) {
 		my $kw = PublicInbox::MboxReader::mbox_keywords($eml);
 		$vmd = { kw => $kw } if scalar(@$kw);
 	}
-	eml_cb($self, $eml, $vmd);
+	input_eml_cb($self, $eml, $vmd);
+}
+
+sub input_maildir_cb { # maildir_each_eml cb
+	my ($f, $kw, $eml, $self) = @_;
+	input_eml_cb($self, $eml, $self->{-import_kw} ? { kw => $kw } : undef);
+}
+
+sub input_net_cb { # imap_each, nntp_each cb
+	my ($url, $uid, $kw, $eml, $self) = @_;
+	input_eml_cb($self, $eml, $self->{-import_kw} ? { kw => $kw } : undef);
 }
 
 sub import_done_wait { # dwaitpid callback
@@ -43,7 +53,7 @@ sub import_done { # EOF callback for main daemon
 sub net_merge_complete { # callback used by LeiAuth
 	my ($self) = @_;
 	for my $input (@{$self->{inputs}}) {
-		$self->wq_io_do('import_path_url', [], $input);
+		$self->wq_io_do('input_path_url', [], $input);
 	}
 	$self->wq_close(1);
 }
@@ -63,7 +73,8 @@ sub import_start {
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
 	$self->{-wq_nr_workers} = $j // 1; # locked
 	my $op = $lei->workers_start($self, 'lei_import', undef, $ops);
-	$self->wq_io_do('import_stdin', []) if $self->{0};
+	$lei->{imp} = $self;
+	$self->wq_io_do('input_stdin', []) if $self->{0};
 	net_merge_complete($self) unless $lei->{auth};
 	while ($op && $op->{sock}) { $op->event_step }
 }
@@ -78,55 +89,6 @@ sub lei_import { # the main "lei import" method
 	import_start($lei);
 }
 
-sub _import_maildir { # maildir_each_eml cb
-	my ($f, $kw, $eml, $sto, $set_kw) = @_;
-	$sto->ipc_do('set_eml', $eml, $set_kw ? { kw => $kw }: ());
-}
-
-sub _import_net { # imap_each, nntp_each cb
-	my ($url, $uid, $kw, $eml, $sto, $set_kw) = @_;
-	$sto->ipc_do('set_eml', $eml, $set_kw ? { kw => $kw } : ());
-}
-
-sub import_path_url {
-	my ($self, $input) = @_;
-	my $lei = $self->{lei};
-	my $ifmt = lc($lei->{opt}->{'in-format'} // '');
-	# TODO auto-detect?
-	if ($input =~ m!\Aimaps?://!i) {
-		$lei->{net}->imap_each($input, \&_import_net, $lei->{sto},
-					$self->{-import_kw});
-		return;
-	} elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
-		$lei->{net}->nntp_each($input, \&_import_net, $lei->{sto}, 0);
-		return;
-	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
-		$ifmt = lc $1;
-	}
-	if (-f $input) {
-		my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
-				PublicInbox::MboxLock->defaults);
-		my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
-		$self->input_fh($ifmt, $mbl->{fh}, $input);
-	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
-		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
-$input appears to a be a maildir, not $ifmt
-EOM
-		PublicInbox::MdirReader::maildir_each_eml($input,
-					\&_import_maildir,
-					$lei->{sto}, $self->{-import_kw});
-	} else {
-		$lei->fail("$input unsupported (TODO)");
-	}
-}
-
-sub import_stdin {
-	my ($self) = @_;
-	my $lei = $self->{lei};
-	my $in = delete $self->{0};
-	$self->input_fh($lei->{opt}->{'in-format'}, $in, '<stdin>');
-}
-
 no warnings 'once';
 *ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
 
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 2a4968d4..b059ecda 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -24,10 +24,11 @@ sub check_input_format ($;$) {
 }
 
 # import a single file handle of $name
-# Subclass must define ->eml_cb and ->mbox_cb
+# Subclass must define ->input_eml_cb and ->input_mbox_cb
 sub input_fh {
 	my ($self, $ifmt, $fh, $name, @args) = @_;
 	if ($ifmt eq 'eml') {
+		require PublicInbox::Eml;
 		my $buf = do { local $/; <$fh> } //
 			return $self->{lei}->child_error(1 << 8, <<"");
 error reading $name: $!
@@ -36,12 +37,50 @@ error reading $name: $!
 		# but no Content-Length or "From " escaping.
 		# "git format-patch" also generates such files by default.
 		$buf =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s;
-		$self->eml_cb(PublicInbox::Eml->new(\$buf), @args);
+		$self->input_eml_cb(PublicInbox::Eml->new(\$buf), @args);
 	} else {
 		# prepare_inputs already validated $ifmt
 		my $cb = PublicInbox::MboxReader->reads($ifmt) //
 				die "BUG: bad fmt=$ifmt";
-		$cb->(undef, $fh, $self->can('mbox_cb'), $self, @args);
+		$cb->(undef, $fh, $self->can('input_mbox_cb'), $self, @args);
+	}
+}
+
+sub input_stdin {
+	my ($self) = @_;
+	my $in = delete $self->{0} or return;
+	$self->input_fh($self->{lei}->{opt}->{'in-format'}, $in, '<stdin>');
+}
+
+sub input_path_url {
+	my ($self, $input, @args) = @_;
+	my $lei = $self->{lei};
+	my $ifmt = lc($lei->{opt}->{'in-format'} // '');
+	# TODO auto-detect?
+	if ($input =~ m!\Aimaps?://!i) {
+		$lei->{net}->imap_each($input, $self->can('input_net_cb'),
+					$self, @args);
+		return;
+	} elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
+		$lei->{net}->nntp_each($input, $self->can('input_net_cb'),
+					$self, @args);
+		return;
+	}
+	$input =~ s!\A([a-z0-9]+):!!i and $ifmt = lc($1);
+	if (-f $input) {
+		my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
+				PublicInbox::MboxLock->defaults);
+		my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
+		$self->input_fh($ifmt, $mbl->{fh}, $input, @args);
+	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
+		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
+$input appears to a be a maildir, not $ifmt
+EOM
+		PublicInbox::MdirReader::maildir_each_eml($input,
+					$self->can('input_maildir_cb'),
+					$self, @args);
+	} else {
+		$lei->fail("$input unsupported (TODO)");
 	}
 }
 
diff --git a/lib/PublicInbox/LeiMark.pm b/lib/PublicInbox/LeiMark.pm
index 7b50aa51..3b5e6c2c 100644
--- a/lib/PublicInbox/LeiMark.pm
+++ b/lib/PublicInbox/LeiMark.pm
@@ -6,8 +6,6 @@ package PublicInbox::LeiMark;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::Eml;
-use PublicInbox::PktOp qw(pkt_do);
 
 # JMAP RFC 8621 4.1.1
 my @KW = (qw(seen answered flagged draft), # system
@@ -34,7 +32,6 @@ my %ERR = (
 `$kw' is not one of: `seen', `flagged', `answered', `draft'
 `junk', `notjunk', `phishing' or `forwarded'
 EOM
-
 	}
 );
 
@@ -60,7 +57,7 @@ sub vmd_mod_extract {
 	$vmd_mod;
 }
 
-sub eml_cb { # used by PublicInbox::LeiInput::input_fh
+sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 	my ($self, $eml) = @_;
 	if (my $xoids = $self->{lei}->{ale}->xoids_for($eml)) {
 		$self->{lei}->{sto}->ipc_do('update_xvmd', $xoids,
@@ -70,7 +67,7 @@ sub eml_cb { # used by PublicInbox::LeiInput::input_fh
 	}
 }
 
-sub mbox_cb { eml_cb($_[1], $_[0]) } # used by PublicInbox::LeiInput::input_fh
+sub input_mbox_cb { input_eml_cb($_[1], $_[0]) }
 
 sub mark_done_wait { # dwaitpid callback
 	my ($arg, $pid) = @_;
@@ -90,19 +87,19 @@ sub mark_done { # EOF callback for main daemon
 sub net_merge_complete { # callback used by LeiAuth
 	my ($self) = @_;
 	for my $input (@{$self->{inputs}}) {
-		$self->wq_io_do('mark_path_url', [], $input);
+		$self->wq_io_do('input_path_url', [], $input);
 	}
 	$self->wq_close(1);
 }
 
-sub _mark_maildir { # maildir_each_eml cb
+sub input_maildir_cb { # maildir_each_eml cb
 	my ($f, $kw, $eml, $self) = @_;
-	eml_cb($self, $eml);
+	input_eml_cb($self, $eml);
 }
 
-sub _mark_net { # imap_each, nntp_each cb
+sub input_net_cb { # imap_each, nntp_each cb
 	my ($url, $uid, $kw, $eml, $self) = @_;
-	eml_cb($self, $eml)
+	input_eml_cb($self, $eml);
 }
 
 sub lei_mark { # the "lei mark" method
@@ -120,48 +117,12 @@ sub lei_mark { # the "lei mark" method
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
 	$self->{vmd_mod} = $vmd_mod;
 	my $op = $lei->workers_start($self, 'lei_mark', 1, $ops);
-	$self->wq_io_do('mark_stdin', []) if $self->{0};
+	$lei->{mark} = $self;
+	$self->wq_io_do('input_stdin', []) if $self->{0};
 	net_merge_complete($self) unless $lei->{auth};
 	while ($op && $op->{sock}) { $op->event_step }
 }
 
-sub mark_path_url {
-	my ($self, $input) = @_;
-	my $lei = $self->{lei};
-	my $ifmt = lc($lei->{opt}->{'in-format'} // '');
-	# TODO auto-detect?
-	if ($input =~ m!\Aimaps?://!i) {
-		$lei->{net}->imap_each($input, \&_mark_net, $self);
-		return;
-	} elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
-		$lei->{net}->nntp_each($input, \&_mark_net, $self);
-		return;
-	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
-		$ifmt = lc $1;
-	}
-	if (-f $input) {
-		my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
-				PublicInbox::MboxLock->defaults);
-		my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
-		$self->input_fh($ifmt, $mbl->{fh}, $input);
-	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
-		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
-$input appears to a be a maildir, not $ifmt
-EOM
-		PublicInbox::MdirReader::maildir_each_eml($input,
-					\&_mark_maildir, $self);
-	} else {
-		$lei->fail("$input unsupported (TODO)");
-	}
-}
-
-sub mark_stdin {
-	my ($self) = @_;
-	my $lei = $self->{lei};
-	my $in = delete $self->{0};
-	$self->input_fh($lei->{opt}->{'in-format'}, $in, '<stdin>');
-}
-
 sub note_missing {
 	my ($self) = @_;
 	$self->{lei}->child_error(1 << 8) if $self->{missing};


* [PATCH 5/5] lei: improve management around short-lived workers
  2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
                   ` (3 preceding siblings ...)
  2021-03-23 11:48 ` [PATCH 4/5] lei_input: more common code between <mark|convert|import> Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC
  To: meta

Instead of creating short-lived circular references,
ensure they don't exist in the first place.

Note the following changes to hold an extra ref to $sto:

	-	$self->_lei_store(1)->write_prepare($self);
	+	my $sto = $self->_lei_store(1);
	+	$sto->write_prepare($self);

I'm not a perlguts expert.  I actually wanted to switch to the
one-line version for LeiImport, but xt/lei-auth-fail.t was
getting stuck for some reason.  It seems the extra ref to the
LeiStore ($sto) object is necessary.
---
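
As a generic illustration of why circular references matter under Perl's
reference counting (this is not the lei code itself): an object that is
part of a cycle is not destroyed when its lexicals go out of scope,
whereas one without a back-reference is. My::Worker is a hypothetical
class that only counts DESTROY calls.

```perl
use strict;
use warnings;

my $destroyed = 0;

package My::Worker; # hypothetical, only counts destruction
sub DESTROY { $destroyed++ }

package main;
{
	my $lei = {};
	my $self = bless { lei => $lei }, 'My::Worker';
	$lei->{wrk} = $self; # cycle: $lei -> $self -> $lei
}
my $with_cycle = $destroyed; # the cycle keeps both alive past the block

{
	my $lei = {};
	my $self = bless {}, 'My::Worker'; # no back-reference, no cycle
	$lei->{wrk} = $self;
}
my $without_cycle = $destroyed; # freed as soon as the block ends
```

Objects stuck in a cycle are only cleaned up at global destruction unless
something breaks the cycle explicitly, which is the kind of bookkeeping
this patch tries to avoid.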
 lib/PublicInbox/LEI.pm         |  1 -
 lib/PublicInbox/LeiConvert.pm  |  3 ++-
 lib/PublicInbox/LeiExternal.pm |  3 ++-
 lib/PublicInbox/LeiImport.pm   | 20 +++++++-------------
 lib/PublicInbox/LeiMark.pm     |  2 +-
 lib/PublicInbox/LeiMirror.pm   |  2 +-
 lib/PublicInbox/LeiP2q.pm      |  5 +++--
 lib/PublicInbox/LeiQuery.pm    |  2 +-
 8 files changed, 17 insertions(+), 21 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index d3ac19b2..8cbaac01 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -462,7 +462,6 @@ sub _lei_atfork_child {
 		open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
 		delete $self->{0};
 	}
-	delete @$self{qw(cnv mark imp)};
 	for (delete @$self{qw(3 old_1 au_done)}) {
 		close($_) if defined($_);
 	}
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index bc86fe25..0cc65108 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -46,13 +46,14 @@ sub lei_convert { # the main "lei convert" method
 	my ($lei, @inputs) = @_;
 	$lei->{opt}->{kw} //= 1;
 	$lei->{opt}->{dedupe} //= 'none';
-	my $self = $lei->{cnv} = bless {}, __PACKAGE__;
+	my $self = bless {}, __PACKAGE__;
 	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
 	$lei->{l2m} or return
 		$lei->fail("output not specified or is not a mail destination");
 	$lei->{opt}->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
 	$self->prepare_inputs($lei, \@inputs) or return;
 	my $op = $lei->workers_start($self, 'lei_convert', 1);
+	$lei->{cnv} = $self;
 	$self->wq_io_do('do_convert', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 9a555831..56d6ef39 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -144,7 +144,8 @@ sub add_external_finish {
 
 sub lei_add_external {
 	my ($self, $location) = @_;
-	$self->_lei_store(1)->write_prepare($self);
+	my $sto = $self->_lei_store(1);
+	$sto->write_prepare($self);
 	my $opt = $self->{opt};
 	my $mirror = $opt->{mirror} // do {
 		my @fail;
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 991c84f2..9da6b7f9 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -58,9 +58,13 @@ sub net_merge_complete { # callback used by LeiAuth
 	$self->wq_close(1);
 }
 
-sub import_start {
-	my ($lei) = @_;
-	my $self = $lei->{imp};
+sub lei_import { # the main "lei import" method
+	my ($lei, @inputs) = @_;
+	my $sto = $lei->_lei_store(1);
+	$sto->write_prepare($lei);
+	my $self = bless {}, __PACKAGE__;
+	$self->{-import_kw} = $lei->{opt}->{kw} // 1;
+	$self->prepare_inputs($lei, \@inputs) or return;
 	$lei->ale; # initialize for workers to read
 	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $net = $lei->{net}) {
@@ -79,16 +83,6 @@ sub import_start {
 	while ($op && $op->{sock}) { $op->event_step }
 }
 
-sub lei_import { # the main "lei import" method
-	my ($lei, @inputs) = @_;
-	my $sto = $lei->_lei_store(1);
-	$sto->write_prepare($lei);
-	my $self = $lei->{imp} = bless {}, __PACKAGE__;
-	$self->{-import_kw} = $lei->{opt}->{kw} // 1;
-	$self->prepare_inputs($lei, \@inputs) or return;
-	import_start($lei);
-}
-
 no warnings 'once';
 *ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
 
diff --git a/lib/PublicInbox/LeiMark.pm b/lib/PublicInbox/LeiMark.pm
index 3b5e6c2c..9d77f4b4 100644
--- a/lib/PublicInbox/LeiMark.pm
+++ b/lib/PublicInbox/LeiMark.pm
@@ -105,8 +105,8 @@ sub input_net_cb { # imap_each, nntp_each cb
 sub lei_mark { # the "lei mark" method
 	my ($lei, @argv) = @_;
 	my $sto = $lei->_lei_store(1);
-	my $self = $lei->{mark} = bless { missing => 0 }, __PACKAGE__;
 	$sto->write_prepare($lei);
+	my $self = bless { missing => 0 }, __PACKAGE__;
 	$lei->ale; # refresh and prepare
 	my $vmd_mod = vmd_mod_extract(\@argv);
 	return $lei->fail(join("\n", @{$vmd_mod->{err}})) if $vmd_mod->{err};
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index c916f2d0..6e62625d 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -269,7 +269,6 @@ sub do_mirror { # via wq_io_do
 sub start {
 	my ($cls, $lei, $src, $dst) = @_;
 	my $self = bless { lei => $lei, src => $src, dst => $dst }, $cls;
-	$lei->{mrr} = $self;
 	if ($src =~ m!https?://!) {
 		require URI;
 		require PublicInbox::LeiCurl;
@@ -281,6 +280,7 @@ sub start {
 	my $op = $lei->workers_start($self, 'lei_mirror', 1, {
 		'' => [ \&mirror_done, $lei ]
 	});
+	$lei->{mrr} = $self;
 	$self->wq_io_do('do_mirror', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index 0f7ffb5f..fda055fe 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -176,13 +176,14 @@ sub do_p2q { # via wq_do
 
 sub lei_p2q { # the "lei patch-to-query" entry point
 	my ($lei, $input) = @_;
-	my $self = $lei->{p2q} = bless {}, __PACKAGE__;
+	my $self = bless {}, __PACKAGE__;
 	if ($lei->{opt}->{stdin}) {
 		$self->{0} = delete $lei->{0}; # guard from _lei_atfork_child
 	} else {
 		$self->{input} = $input;
 	}
-	my $op = $lei->workers_start($self, 'lei patch2query', 1);
+	my $op = $lei->workers_start($self, 'lei_p2q', 1);
+	$lei->{p2q} = $self;
 	$self->wq_io_do('do_p2q', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 148e8524..84996e7e 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -50,11 +50,11 @@ sub lei_q {
 	# --local is enabled by default unless --only is used
 	# we'll allow "--only $LOCATION --local"
 	my $sto = $self->_lei_store(1);
-	my $lse = $sto->search;
 	if (($opt->{'import-remote'} //= 1) |
 			(($opt->{'import-before'} //= \1) ? 1 : 0)) {
 		$sto->write_prepare($self);
 	}
+	my $lse = $sto->search;
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
 		$lxs->prepare_external($lse);
 	}
