user/dev discussion of public-inbox itself
* [PATCH 00/19] lei import Maildir, remote mboxrd fixes
@ 2021-02-07  8:51 63% Eric Wong
  2021-02-07  8:51 37% ` [PATCH 03/19] lei add-external: handle interrupts with --mirror Eric Wong
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

"lei q" with remote mboxrd + early MUA spawning is
nicer, too.  Several risky constructs are eliminated.

Interrupting "add-external --mirror" is less bad now,
though it could probably support indexlevel=none in
case somebody wants to run index themselves.

Eric Wong (19):
  spawn: pi_fork_exec: restore parent sigmask in child
  spawn: pi_fork_exec: support "pgid"
  lei add-external: handle interrupts with --mirror
  spawn_pp: die more consistently in child
  ipc: do not die inside wq_worker child process
  ipc: trim down the Storable checks
  Makefile.PL: depend on IO::Uncompress::Gunzip
  xapcmd: avoid potential die surprise in children
  tests: guard setup_public_inboxes for SQLite and Xapian
  Revert "ipc: add support for asynchronous callbacks"
  ipc: wq_do => wq_io_do
  lei: more consistent IPC exit and error handling
  lei: remove --mua-cmd alias for --mua
  lei: replace --thread with --threads
  lei q: improve remote mboxrd UX
  lei q: SIGWINCH process group with the terminal
  lei import: support Maildirs
  imap: avoid unnecessary delete on stack
  httpd/async: avoid unnecessary on-stack delete

 Documentation/lei-q.pod        |   4 +-
 MANIFEST                       |   1 +
 Makefile.PL                    |   1 +
 lib/PublicInbox/HTTPD/Async.pm |   2 +-
 lib/PublicInbox/IMAP.pm        |   6 +-
 lib/PublicInbox/IPC.pm         | 105 +++++++-----------------
 lib/PublicInbox/LEI.pm         |  49 +++++++----
 lib/PublicInbox/LeiCurl.pm     |  11 ++-
 lib/PublicInbox/LeiHelp.pm     |   6 +-
 lib/PublicInbox/LeiImport.pm   |  38 ++++++---
 lib/PublicInbox/LeiMirror.pm   |  75 ++++++++++-------
 lib/PublicInbox/LeiOverview.pm |   7 +-
 lib/PublicInbox/LeiQuery.pm    |   4 +-
 lib/PublicInbox/LeiStore.pm    |   8 +-
 lib/PublicInbox/LeiToMail.pm   |  37 ++++-----
 lib/PublicInbox/LeiXSearch.pm  | 143 ++++++++++++++++++++-------------
 lib/PublicInbox/Mbox.pm        |   2 +-
 lib/PublicInbox/OnDestroy.pm   |   2 +-
 lib/PublicInbox/Search.pm      |   2 +-
 lib/PublicInbox/SearchView.pm  |   2 +-
 lib/PublicInbox/Spawn.pm       |  63 +++++++++------
 lib/PublicInbox/SpawnPP.pm     |  44 +++++-----
 lib/PublicInbox/Xapcmd.pm      |  11 +--
 script/lei                     |   8 +-
 t/ipc.t                        |  39 ++-------
 t/lei-externals.t              |   2 +
 t/lei-import-maildir.t         |  33 ++++++++
 t/lei-mirror.t                 |  14 ++++
 t/lei.t                        |   2 +-
 t/lei_to_mail.t                |   6 +-
 t/spawn.t                      |  18 +++++
 xt/stress-sharedkv.t           |   6 +-
 32 files changed, 433 insertions(+), 318 deletions(-)
 create mode 100644 t/lei-import-maildir.t


^ permalink raw reply	[relevance 63%]

* [PATCH 03/19] lei add-external: handle interrupts with --mirror
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
@ 2021-02-07  8:51 37% ` Eric Wong
  2021-02-07  8:51 50% ` [PATCH 12/19] lei: more consistent IPC exit and error handling Eric Wong
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

This also updates lei_xsearch to follow the same pattern for
stopping curl(1) and tail(1) processes it spawns.
---
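The pattern this patch applies to curl(1), tail(1) and git(1)
children can be sketched in isolation.  This is a minimal sketch
under stated assumptions, not the code from the patch: "Guard" is
a hypothetical stand-in for PublicInbox::OnDestroy, run_guarded()
and its $interrupt flag are invented for illustration, and plain
fork/exec replaces spawn() with its "pgid" option.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for PublicInbox::OnDestroy: runs a callback
# when the object goes out of scope, unless it was emptied first.
package Guard {
	sub new { my ($cls, @cb_args) = @_; bless [ @cb_args ], $cls }
	sub DESTROY {
		my ($cb, @args) = @{$_[0]};
		$cb->(@args) if $cb;
	}
}

# SIGINT the whole process group led by $pgid, then reap the leader.
sub sigint_reap {
	my ($pgid) = @_;
	waitpid($pgid, 0) if kill('-INT', $pgid);
}

# Run @$cmd in its own process group.  A true $interrupt simulates
# leaving the scope early (e.g. exit from a signal handler).
sub run_guarded {
	my ($cmd, $interrupt) = @_;
	my $pid = fork // die "fork: $!";
	if ($pid == 0) {
		POSIX::setpgid(0, 0); # child leads its own process group
		exec(@$cmd) or POSIX::_exit(127);
	}
	POSIX::setpgid($pid, $pid); # also done in the parent to avoid a race
	my $reap = Guard->new(\&sigint_reap, $pid);
	return if $interrupt; # $reap is destroyed here, group gets SIGINT
	waitpid($pid, 0) == $pid or die "waitpid @$cmd: $!";
	@$reap = (); # normal completion: cancel the guard
	$?;
}

my $status = run_guarded([$^X, '-e', 'exit 3']);
```

Because the guard is an ordinary lexical, any early return or
die between the spawn and the waitpid takes the child (and
anything it spawned in the same group) down with SIGINT, while
the normal path cancels the guard by emptying the array.
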
 lib/PublicInbox/IPC.pm        |  5 +--
 lib/PublicInbox/LEI.pm        |  6 ++++
 lib/PublicInbox/LeiMirror.pm  | 66 +++++++++++++++++++++++------------
 lib/PublicInbox/LeiXSearch.pm | 21 +++++------
 lib/PublicInbox/OnDestroy.pm  |  2 +-
 t/lei-mirror.t                | 12 +++++++
 6 files changed, 74 insertions(+), 38 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 0dee2a92..b936c27a 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -150,9 +150,10 @@ sub ipc_worker_reap { # dwaitpid callback
 }
 
 sub wq_wait_old {
-	my ($self, $args) = @_;
+	my ($self, @args) = @_;
+	my $cb = ref($args[0]) eq 'CODE' ? shift(@args) : \&ipc_worker_reap;
 	my $pids = delete $self->{"-wq_old_pids.$$"} or return;
-	dwaitpid($_, \&ipc_worker_reap, [$self, $args]) for @$pids;
+	dwaitpid($_, $cb, [$self, @args]) for @$pids;
 }
 
 # for base class, override in sub classes
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 3098ade7..515bc2a3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -370,6 +370,12 @@ sub sigpipe_handler { # handles SIGPIPE from @WQ_KEYS workers
 	fail_handler($_[0], 13, delete $_[0]->{1});
 }
 
+# PublicInbox::OnDestroy callback for SIGINT to take out the entire pgid
+sub sigint_reap {
+	my ($pgid) = @_;
+	dwaitpid($pgid) if kill('-INT', $pgid);
+}
+
 sub fail ($$;$) {
 	my ($self, $buf, $exit_code) = @_;
 	err($self, $buf) if defined $buf;
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index bb172e6a..13795a58 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -10,13 +10,19 @@ use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
 use PublicInbox::Spawn qw(popen_rd spawn);
 use PublicInbox::PktOp;
 
+sub do_finish_mirror { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($mrr, $lei) = @$arg;
+	if ($? == 0 && unlink("$mrr->{dst}/mirror.done")) {
+		$lei->add_external_finish($mrr->{dst});
+	}
+	$lei->dclose;
+}
+
 sub mirror_done { # EOF callback for main daemon
 	my ($lei) = @_;
-	my $mrr = delete $lei->{mrr};
-	$mrr->wq_wait_old($lei) if $mrr;
-	# FIXME: check $? before finish
-	$lei->add_external_finish($mrr->{dst});
-	$lei->dclose;
+	my $mrr = delete $lei->{mrr} or return;
+	$mrr->wq_wait_old(\&do_finish_mirror, $lei);
 }
 
 # for old installations without manifest.js.gz
@@ -59,8 +65,9 @@ E: confused by scraping <$uri>, got ambiguous results:
 }
 
 sub clone_cmd {
-	my ($lei) = @_;
+	my ($lei, $opt) = @_;
 	my @cmd = qw(git);
+	$opt->{$_} = $lei->{$_} for (0..2);
 	# we support "-c $key=$val" for arbitrary git config options
 	# e.g.: git -c http.proxy=socks5h://127.0.0.1:9050
 	push(@cmd, '-c', $_) for @{$lei->{opt}->{c} // []};
@@ -92,14 +99,12 @@ sub _try_config {
 	my $f = "$ce-$$.tmp";
 	open(my $fh, '+>', $f) or return $lei->err("open $f: $! (non-fatal)");
 	my $opt = { 0 => $lei->{0}, 1 => $fh, 2 => $lei->{2} };
-	$lei->qerr("# @$cmd");
-	my $pid = spawn($cmd, $lei->{env}, $opt);
-	waitpid($pid, 0) == $pid or return $lei->err("waitpid @$cmd: $!");
-	if (($? >> 8) == 22) { # 404 missing
+	my $cerr = run_reap($lei, $cmd, $opt) // return;
+	if (($cerr >> 8) == 22) { # 404 missing
 		unlink($f) if -s $fh == 0;
 		return;
 	}
-	return $lei->err("# @$cmd failed (non-fatal)") if $?;
+	return $lei->err("# @$cmd failed (non-fatal)") if $cerr;
 	rename($f, $ce) or return $lei->err("link($f, $ce): $! (non-fatal)");
 	my $cfg = PublicInbox::Config::git_config_dump($f);
 	my $ibx = $self->{ibx} = {};
@@ -132,6 +137,18 @@ sub index_cloned_inbox {
 	local %ENV = (%ENV, %$env) if $env;
 	PublicInbox::Admin::progress_prepare($opt, $lei->{2});
 	PublicInbox::Admin::index_inbox($ibx, undef, $opt);
+	open my $x, '>', "$self->{dst}/mirror.done"; # for do_finish_mirror
+}
+
+sub run_reap {
+	my ($lei, $cmd, $opt) = @_;
+	$lei->qerr("# @$cmd");
+	$opt->{pgid} = 0;
+	my $pid = spawn($cmd, $lei->{env}, $opt);
+	my $reap = PublicInbox::OnDestroy->new($lei->can('sigint_reap'), $pid);
+	my $err = waitpid($pid, 0) == $pid ? undef : "waitpid @$cmd: $!";
+	@$reap = (); # cancel reap
+	$err ? $lei->err($err) : $?
 }
 
 sub clone_v1 {
@@ -140,11 +157,10 @@ sub clone_v1 {
 	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
 	my $uri = URI->new($self->{src});
 	my $pfx = $curl->torsocks($lei, $uri) or return;
-	my $cmd = [ @$pfx, clone_cmd($lei), $uri->as_string, $self->{dst} ];
-	$lei->qerr("# @$cmd");
-	my $pid = spawn($cmd, $lei->{env}, $lei);
-	waitpid($pid, 0) == $pid or die "BUG: waitpid @$cmd: $!";
-	$? == 0 or return $lei->child_error($?, "@$cmd failed");
+	my $cmd = [ @$pfx, clone_cmd($lei, my $opt = {}),
+			$uri->as_string, $self->{dst} ];
+	my $cerr = run_reap($lei, $cmd, $opt) // return;
+	return $lei->child_error($cerr, "@$cmd failed") if $cerr;
 	_try_config($self);
 	index_cloned_inbox($self, 1);
 }
@@ -170,13 +186,11 @@ failed to extract epoch number from $src
 	my $lk = bless { lock_path => "$dst/inbox.lock" }, 'PublicInbox::Lock';
 	_try_config($self);
 	my $on_destroy = $lk->lock_for_scope($$);
-	my @cmd = clone_cmd($lei);
+	my @cmd = clone_cmd($lei, my $opt = {});
 	while (my $pair = shift(@src_edst)) {
 		my $cmd = [ @$pfx, @cmd, @$pair ];
-		$lei->qerr("# @$cmd");
-		my $pid = spawn($cmd, $lei->{env}, $lei);
-		waitpid($pid, 0) == $pid or die "BUG: waitpid @$cmd: $!";
-		$? == 0 or return $lei->child_error($?, "@$cmd failed");
+		my $cerr = run_reap($lei, $cmd, $opt) // return;
+		return $lei->child_error($cerr, "@$cmd failed") if $cerr;
 	}
 	undef $on_destroy; # unlock
 	index_cloned_inbox($self, 2);
@@ -193,9 +207,14 @@ sub try_manifest {
 	my $cmd = $curl->for_uri($lei, $uri);
 	$lei->qerr("# @$cmd");
 	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
-	my $fh = popen_rd($cmd, $lei->{env}, $opt);
+	my ($fh, $pid) = popen_rd($cmd, $lei->{env}, $opt);
+	my $reap = PublicInbox::OnDestroy->new($lei->can('sigint_reap'), $pid);
 	my $gz = do { local $/; <$fh> } // die "read(curl $uri): $!";
-	unless (close $fh) {
+	close $fh;
+	my $err = waitpid($pid, 0) == $pid ? undef : "waitpid @$cmd: $!";
+	@$reap = ();
+	return $lei->err($err) if $err;
+	if ($?) {
 		return try_scrape($self) if ($? >> 8) == 22; # 404 missing
 		return $lei->child_error($?, "@$cmd failed");
 	}
@@ -282,6 +301,7 @@ sub start {
 sub ipc_atfork_child {
 	my ($self) = @_;
 	$self->{lei}->lei_atfork_child;
+	$SIG{TERM} = sub { exit(128 + 15) }; # trigger OnDestroy $reap
 	$self->SUPER::ipc_atfork_child;
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1e5d7ca6..6a1b107b 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -197,13 +197,6 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$each_smsg->($smsg, undef, $eml);
 }
 
-# PublicInbox::OnDestroy callback
-sub kill_reap {
-	my ($pid) = @_;
-	kill('KILL', $pid); # spawn() blocks other signals
-	waitpid($pid, 0);
-}
-
 sub query_remote_mboxrd {
 	my ($self, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
@@ -213,18 +206,19 @@ sub query_remote_mboxrd {
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{thread};
 	my $verbose = $opt->{verbose};
-	my $reap;
+	my ($reap_tail, $reap_curl);
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
 	fcntl($cerr, F_SETFL, O_APPEND|O_RDWR) or warn "set O_APPEND: $!";
-	my $rdr = { 2 => $cerr };
+	my $rdr = { 2 => $cerr, pgid => 0 };
 	my $coff = 0;
+	my $sigint_reap = $lei->can('sigint_reap');
 	if ($verbose) {
 		# spawn a process to force line-buffering, otherwise curl
 		# will write 1 character at-a-time and parallel outputs
 		# mmmaaayyy llloookkk llliiikkkeee ttthhhiiisss
-		my $o = { 1 => $lei->{2}, 2 => $lei->{2} };
+		my $o = { 1 => $lei->{2}, 2 => $lei->{2}, pgid => 0 };
 		my $pid = spawn(['tail', '-f', $cerr->filename], undef, $o);
-		$reap = PublicInbox::OnDestroy->new(\&kill_reap, $pid);
+		$reap_tail = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 	}
 	my $curl = PublicInbox::LeiCurl->new($lei, $self->{curl}) or return;
 	push @$curl, '-s', '-d', '';
@@ -236,10 +230,13 @@ sub query_remote_mboxrd {
 		my $cmd = $curl->for_uri($lei, $uri);
 		$lei->err("# @$cmd") if $verbose;
 		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
+		$reap_curl = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 		$fh = IO::Uncompress::Gunzip->new($fh);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
 						$lei, $each_smsg);
-		waitpid($pid, 0) == $pid or die "BUG: waitpid (curl): $!";
+		my $err = waitpid($pid, 0) == $pid ? undef : "BUG: waitpid: $!";
+		@$reap_curl = (); # cancel OnDestroy
+		die $err if $err;
 		if ($? == 0) {
 			my $nr = $lei->{-nr_remote_eml};
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
diff --git a/lib/PublicInbox/OnDestroy.pm b/lib/PublicInbox/OnDestroy.pm
index 0ae4c4c9..615bc450 100644
--- a/lib/PublicInbox/OnDestroy.pm
+++ b/lib/PublicInbox/OnDestroy.pm
@@ -10,7 +10,7 @@ sub new {
 
 sub DESTROY {
 	my ($cb, @args) = @{$_[0]};
-	if (!ref($cb)) {
+	if (!ref($cb) && $cb) {
 		my $pid = $cb;
 		return if $pid != $$;
 		$cb = shift @args;
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index 6af49678..2373b370 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -13,15 +13,27 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	my $t1 = "$home/t1-mirror";
 	ok($lei->('add-external', $t1, '--mirror', "$http/t1/"), '--mirror v1');
 	ok(-f "$t1/public-inbox/msgmap.sqlite3", 't1-mirror indexed');
+
+	ok($lei->('ls-external'), 'ls-external');
+	like($lei_out, qr!\Q$t1\E!, 't1 added to ls-externals');
+
 	my $t2 = "$home/t2-mirror";
 	ok($lei->('add-external', $t2, '--mirror', "$http/t2/"), '--mirror v2');
 	ok(-f "$t2/msgmap.sqlite3", 't2-mirror indexed');
 
+	ok($lei->('ls-external'), 'ls-external');
+	like($lei_out, qr!\Q$t2\E!, 't2 added to ls-externals');
+
 	ok(!$lei->('add-external', $t2, '--mirror', "$http/t2/"),
 		'--mirror fails if reused');
 
+	ok($lei->('ls-external'), 'ls-external');
+	like($lei_out, qr!\Q$t2\E!, 'still in ls-externals');
+
 	ok(!$lei->('add-external', "$t2-fail", '-Lmedium'), '--mirror v2');
 	ok(!-d "$t2-fail", 'destination not created on failure');
+	ok($lei->('ls-external'), 'ls-external');
+	unlike($lei_out, qr!\Q$t2-fail\E!, 'not added to ls-external');
 });
 
 ok($td->kill, 'killed -httpd');

^ permalink raw reply related	[relevance 37%]

* [PATCH 13/19] lei: remove --mua-cmd alias for --mua
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
  2021-02-07  8:51 37% ` [PATCH 03/19] lei add-external: handle interrupts with --mirror Eric Wong
  2021-02-07  8:51 50% ` [PATCH 12/19] lei: more consistent IPC exit and error handling Eric Wong
@ 2021-02-07  8:51 56% ` Eric Wong
  2021-02-07  8:51 41% ` [PATCH 14/19] lei: replace --thread with --threads Eric Wong
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

While "mua-cmd" may be more accurate, nobody is expected
to type 4 extra characters.  It's a needless ambiguity
with no precedent or prior art to follow.

Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
---
 Documentation/lei-q.pod    | 2 +-
 lib/PublicInbox/LEI.pm     | 6 +++---
 lib/PublicInbox/LeiHelp.pm | 2 +-
 t/lei.t                    | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 5c0ca843..07c742d2 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -36,7 +36,7 @@ Pretty print C<json> or C<concatjson> output.  If stdout is opened to
 a tty and used as the C<--output> destination, C<--pretty> is enabled
 by default.
 
-=item --mua-cmd=COMMAND, --mua=COMMAND
+=item --mua=COMMAND
 
 A command to run on C<--output> Maildir or mbox (e.g., C<mutt -f %f>).
 For a subset of MUAs known to accept a mailbox via C<-f>, COMMAND can
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 21862488..818f2cfb 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -112,7 +112,7 @@ our %CMD = ( # sorted in order of importance/use:
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
-	mua-cmd|mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
+	mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
@@ -232,7 +232,7 @@ my %OPTDESC = (
 'output|mfolder|o=s' => [ 'MFOLDER',
 	"destination (e.g.\xa0`/path/to/Maildir', ".
 	"or\xa0`-'\x{a0}for\x{a0}stdout)" ],
-'mua-cmd|mua=s' => [ 'CMD',
+'mua=s' => [ 'CMD',
 	"MUA to run on --output Maildir or mbox (e.g.\xa0`mutt\xa0-f\xa0%f')" ],
 
 'show	format|f=s' => [ 'OUT|plain|raw|html|mboxrd|mboxcl2|mboxcl',
@@ -723,7 +723,7 @@ sub exec_buf ($$) {
 
 sub start_mua {
 	my ($self) = @_;
-	my $mua = $self->{opt}->{'mua-cmd'} // return;
+	my $mua = $self->{opt}->{mua} // return;
 	my $mfolder = $self->{ovv}->{dst};
 	my (@cmd, $replaced);
 	if ($mua =~ /\A(?:mutt|mailx|mail|neomutt)\z/) {
diff --git a/lib/PublicInbox/LeiHelp.pm b/lib/PublicInbox/LeiHelp.pm
index 43414ab4..e62298f7 100644
--- a/lib/PublicInbox/LeiHelp.pm
+++ b/lib/PublicInbox/LeiHelp.pm
@@ -7,7 +7,7 @@ use strict;
 use v5.10.1;
 use Text::Wrap qw(wrap);
 
-my %NOHELP = map { $_ => 1 } qw(mua-cmd mfolder);
+my %NOHELP = map { $_ => 1 } qw(mfolder);
 
 sub call {
 	my ($self, $errmsg, $CMD, $OPTDESC) = @_;
diff --git a/t/lei.t b/t/lei.t
index f789f63a..8e771eb5 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -104,7 +104,7 @@ my $test_completion = sub {
 	ok($lei->(qw(_complete lei q)), 'complete q (no args)');
 	%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	for my $sw (qw(-f --format -o --output --mfolder --augment -a
-			--mua --mua-cmd --no-local --local --verbose -v
+			--mua --no-local --local --verbose -v
 			--save-as --no-remote --remote --torsocks
 			--reverse -r )) {
 		ok($out{$sw}, "$sw offered as `lei q' completion");

^ permalink raw reply related	[relevance 56%]

* [PATCH 12/19] lei: more consistent IPC exit and error handling
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
  2021-02-07  8:51 37% ` [PATCH 03/19] lei add-external: handle interrupts with --mirror Eric Wong
@ 2021-02-07  8:51 50% ` Eric Wong
  2021-02-07  8:51 56% ` [PATCH 13/19] lei: remove --mua-cmd alias for --mua Eric Wong
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

We're able to propagate $? from wq_workers in a consistent
manner, now.
---
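The calling convention introduced below (an optional callback
followed by extra arguments, falling back to a default reaper)
can be sketched on its own.  This is an illustration only:
wait_then_call() is a blocking, hypothetical stand-in for
dwaitpid(), and default_reap(), record_error(), spawn_exit() and
the "old_pids" key are names made up for this example.

```perl
use strict;
use warnings;
use POSIX ();

my @log; # what the default callback saw

# default callback, playing the role ipc_worker_reap has in the patch
sub default_reap {
	my ($arg, $pid) = @_;
	push @log, "default:$?";
}

# custom callback: gets [ $owner, @extra ] and the PID, and reads $?
sub record_error {
	my ($arg, $pid) = @_;
	my ($owner, $ctx) = @$arg;
	$ctx->{child_error} = $? if $?;
}

# blocking stand-in for dwaitpid(): reap $pid, then call $cb
sub wait_then_call {
	my ($pid, $cb, $arg) = @_;
	waitpid($pid, 0) == $pid or die "waitpid: $!";
	$cb->($arg, $pid);
}

# same shape as the new wq_wait_old: undef $cb selects the default
sub wait_old {
	my ($self, $cb, @args) = @_;
	my $pids = delete $self->{old_pids} or return;
	wait_then_call($_, $cb // \&default_reap, [ $self, @args ]) for @$pids;
}

# fork a child which exits immediately with $code
sub spawn_exit {
	my ($code) = @_;
	my $pid = fork // die "fork: $!";
	POSIX::_exit($code) if $pid == 0;
	$pid;
}

my $ctx = {};
wait_old({ old_pids => [ spawn_exit(0), spawn_exit(3) ] },
	\&record_error, $ctx);
wait_old({ old_pids => [ spawn_exit(0) ] }); # no callback given
```

Passing the callback as a fixed positional argument is what lets
callers write wq_wait_old(undef, $lei) for the default behavior
and wq_wait_old(\&some_cb, $lei) when $? must reach the client.
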
 lib/PublicInbox/IPC.pm        | 22 +++++++++++-----------
 lib/PublicInbox/LEI.pm        |  6 +++---
 lib/PublicInbox/LeiImport.pm  | 14 ++++++++++----
 lib/PublicInbox/LeiXSearch.pm | 12 +++++++++---
 4 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 728f726c..c8673e26 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -140,10 +140,9 @@ sub ipc_worker_reap { # dwaitpid callback
 }
 
 sub wq_wait_old {
-	my ($self, @args) = @_;
-	my $cb = ref($args[0]) eq 'CODE' ? shift(@args) : \&ipc_worker_reap;
+	my ($self, $cb, @args) = @_;
 	my $pids = delete $self->{"-wq_old_pids.$$"} or return;
-	dwaitpid($_, $cb, [$self, @args]) for @$pids;
+	dwaitpid($_, $cb // \&ipc_worker_reap, [$self, @args]) for @$pids;
 }
 
 # for base class, override in sub classes
@@ -348,13 +347,12 @@ sub wq_exit { # wakes up wq_worker_decr_wait
 sub wq_worker_decr { # SIGTTOU handler, kills first idle worker
 	my ($self) = @_;
 	return unless wq_workers($self);
-	my $s2 = $self->{-wq_s2} // die 'BUG: no wq_s2';
-	$self->wq_io_do('wq_exit', [ $s2, $s2, $s2 ]);
+	$self->wq_io_do('wq_exit');
 	# caller must call wq_worker_decr_wait in main loop
 }
 
 sub wq_worker_decr_wait {
-	my ($self, $timeout) = @_;
+	my ($self, $timeout, $cb, @args) = @_;
 	return if $self->{-wq_ppid} != $$; # can't reap siblings or parents
 	my $s1 = $self->{-wq_s1} // croak 'BUG: no wq_s1';
 	vec(my $rin = '', fileno($s1), 1) = 1;
@@ -363,17 +361,17 @@ sub wq_worker_decr_wait {
 	recv($s1, my $pid, 64, 0) // croak "recv: $!";
 	my $workers = $self->{-wq_workers} // croak 'BUG: no wq_workers';
 	delete $workers->{$pid} // croak "BUG: PID:$pid invalid";
-	dwaitpid($pid, \&ipc_worker_reap, $self);
+	dwaitpid($pid, $cb // \&ipc_worker_reap, [ $self, @args ]);
 }
 
 # set or retrieve number of workers
 sub wq_workers {
-	my ($self, $nr) = @_;
+	my ($self, $nr, $cb, @args) = @_;
 	my $cur = $self->{-wq_workers} or return;
 	if (defined $nr) {
 		while (scalar(keys(%$cur)) > $nr) {
 			$self->wq_worker_decr;
-			$self->wq_worker_decr_wait;
+			$self->wq_worker_decr_wait(undef, $cb, @args);
 		}
 		$self->wq_worker_incr while scalar(keys(%$cur)) < $nr;
 	}
@@ -381,7 +379,7 @@ sub wq_workers {
 }
 
 sub wq_close {
-	my ($self, $nohang) = @_;
+	my ($self, $nohang, $cb, @args) = @_;
 	delete @$self{qw(-wq_s1 -wq_s2)} or return;
 	my $ppid = delete $self->{-wq_ppid} or return;
 	my $workers = delete $self->{-wq_workers} // die 'BUG: no wq_workers';
@@ -390,7 +388,9 @@ sub wq_close {
 	if ($nohang) {
 		push @{$self->{"-wq_old_pids.$$"}}, @pids;
 	} else {
-		dwaitpid($_, \&ipc_worker_reap, $self) for @pids;
+		$cb //= \&ipc_worker_reap;
+		unshift @args, $self;
+		dwaitpid($_, $cb, \@args) for @pids;
 	}
 }
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 515bc2a3..21862488 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -360,7 +360,7 @@ sub fail_handler ($;$$) {
 	my ($lei, $code, $io) = @_;
 	for my $f (@WQ_KEYS) {
 		my $wq = delete $lei->{$f} or next;
-		$wq->wq_wait_old($lei) if $wq->wq_kill_old; # lei-daemon
+		$wq->wq_wait_old(undef, $lei) if $wq->wq_kill_old; # lei-daemon
 	}
 	close($io) if $io; # needed to avoid warnings on SIGPIPE
 	$lei->x_it($code // (1 >> 8));
@@ -827,9 +827,9 @@ sub dclose {
 	for my $f (@WQ_KEYS) {
 		my $wq = delete $self->{$f} or next;
 		if ($wq->wq_kill) {
-			$wq->wq_close
+			$wq->wq_close(0, undef, $self);
 		} elsif ($wq->wq_kill_old) {
-			$wq->wq_wait_old($self);
+			$wq->wq_wait_old(undef, $self);
 		}
 	}
 	close(delete $self->{1}) if $self->{1}; # may reap_compress
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 3a99570e..2b2dc2f7 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -14,12 +14,18 @@ sub _import_eml { # MboxReader callback
 	$sto->ipc_do('set_eml', $eml, $set_kw ? $sto->mbox_keywords($eml) : ());
 }
 
+sub import_done_wait { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($imp, $lei) = @$arg;
+	$lei->child_error($?, 'non-fatal errors during import') if $?;
+	my $ign = $lei->{sto}->ipc_do('done'); # PublicInbox::LeiStore::done
+	$lei->dclose;
+}
+
 sub import_done { # EOF callback for main daemon
 	my ($lei) = @_;
-	my $imp = delete $lei->{imp};
-	$imp->wq_wait_old($lei) if $imp;
-	my $wait = $lei->{sto}->ipc_do('done');
-	$lei->dclose;
+	my $imp = delete $lei->{imp} or return;
+	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
 sub call { # the main "lei import" method
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1ba767c1..1024b020 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -279,12 +279,18 @@ sub git_tmp ($) {
 	$git;
 }
 
+sub xsearch_done_wait { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($wq, $lei) = @$arg;
+	$lei->child_error($?, 'non-fatal error from '.ref($wq)) if $?;
+}
+
 sub query_done { # EOF callback for main daemon
 	my ($lei) = @_;
 	my $l2m = delete $lei->{l2m};
-	$l2m->wq_wait_old($lei) if $l2m;
+	$l2m->wq_wait_old(\&xsearch_done_wait, $lei) if $l2m;
 	if (my $lxs = delete $lei->{lxs}) {
-		$lxs->wq_wait_old($lei);
+		$lxs->wq_wait_old(\&xsearch_done_wait, $lei);
 	}
 	$lei->{ovv}->ovv_end($lei);
 	if ($l2m) { # close() calls LeiToMail reap_compress
@@ -309,7 +315,7 @@ sub do_post_augment {
 	if (my $err = $@) {
 		if (my $lxs = delete $lei->{lxs}) {
 			$lxs->wq_kill;
-			$lxs->wq_close;
+			$lxs->wq_close(0, undef, $lei);
 		}
 		$lei->fail("$err");
 	}

^ permalink raw reply related	[relevance 50%]

* [PATCH 14/19] lei: replace --thread with --threads
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-07  8:51 56% ` [PATCH 13/19] lei: remove --mua-cmd alias for --mua Eric Wong
@ 2021-02-07  8:51 41% ` Eric Wong
  2021-02-07  8:51 33% ` [PATCH 15/19] lei q: improve remote mboxrd UX Eric Wong
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

Nobody is expected to use long options, but for consistency
with mairix(1), we'll use the pluralized option throughout
(including existing PublicInbox::{Search,SearchView}).

Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
---
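The reason every $opt->{thread} lookup in this diff changes is
that Getopt::Long stores a value under the first (primary) name
in the option spec when parsing into a hash.  A minimal sketch
using only Getopt::Long follows; parse() is a helper written for
this example and is not part of lei.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse @argv against a single option spec and return the hash
# which Getopt::Long filled in.
sub parse {
	my ($spec, @argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt, $spec)
		or die "bad options: @argv";
	\%opt;
}

my $old = parse('thread|t', '-t');  # value lands in $old->{thread}
my $new = parse('threads|t', '-t'); # value lands in $new->{threads}
```

The short option is unchanged, but code reading the hash must use
the new primary name, hence the mechanical renames below.
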
 Documentation/lei-q.pod       |  2 +-
 lib/PublicInbox/LEI.pm        | 16 ++++++++--------
 lib/PublicInbox/LeiHelp.pm    |  4 ++--
 lib/PublicInbox/LeiQuery.pm   |  4 ++--
 lib/PublicInbox/LeiXSearch.pm | 12 ++++++------
 lib/PublicInbox/Mbox.pm       |  2 +-
 lib/PublicInbox/Search.pm     |  2 +-
 lib/PublicInbox/SearchView.pm |  2 +-
 8 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 07c742d2..8f053a55 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -47,7 +47,7 @@ or C<neomutt>.
 
 Augment output destination instead of clobbering it.
 
-=item -t, --thread
+=item -t, --threads
 
 Return all messages in the same thread as the actual match(es).
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 818f2cfb..31e6b4a8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -109,14 +109,14 @@ sub index_opt {
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms', qw(
-	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
+	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
 	mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
-	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
+	qw(type=s solve! format|f=s dedupe|d=s threads|t remote local!),
 	pass_through('git show') ],
 
 'add-external' => [ 'LOCATION',
@@ -135,9 +135,9 @@ our %CMD = ( # sorted in order of importance/use:
 'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
 'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search' ],
 
-'plonk' => [ '--thread|--from=IDENT',
-	'exclude mail matching From: or thread from non-Message-ID searches',
-	qw(stdin| thread|t from|f=s mid=s oid=s) ],
+'plonk' => [ '--threads|--from=IDENT',
+	'exclude mail matching From: or threads from non-Message-ID searches',
+	qw(stdin| threads|t from|f=s mid=s oid=s) ],
 'mark' => [ 'MESSAGE_FLAGS...',
 	'set/unset keywords on message(s) from stdin',
 	qw(stdin| oid=s exact by-mid|mid:s) ],
@@ -224,9 +224,9 @@ my %OPTDESC = (
 
 'dedupe|d=s' => ['STRATEGY|content|oid|mid|none',
 		'deduplication strategy'],
-'show	thread|t' => 'display entire thread a message belongs to',
-'q	thread|t' =>
-	'return all messages in the same thread as the actual match(es)',
+'show	threads|t' => 'display entire thread a message belongs to',
+'q	threads|t' =>
+	'return all messages in the same threads as the actual match(es)',
 'augment|a' => 'augment --output destination instead of clobbering',
 
 'output|mfolder|o=s' => [ 'MFOLDER',
diff --git a/lib/PublicInbox/LeiHelp.pm b/lib/PublicInbox/LeiHelp.pm
index e62298f7..a654e1c2 100644
--- a/lib/PublicInbox/LeiHelp.pm
+++ b/lib/PublicInbox/LeiHelp.pm
@@ -40,7 +40,7 @@ sub call {
 			@vals = (' [', undef, ']');
 		} elsif ($x =~ s/=.+//) { # required arg: $x = "type=s"
 			@vals = (' ', undef);
-		} # else: no args $x = 'thread|t'
+		} # else: no args $x = 'threads|t'
 
 		# we support underscore options from public-inbox-* commands;
 		# but they've never been documented and will likely go away.
@@ -48,7 +48,7 @@ sub call {
 		for (grep { !/_/ && !$NOHELP{$_} } split(/\|/, $x)) {
 			length($_) > 1 ? push(@l, "--$_") : push(@s, "-$_");
 		}
-		if (!scalar(@vals)) { # no args 'thread|t'
+		if (!scalar(@vals)) { # no args 'threads|t'
 		} elsif ($arg_vals =~ s/\A([A-Z_]+)\b//) { # "NAME"
 			$vals[1] = $1;
 		} else {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 0346498f..9a6fa718 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -81,7 +81,7 @@ sub lei_q {
 	$self->{l2m}->{jobs} = ($mj // $nproc) if $self->{l2m};
 	PublicInbox::LeiOverview->new($self) or return;
 
-	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
+	my %mset_opt = map { $_ => $opt->{$_} } qw(threads limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
 	$mset_opt{limit} //= 10000;
 	if (defined(my $sort = $opt->{'sort'})) {
@@ -96,7 +96,7 @@ sub lei_q {
 		}
 	}
 	# descending docid order
-	$mset_opt{relevance} //= -2 if $opt->{thread};
+	$mset_opt{relevance} //= -2 if $opt->{threads};
 	$self->{mset_opt} = \%mset_opt;
 
 	if ($opt->{stdin}) {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1024b020..2794140a 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -118,7 +118,7 @@ sub mset_progress {
 	}
 }
 
-sub query_thread_mset { # for --thread
+sub query_thread_mset { # for --threads
 	my ($self, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
 	my $lei = $self->{lei};
@@ -151,7 +151,7 @@ sub query_thread_mset { # for --thread
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
-sub query_mset { # non-parallel for non-"--thread" users
+sub query_mset { # non-parallel for non-"--threads" users
 	my ($self) = @_;
 	local $0 = "$0 query_mset";
 	my $lei = $self->{lei};
@@ -204,7 +204,7 @@ sub query_remote_mboxrd {
 	my $lei = $self->{lei};
 	my ($opt, $env) = @$lei{qw(opt env)};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
-	push(@qform, t => 1) if $opt->{thread};
+	push(@qform, t => 1) if $opt->{threads};
 	my $verbose = $opt->{verbose};
 	my ($reap_tail, $reap_curl);
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
@@ -326,7 +326,7 @@ my $MAX_PER_HOST = 4;
 
 sub concurrency {
 	my ($self, $opt) = @_;
-	my $nl = $opt->{thread} ? locals($self) : 1;
+	my $nl = $opt->{threads} ? locals($self) : 1;
 	my $nr = remotes($self);
 	$nr = $MAX_PER_HOST if $nr > $MAX_PER_HOST;
 	$nl + $nr;
@@ -337,7 +337,7 @@ sub start_query { # always runs in main (lei-daemon) process
 	if (my $l2m = $lei->{l2m}) {
 		$lei->start_mua if $l2m->lock_free;
 	}
-	if ($lei->{opt}->{thread}) {
+	if ($lei->{opt}->{threads}) {
 		for my $ibxish (locals($self)) {
 			$self->wq_io_do('query_thread_mset', [], $ibxish);
 		}
@@ -393,7 +393,7 @@ sub do_query {
 		# 1031: F_SETPIPE_SZ
 		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
 	}
-	if (!$lei->{opt}->{thread} && locals($self)) { # for query_mset
+	if (!$lei->{opt}->{threads} && locals($self)) { # for query_mset
 		# lei->{git_tmp} is set for wq_wait_old so we don't
 		# delete until all lei2mail + lei_xsearch workers are reaped
 		$lei->{git_tmp} = $self->{git_tmp} = git_tmp($self);
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 964147fa..1fca356b 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -236,7 +236,7 @@ sub mbox_all {
 		return PublicInbox::WWW::need($ctx, 'Overview');
 
 	my $qopts = $ctx->{qopts} = { relevance => -1 }; # ORDER BY docid ASC
-	$qopts->{thread} = 1 if $q->{t};
+	$qopts->{threads} = 1 if $q->{t};
 	my $mset = $srch->mset($q_string, $qopts);
 	$qopts->{offset} = $mset->size or
 			return [404, [qw(Content-Type text/plain)],
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 7c6a16be..dbae3bc5 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -336,7 +336,7 @@ sub _enquire_once { # retry_reopen callback
 	}
 
 	# `mairix -t / --threads' or JMAP collapseThreads
-	if ($opts->{thread} && has_threadid($self)) {
+	if ($opts->{threads} && has_threadid($self)) {
 		$enquire->set_collapse_key(THREADID);
 	}
 	$enquire->get_mset($opts->{offset} || 0, $opts->{limit} || 50);
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index d50d3cf6..08c77f35 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -48,7 +48,7 @@ sub sres_top_html {
 		limit => $q->{l},
 		offset => $o,
 		relevance => $q->{r},
-		thread => $q->{t},
+		threads => $q->{t},
 		asc => $asc,
 	};
 	my ($mset, $total, $err, $html);

^ permalink raw reply related	[relevance 41%]

* [PATCH 16/19] lei q: SIGWINCH process group with the terminal
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-07  8:51 33% ` [PATCH 15/19] lei q: improve remote mboxrd UX Eric Wong
@ 2021-02-07  8:51 65% ` Eric Wong
  2021-02-07  8:51 43% ` [PATCH 17/19] lei import: support Maildirs Eric Wong
  2021-02-07 10:40 71% ` [PATCH 21/19] lei q: fix arbitrary --mua command handling Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

While using utime on the destination Maildir is enough for mutt
to eventually notice new mail, "eventually" isn't good enough.

Send a SIGWINCH to wake mutt (and likely other MUAs)
immediately.  This is more portable than relying on MUAs to
support inotify or EVFILT_VNODE.
---
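A rough sketch of the mechanism, for anyone unfamiliar with signalling
a process group from Perl.  The helper name poke_pgrp and the counter
are made up for illustration and are not part of this patch; the
handler stands in for whatever the MUA installs for SIGWINCH.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the patch): signal our own process
# group, similar in spirit to the kill('-WINCH', $$) call above.
sub poke_pgrp {
	my ($sig) = @_;
	# a leading "-" on the signal name targets a process group
	kill("-$sig", getpgrp(0));
}

my $nr_winch = 0;
$SIG{WINCH} = sub { $nr_winch++ }; # stand-in for the MUA's handler
poke_pgrp('WINCH') or warn "kill: $!";
print "WINCH handlers run: $nr_winch\n";
```

SIGWINCH is ignored by default, so processes in the group which
do not install a handler should be unaffected.
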
 lib/PublicInbox/LEI.pm        | 11 +++++++++++
 lib/PublicInbox/LeiXSearch.pm |  7 ++++++-
 script/lei                    |  8 +++++---
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e52154e5..00affe82 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -746,6 +746,17 @@ sub start_mua {
 	}
 }
 
+sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
+	my ($self) = @_;
+	return unless $self->{opt}->{mua} && -t $self->{1};
+	# hit the process group that started the MUA
+	if (my $s = $self->{sock}) {
+		send($s, '-WINCH', MSG_EOR);
+	} elsif ($self->{oneshot}) {
+		kill('-WINCH', $$);
+	}
+}
+
 # caller needs to "-t $self->{1}" to check if tty
 sub start_pager {
 	my ($self) = @_;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 0e99e4b4..a7668a17 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -321,7 +321,12 @@ Error closing $lei->{ovv}->{dst}: $!
 			}
 			$lei->{1} = $out;
 		}
-		$l2m->lock_free ? $l2m->poke_dst : $lei->start_mua;
+		if ($l2m->lock_free) {
+			$l2m->poke_dst;
+			$lei->poke_mua;
+		} else { # mbox users
+			$lei->start_mua;
+		}
 	}
 	$lei->{-progress} and
 		$lei->err('# ', $lei->{-mset_total} // 0, " matches");
diff --git a/script/lei b/script/lei
index b7f21f14..0b0e2976 100755
--- a/script/lei
+++ b/script/lei
@@ -105,13 +105,15 @@ Falling back to (slow) one-shot mode
 			die "recvmsg: $!";
 		}
 		last if $buf eq '';
-		if ($buf =~ /\Ax_it ([0-9]+)\z/) {
+		if ($buf =~ /\Aexec (.+)\z/) {
+			$exec_cmd->(\@fds, split(/\0/, $1));
+		} elsif ($buf eq '-WINCH') {
+			kill($buf, $$); # for MUA
+		} elsif ($buf =~ /\Ax_it ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
 			last;
 		} elsif ($buf =~ /\Achild_error ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
-		} elsif ($buf =~ /\Aexec (.+)\z/) {
-			$exec_cmd->(\@fds, split(/\0/, $1));
 		} else {
 			$sigchld->();
 			die $buf;

^ permalink raw reply related	[relevance 65%]

* [PATCH 17/19] lei import: support Maildirs
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-07  8:51 65% ` [PATCH 16/19] lei q: SIGWINCH process group with the terminal Eric Wong
@ 2021-02-07  8:51 43% ` Eric Wong
  2021-02-07 10:40 71% ` [PATCH 21/19] lei q: fix arbitrary --mua command handling Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

It seems to be working trivially, though I'm probably
going to split out Maildir reading into a separate
package rather than using LeiToMail.
---
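For readers following the test below: keywords come from the Maildir
"info" suffix of each filename (the part after ":2,").  This is only a
sketch of that mapping.  The function name flags_to_keywords and the
full table are assumptions; only "S" => seen and "R" => answered are
exercised by t/lei-import-maildir.t in this patch.

```perl
use strict;
use warnings;

# Assumed flag-to-keyword table; only S and R are confirmed by the
# test in this patch.
my %FLAG2KW = (D => 'draft', F => 'flagged', R => 'answered', S => 'seen');

# Hypothetical stand-in for maildir_keywords()
sub flags_to_keywords {
	my ($path) = @_;
	my ($base) = ($path =~ m!([^/]+)\z!) or return;
	my ($flags) = ($base =~ /:2,([A-Za-z]*)\z/) or return;
	my @kw = grep { defined($_) } map { $FLAG2KW{$_} } split(//, $flags);
	@kw = sort @kw; # stable order regardless of flag order
	return @kw;
}

my @kw = flags_to_keywords('md/cur/x:2,SR');
print "@kw\n"; # answered seen
```

Unknown flags are silently dropped in this sketch; the real code may
choose to handle them differently.
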
 MANIFEST                     |  1 +
 lib/PublicInbox/LeiImport.pm | 20 +++++++++++++++++---
 lib/PublicInbox/LeiStore.pm  |  8 +++++++-
 lib/PublicInbox/LeiToMail.pm | 11 ++++++-----
 t/lei-import-maildir.t       | 33 +++++++++++++++++++++++++++++++++
 t/lei_to_mail.t              |  6 +++---
 6 files changed, 67 insertions(+), 12 deletions(-)
 create mode 100644 t/lei-import-maildir.t

diff --git a/MANIFEST b/MANIFEST
index 521f1f68..7f417743 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -359,6 +359,7 @@ t/iso-2202-jp.eml
 t/kqnotify.t
 t/lei-daemon.t
 t/lei-externals.t
+t/lei-import-maildir.t
 t/lei-import.t
 t/lei-mirror.t
 t/lei.t
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 2b2dc2f7..a63bfdfd 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -8,6 +8,8 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::MboxReader;
 use PublicInbox::Eml;
+use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::PktOp;
 
 sub _import_eml { # MboxReader callback
 	my ($eml, $sto, $set_kw) = @_;
@@ -35,7 +37,9 @@ sub call { # the main "lei import" method
 	$lei->{opt}->{kw} //= 1;
 	my $fmt = $lei->{opt}->{'format'};
 	my $self = $lei->{imp} = bless {}, $cls;
-	return $lei->fail('--format unspecified') if !$fmt;
+	if (my @f = grep { -f } @argv && !$fmt) {
+		return $lei->fail("--format unset for regular files:\n@f");
+	}
 	$self->{0} = $lei->{0} if $lei->{opt}->{stdin};
 	my $ops = {
 		'!' => [ $lei->can('fail_handler'), $lei ],
@@ -75,14 +79,14 @@ sub _import_fh {
 		if ($fmt eq 'eml') {
 			my $buf = do { local $/; <$fh> } //
 				return $lei->child_error(1 >> 8, <<"");
-		error reading $x: $!
+error reading $x: $!
 
 			my $eml = PublicInbox::Eml->new(\$buf);
 			_import_eml($eml, $lei->{sto}, $set_kw);
 		} else { # some mbox
 			my $cb = PublicInbox::MboxReader->can($fmt);
 			$cb or return $lei->child_error(1 >> 8, <<"");
-	--format $fmt unsupported for $x
+--format $fmt unsupported for $x
 
 			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
 		}
@@ -90,6 +94,11 @@ sub _import_fh {
 	$lei->child_error(1 >> 8, "<stdin>: $@") if $@;
 }
 
+sub _import_maildir { # maildir_each_file cb
+	my ($f, $sto, $set_kw) = @_;
+	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
+}
+
 sub import_path_url {
 	my ($self, $x) = @_;
 	my $lei = $self->{lei};
@@ -99,6 +108,11 @@ sub import_path_url {
 unable to open $x: $!
 
 		_import_fh($lei, $fh, $x);
+	} elsif (-d _ && (-d "$x/cur" || -d "$x/new")) {
+		require PublicInbox::LeiToMail;
+		PublicInbox::LeiToMail::maildir_each_file($x,
+					\&_import_maildir,
+					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
 		$lei->fail("$x unsupported (TODO)");
 	}
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 3a215973..546d500b 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -12,7 +12,7 @@ use v5.10.1;
 use parent qw(PublicInbox::Lock PublicInbox::IPC);
 use PublicInbox::ExtSearchIdx;
 use PublicInbox::Import;
-use PublicInbox::InboxWritable;
+use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::V2Writable;
 use PublicInbox::ContentHash qw(content_hash content_digest);
 use PublicInbox::MID qw(mids mids_in);
@@ -224,6 +224,12 @@ sub set_eml {
 	add_eml($self, $eml, @kw) // set_eml_keywords($self, $eml, @kw);
 }
 
+sub set_eml_from_maildir {
+	my ($self, $f, $set_kw) = @_;
+	my $eml = eml_from_path($f) or return;
+	set_eml($self, $eml, $set_kw ? maildir_keywords($f) : ());
+}
+
 sub done {
 	my ($self) = @_;
 	my $err = '';
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 857aeb63..a5a196db 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -266,8 +266,9 @@ sub _mbox_write_cb ($$) {
 	}
 }
 
-sub _maildir_each_file ($$;@) {
+sub maildir_each_file ($$;@) {
 	my ($dir, $cb, @arg) = @_;
+	$dir .= '/' unless substr($dir, -1) eq '/';
 	for my $d (qw(new/ cur/)) {
 		my $pfx = $dir.$d;
 		opendir my $dh, $pfx or next;
@@ -277,13 +278,13 @@ sub _maildir_each_file ($$;@) {
 	}
 }
 
-sub _augment_file { # _maildir_each_file cb
+sub _augment_file { # maildir_each_file cb
 	my ($f, $lei) = @_;
 	my $eml = PublicInbox::InboxWritable::eml_from_path($f) or return;
 	_augment($eml, $lei);
 }
 
-# _maildir_each_file callback, \&CORE::unlink doesn't work with it
+# maildir_each_file callback, \&CORE::unlink doesn't work with it
 sub _unlink { unlink($_[0]) }
 
 sub _rand () {
@@ -389,11 +390,11 @@ sub _do_augment_maildir {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
 			require PublicInbox::InboxWritable; # eml_from_path
-			_maildir_each_file($dst, \&_augment_file, $lei);
+			maildir_each_file($dst, \&_augment_file, $lei);
 			$dedupe->pause_dedupe;
 		}
 	} else { # clobber existing Maildir
-		_maildir_each_file($dst, \&_unlink);
+		maildir_each_file($dst, \&_unlink);
 	}
 }
 
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
new file mode 100644
index 00000000..5842e19e
--- /dev/null
+++ b/t/lei-import-maildir.t
@@ -0,0 +1,33 @@
+#!perl -w
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use Cwd qw(abs_path);
+test_lei(sub {
+	my $md = "$ENV{HOME}/md";
+	for ($md, "$md/new", "$md/cur", "$md/tmp") {
+		mkdir($_) or BAIL_OUT("mkdir $_: $!");
+	}
+	symlink(abs_path('t/data/0001.patch'), "$md/cur/x:2,S") or
+		BAIL_OUT "symlink $md $!";
+	ok($lei->(qw(import), $md), 'import Maildir');
+	ok($lei->(qw(q s:boolean)), 'lei q');
+	my $res = json_utf8->decode($lei_out);
+	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
+	is_deeply($res->[0]->{kw}, ['seen'], 'keyword set');
+	is($res->[1], undef, 'only got one result');
+
+	ok($lei->(qw(import), $md), 'import Maildir again');
+	ok($lei->(qw(q -d none s:boolean)), 'lei q w/o dedupe');
+	my $r2 = json_utf8->decode($lei_out);
+	is_deeply($r2, $res, 'idempotent import');
+
+	rename("$md/cur/x:2,S", "$md/cur/x:2,SR") or BAIL_OUT "rename: $!";
+	ok($lei->(qw(import), $md), 'import Maildir after +answered');
+	ok($lei->(qw(q -d none s:boolean)), 'lei q after +answered');
+	$res = json_utf8->decode($lei_out);
+	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
+	is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
+	is($res->[1], undef, 'only got one result');
+});
+done_testing;
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index f7535687..a25795ca 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -237,7 +237,7 @@ SKIP: { # FIFO support
 	$wcb->(\(my $x = $buf), $b4dc0ffee);
 
 	my @f;
-	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @f, shift });
+	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @f, shift });
 	open my $fh, $f[0] or BAIL_OUT $!;
 	is(do { local $/; <$fh> }, $buf, 'wrote to Maildir');
 
@@ -246,7 +246,7 @@ SKIP: { # FIFO support
 	$wcb->(\($x = $buf."\nx\n"), $deadcafe);
 
 	my @x = ();
-	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @x, shift });
+	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @x, shift });
 	is(scalar(@x), 1, 'wrote one new file');
 	ok(!-f $f[0], 'old file clobbered');
 	open $fh, $x[0] or BAIL_OUT $!;
@@ -257,7 +257,7 @@ SKIP: { # FIFO support
 	$wcb->(\($x = $buf."\ny\n"), $deadcafe);
 	$wcb->(\($x = $buf."\ny\n"), $b4dc0ffee); # skipped by dedupe
 	@f = ();
-	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @f, shift });
+	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @f, shift });
 	is(scalar grep(/\A\Q$x[0]\E\z/, @f), 1, 'old file still there');
 	my @new = grep(!/\A\Q$x[0]\E\z/, @f);
 	is(scalar @new, 1, '1 new file written (b4dc0ffee skipped)');

^ permalink raw reply related	[relevance 43%]

* [PATCH 15/19] lei q: improve remote mboxrd UX
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-07  8:51 41% ` [PATCH 14/19] lei: replace --thread with --threads Eric Wong
@ 2021-02-07  8:51 33% ` Eric Wong
  2021-02-07  8:51 65% ` [PATCH 16/19] lei q: SIGWINCH process group with the terminal Eric Wong
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  8:51 UTC (permalink / raw)
  To: meta

For early MUA spawners using lock-free outputs, we need to
write to the startq pipe to silence progress reporting.  For
--augment users, we can start the MUA even earlier by
creating Maildirs in the pre-augment phase.
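
The pipe handshake can be reduced to something like the sketch below.
It is not the code in this patch: wait_start is a made-up name, and
it returns a string where the real wait_startq flips options on $lei.
The idea is the same, though: EOF means nobody asked for quiet, a
single "q" byte means the MUA is already running.

```perl
use strict;
use warnings;
use v5.10.1;
use Errno qw(EINTR);

# Hypothetical reduction of wait_startq: returns 'quiet' if the writer
# sent "q", 'normal' on EOF, and dies on anything else.
sub wait_start {
	my ($rd) = @_;
	while (1) {
		my $n = sysread($rd, my $byte, 1);
		if (defined $n) {
			return 'normal' if $n == 0; # EOF: closed w/o a message
			return 'quiet' if $byte eq 'q';
			die "unexpected byte `$byte'\n";
		}
		die "sysread: $!\n" unless $! == EINTR; # retry only on EINTR
	}
}

pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, 'q') // die "syswrite: $!";
close $wr;
print wait_start($rd), "\n"; # quiet
print wait_start($rd), "\n"; # normal (EOF after the single byte)
```

In the patch, the writer sends one "q" per worker so that every
reader gets its own byte.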

To improve progress reporting for non-MUA (or late-MUA)
spawners, we'll no longer blindly append "--compressed" to the
curl(1) command when POST-ing for the gzipped mboxrd.
Furthermore, we'll overload stringify ('""') in LeiCurl to
ensure the empty -d '' string shows up properly.
---
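A standalone sketch of the stringify overload, in case the effect is
not obvious from the diff.  My::Cmd is a hypothetical class standing
in for LeiCurl, and the URL is a placeholder.  The object stays a
plain array ref for spawning, but interpolating it into a string
shows empty arguments as ''.

```perl
use strict;
use warnings;

package My::Cmd; # hypothetical, stands in for LeiCurl
use overload '""' => sub {
	# only empty strings need quoting for display purposes here
	join(' ', map { $_ eq '' ? "''" : $_ } @{$_[0]});
};

sub new { my ($cls, @argv) = @_; bless [ @argv ], $cls }

package main;
my $cmd = My::Cmd->new(qw(curl -d), '', 'https://example.com/');
print "# $cmd\n"; # curl -d '' https://example.com/
```

This is for display only; it is not meant to produce a string that
is safe to feed back into a shell.
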
 lib/PublicInbox/IPC.pm         |  8 ++--
 lib/PublicInbox/LEI.pm         |  4 +-
 lib/PublicInbox/LeiCurl.pm     | 11 +++--
 lib/PublicInbox/LeiMirror.pm   |  5 +-
 lib/PublicInbox/LeiOverview.pm |  3 +-
 lib/PublicInbox/LeiToMail.pm   | 24 +++++-----
 lib/PublicInbox/LeiXSearch.pm  | 87 ++++++++++++++++++++++------------
 7 files changed, 88 insertions(+), 54 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index c8673e26..9331233a 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -109,7 +109,6 @@ sub ipc_worker_spawn {
 		$w_res->autoflush(1);
 		$SIG{$_} = 'IGNORE' for (qw(TERM INT QUIT));
 		local $0 = $ident;
-		PublicInbox::DS::sig_setmask($sigset);
 		# ensure we properly exit even if warn() dies:
 		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		eval {
@@ -117,6 +116,7 @@ sub ipc_worker_spawn {
 			local @$self{keys %$fields} = values(%$fields);
 			my $on_destroy = $self->ipc_atfork_child;
 			local %SIG = %SIG;
+			PublicInbox::DS::sig_setmask($sigset);
 			ipc_worker_loop($self, $r_req, $w_res);
 		};
 		warn "worker $ident PID:$$ died: $@\n" if $@;
@@ -293,7 +293,6 @@ sub _wq_worker_start ($$$) {
 		$SIG{$_} = 'IGNORE' for (qw(PIPE));
 		$SIG{$_} = 'DEFAULT' for (qw(TTOU TTIN TERM QUIT INT CHLD));
 		local $0 = $self->{-wq_ident};
-		PublicInbox::DS::sig_setmask($oldset);
 		# ensure we properly exit even if warn() dies:
 		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		eval {
@@ -301,6 +300,7 @@ sub _wq_worker_start ($$$) {
 			local @$self{keys %$fields} = values(%$fields);
 			my $on_destroy = $self->ipc_atfork_child;
 			local %SIG = %SIG;
+			PublicInbox::DS::sig_setmask($oldset);
 			wq_worker_loop($self);
 		};
 		warn "worker $self->{-wq_ident} PID:$$ died: $@" if $@;
@@ -395,9 +395,9 @@ sub wq_close {
 }
 
 sub wq_kill_old {
-	my ($self) = @_;
+	my ($self, $sig) = @_;
 	my $pids = $self->{"-wq_old_pids.$$"} or return;
-	kill 'TERM', @$pids;
+	kill($sig // 'TERM', @$pids);
 }
 
 sub wq_kill {
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 31e6b4a8..e52154e5 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -741,7 +741,9 @@ sub start_mua {
 	} elsif ($self->{oneshot}) {
 		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
 	}
-	delete $self->{-progress};
+	if ($self->{lxs} && $self->{au_done}) { # kick wait_startq
+		syswrite($self->{au_done}, 'q' x ($self->{lxs}->{jobs} // 0));
+	}
 }
 
 # caller needs to "-t $self->{1}" to check if tty
diff --git a/lib/PublicInbox/LeiCurl.pm b/lib/PublicInbox/LeiCurl.pm
index 38b17c78..f346a1b4 100644
--- a/lib/PublicInbox/LeiCurl.pm
+++ b/lib/PublicInbox/LeiCurl.pm
@@ -8,6 +8,12 @@ use v5.10.1;
 use PublicInbox::Spawn qw(which);
 use PublicInbox::Config;
 
+# Ensures empty strings are quoted, we don't need more
+# sophisticated quoting than for empty strings: curl -d ''
+use overload '""' => sub {
+	join(' ', map { $_ eq '' ?  "''" : $_ } @{$_[0]});
+};
+
 my %lei2curl = (
 	'curl-config=s@' => 'config|K=s@',
 );
@@ -63,10 +69,9 @@ EOM
 
 # completes the result of cmd() for $uri
 sub for_uri {
-	my ($self, $lei, $uri) = @_;
+	my ($self, $lei, $uri, @opt) = @_;
 	my $pfx = torsocks($self, $lei, $uri) or return; # error
-	[ @$pfx, @$self, substr($uri->path, -3) eq '.gz' ? () : '--compressed',
-		$uri->as_string ]
+	bless [ @$pfx, @$self, @opt, $uri->as_string ], ref($self);
 }
 
 1;
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 5ba69287..c5153148 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -31,7 +31,7 @@ sub try_scrape {
 	my $uri = URI->new($self->{src});
 	my $lei = $self->{lei};
 	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
-	my $cmd = $curl->for_uri($lei, $uri);
+	my $cmd = $curl->for_uri($lei, $uri, '--compressed');
 	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
 	my $fh = popen_rd($cmd, $lei->{env}, $opt);
 	my $html = do { local $/; <$fh> } // die "read(curl $uri): $!";
@@ -93,8 +93,7 @@ sub _try_config {
 	my $path = $uri->path;
 	chop($path) eq '/' or die "BUG: $uri not canonicalized";
 	$uri->path($path . '/_/text/config/raw');
-	my $cmd = $self->{curl}->for_uri($lei, $uri);
-	push @$cmd, '--compressed'; # curl decompresses for us
+	my $cmd = $self->{curl}->for_uri($lei, $uri, '--compressed');
 	my $ce = "$dst/inbox.config.example";
 	my $f = "$ce-$$.tmp";
 	open(my $fh, '+>', $f) or return $lei->err("open $f: $! (non-fatal)");
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index dcfb9cc7..f0ac4684 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -95,9 +95,10 @@ sub new {
 		$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
 	} else {
 		# default to the cheapest sort since MUA usually resorts
-		$lei->{opt}->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
+		$opt->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
 		$lei->{l2m} = eval { PublicInbox::LeiToMail->new($lei) };
 		return $lei->fail($@) if $@;
+		$lei->{early_mua} = 1 if $opt->{mua} && $lei->{l2m}->lock_free;
 	}
 	$self;
 }
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 3f65e9e9..857aeb63 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -370,7 +370,17 @@ sub new {
 	$self;
 }
 
-sub _pre_augment_maildir {} # noop
+sub _pre_augment_maildir {
+	my ($self, $lei) = @_;
+	my $dst = $lei->{ovv}->{dst};
+	for my $x (qw(tmp new cur)) {
+		my $d = $dst.$x;
+		next if -d $d;
+		require File::Path;
+		File::Path::mkpath($d);
+		-d $d or die "$d is not a directory";
+	}
+}
 
 sub _do_augment_maildir {
 	my ($self, $lei) = @_;
@@ -387,17 +397,7 @@ sub _do_augment_maildir {
 	}
 }
 
-sub _post_augment_maildir {
-	my ($self, $lei) = @_;
-	my $dst = $lei->{ovv}->{dst};
-	for my $x (qw(tmp new cur)) {
-		my $d = $dst.$x;
-		next if -d $d;
-		require File::Path;
-		File::Path::mkpath($d);
-		-d $d or die "$d is not a directory";
-	}
-}
+sub _post_augment_maildir {} # noop
 
 sub _pre_augment_mbox {
 	my ($self, $lei) = @_;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 2794140a..0e99e4b4 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -101,9 +101,23 @@ sub _mset_more ($$) {
 # $startq will EOF when query_prepare is done augmenting and allow
 # query_mset and query_thread_mset to proceed.
 sub wait_startq ($) {
-	my ($startq) = @_;
-	$_[0] = undef;
-	read($startq, my $query_prepare_done, 1);
+	my ($lei) = @_;
+	my $startq = delete $lei->{startq} or return;
+	while (1) {
+		my $n = sysread($startq, my $query_prepare_done, 1);
+		if (defined $n) {
+			return if $n == 0; # no MUA
+			if ($query_prepare_done eq 'q') {
+				$lei->{opt}->{quiet} = 1;
+				delete $lei->{opt}->{verbose};
+				delete $lei->{-progress};
+			} else {
+				$lei->fail("$$ WTF `$query_prepare_done'");
+			}
+			return;
+		}
+		return $lei->fail("$$ wait_startq: $!") unless $!{EINTR};
+	}
 }
 
 sub mset_progress {
@@ -140,7 +154,7 @@ sub query_thread_mset { # for --threads
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
-				wait_startq($startq) if $startq;
+				wait_startq($lei);
 				my $mitem = delete $n2item{$smsg->{num}};
 				$each_smsg->($smsg, $mitem);
 			}
@@ -155,7 +169,6 @@ sub query_mset { # non-parallel for non-"--threads" users
 	my ($self) = @_;
 	local $0 = "$0 query_mset";
 	my $lei = $self->{lei};
-	my $startq = delete $lei->{startq};
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	for my $loc (locals($self)) {
@@ -168,7 +181,7 @@ sub query_mset { # non-parallel for non-"--threads" users
 				$mset->size, $mset->get_matches_estimated);
 		for my $mitem ($mset->items) {
 			my $smsg = smsg_for($self, $mitem) or next;
-			wait_startq($startq) if $startq;
+			wait_startq($lei);
 			$each_smsg->($smsg, $mitem);
 		}
 	} while (_mset_more($mset, $mo));
@@ -183,7 +196,7 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$smsg->parse_references($eml, mids($eml));
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
 	delete @$smsg{qw(From Subject -ds -ts)};
-	if (my $startq = delete($lei->{startq})) { wait_startq($startq) }
+	wait_startq($lei);
 	if ($lei->{-progress}) {
 		++$lei->{-nr_remote_eml};
 		my $now = now();
@@ -200,6 +213,10 @@ sub each_eml { # callback for MboxReader->mboxrd
 sub query_remote_mboxrd {
 	my ($self, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
+open my $dbg, '>>', '/tmp/dbg'; $dbg->autoflush(1); use Data::Dumper;
+	local $SIG{__WARN__} = sub {
+		print $dbg "$$ @_";
+	};
 	local $SIG{TERM} = sub { exit(0) }; # for DESTROY (File::Temp, $reap)
 	my $lei = $self->{lei};
 	my ($opt, $env) = @$lei{qw(opt env)};
@@ -210,7 +227,6 @@ sub query_remote_mboxrd {
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
 	fcntl($cerr, F_SETFL, O_APPEND|O_RDWR) or warn "set O_APPEND: $!";
 	my $rdr = { 2 => $cerr, pgid => 0 };
-	my $coff = 0;
 	my $sigint_reap = $lei->can('sigint_reap');
 	if ($verbose) {
 		# spawn a process to force line-buffering, otherwise curl
@@ -228,13 +244,14 @@ sub query_remote_mboxrd {
 		$lei->{-nr_remote_eml} = 0;
 		$uri->query_form(@qform);
 		my $cmd = $curl->for_uri($lei, $uri);
-		$lei->err("# @$cmd") if $verbose;
+		$lei->qerr("# $cmd");
 		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
 		$reap_curl = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 		$fh = IO::Uncompress::Gunzip->new($fh);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
 						$lei, $each_smsg);
-		my $err = waitpid($pid, 0) == $pid ? undef : "BUG: waitpid: $!";
+		my $err = waitpid($pid, 0) == $pid ? undef
+						: "BUG: waitpid($cmd): $!";
 		@$reap_curl = (); # cancel OnDestroy
 		die $err if $err;
 		if ($? == 0) {
@@ -242,16 +259,18 @@ sub query_remote_mboxrd {
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
 			next;
 		}
-		seek($cerr, $coff, SEEK_SET) or warn "seek(curl stderr): $!\n";
-		my $e = do { local $/; <$cerr> } //
-				die "read(curl stderr): $!\n";
-		$coff += length($e);
-		truncate($cerr, 0);
-		next if (($? >> 8) == 22 && $e =~ /\b404\b/);
-		$lei->child_error($?);
+		$err = '';
+		if (-s $cerr) {
+			seek($cerr, 0, SEEK_SET) or
+					$lei->err("seek($cmd stderr): $!");
+			$err = do { local $/; <$cerr> } //
+					"read($cmd stderr): $!";
+			truncate($cerr, 0) or
+					$lei->err("truncate($cmd stderr): $!");
+		}
+		next if (($? >> 8) == 22 && $err =~ /\b404\b/);
 		$uri->query_form(q => $lei->{mset_opt}->{qstr});
-		# --verbose already showed the error via tail(1)
-		$lei->err("E: $uri \$?=$?\n", $verbose ? () : $e);
+		$lei->child_error($?, "E: <$uri> $err");
 	}
 	undef $each_smsg;
 	$lei->{ovv}->ovv_atexit_child($lei);
@@ -311,15 +330,23 @@ Error closing $lei->{ovv}->{dst}: $!
 
 sub do_post_augment {
 	my ($lei) = @_;
-	eval { $lei->{l2m}->post_augment($lei) };
-	if (my $err = $@) {
-		if (my $lxs = delete $lei->{lxs}) {
-			$lxs->wq_kill;
-			$lxs->wq_close(0, undef, $lei);
+	my $l2m = $lei->{l2m};
+	my $err;
+	if ($l2m) {
+		eval { $l2m->post_augment($lei) };
+		$err = $@;
+		if ($err) {
+			if (my $lxs = delete $lei->{lxs}) {
+				$lxs->wq_kill;
+				$lxs->wq_close(0, undef, $lei);
+			}
+			$lei->fail("$err");
 		}
-		$lei->fail("$err");
 	}
-	close(delete $lei->{au_done}); # triggers wait_startq
+	if (!$err && delete $lei->{early_mua}) { # non-augment case
+		$lei->start_mua;
+	}
+	close(delete $lei->{au_done}); # triggers wait_startq in lei_xsearch
 }
 
 my $MAX_PER_HOST = 4;
@@ -334,9 +361,6 @@ sub concurrency {
 
 sub start_query { # always runs in main (lei-daemon) process
 	my ($self, $lei) = @_;
-	if (my $l2m = $lei->{l2m}) {
-		$lei->start_mua if $l2m->lock_free;
-	}
 	if ($lei->{opt}->{threads}) {
 		for my $ibxish (locals($self)) {
 			$self->wq_io_do('query_thread_mset', [], $ibxish);
@@ -387,6 +411,9 @@ sub do_query {
 	my $l2m = $lei->{l2m};
 	if ($l2m) {
 		$l2m->pre_augment($lei);
+		if ($lei->{opt}->{augment} && delete $lei->{early_mua}) {
+			$lei->start_mua;
+		}
 		$l2m->wq_workers_start('lei2mail', $l2m->{jobs},
 					$lei->oldset, { lei => $lei });
 		pipe($lei->{startq}, $lei->{au_done}) or die "pipe: $!";
@@ -404,7 +431,7 @@ sub do_query {
 	delete $lei->{pkt_op_p};
 	$l2m->wq_close(1) if $l2m;
 	$lei->event_step_init; # wait for shutdowns
-	$self->wq_io_do('query_prepare', []) if $l2m;
+	$self->wq_io_do('query_prepare', []) if $l2m; # for augment/dedupe
 	start_query($self, $lei);
 	$self->wq_close(1); # lei_xsearch workers stop when done
 	if ($lei->{oneshot}) {

^ permalink raw reply related	[relevance 33%]

* [PATCH 21/19] lei q: fix arbitrary --mua command handling
  2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
                   ` (6 preceding siblings ...)
  2021-02-07  8:51 43% ` [PATCH 17/19] lei import: support Maildirs Eric Wong
@ 2021-02-07 10:40 71% ` Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07 10:40 UTC (permalink / raw)
  To: meta

Perl doesn't seem to warn for shadowed variables, here :x
---
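A small reproduction of the problem, with made-up function names.
Perl only warns about a "my" masking an earlier declaration in the
same scope; a "my" in a nested block silently creates a new
variable, so the outer one stays empty.

```perl
use strict;
use warnings;
use Text::ParseWords ();

# Buggy shape: the inner "my @cmd" is a new variable that vanishes
# at the end of the block, leaving the outer @cmd empty.
sub parse_shadowed {
	my ($mua) = @_;
	my @cmd;
	{
		my @cmd = Text::ParseWords::shellwords($mua);
	}
	@cmd;
}

# Fixed shape: assign to the outer @cmd instead of redeclaring it.
sub parse_fixed {
	my ($mua) = @_;
	my @cmd;
	{
		@cmd = Text::ParseWords::shellwords($mua);
	}
	@cmd;
}

printf "shadowed=%d fixed=%d\n",
	scalar(parse_shadowed('mutt -f %f')),
	scalar(parse_fixed('mutt -f %f')); # shadowed=0 fixed=3
```

Neither function triggers a warning under "use warnings", which
matches the observation above.
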
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 00affe82..e95a674b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -731,7 +731,7 @@ sub start_mua {
 	# TODO: help wanted: other common FOSS MUAs
 	} else {
 		require Text::ParseWords;
-		my @cmd = Text::ParseWords::shellwords($mua);
+		@cmd = Text::ParseWords::shellwords($mua);
 		# mutt uses '%f' for open-hook with compressed mbox, we follow
 		@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
 	}

^ permalink raw reply related	[relevance 71%]

* lei q --output vs --mfolder [was: [PATCH 1/2] doc: start manpages for lei commands]
  @ 2021-02-07 19:58 90%   ` Eric Wong
  2021-02-07 20:33 90%     ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-07 19:58 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> +++ b/Documentation/lei-q.pod

> +=item -o PATH, --output=PATH, --mfolder=PATH
> +
> +Destination for results (e.g., C<path/to/Maildir> or - for stdout).

Fwiw, I didn't really like the term "mfolder" but that's what
mairix users are used to...  Perhaps:

	=item -o MFOLDER, --output=MFOLDER

Or is "mfolder" not that bad and we ditch --output instead?

(mairix uses '-o', so we match it there)

I never really got used to using "folder" to describe mailboxes
or directories.  I associate "folder" with something that holds
only a few pages of paper in the physical world; not thousands
or millions of items.

^ permalink raw reply	[relevance 90%]

* Re: lei q --output vs --mfolder [was: [PATCH 1/2] doc: start manpages for lei commands]
  2021-02-07 19:58 90%   ` lei q --output vs --mfolder [was: [PATCH 1/2] doc: start manpages for lei commands] Eric Wong
@ 2021-02-07 20:33 90%     ` Kyle Meyer
  2021-02-07 20:59 90%       ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-07 20:33 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Kyle Meyer <kyle@kyleam.com> wrote:
>> +++ b/Documentation/lei-q.pod
>
>> +=item -o PATH, --output=PATH, --mfolder=PATH
>> +
>> +Destination for results (e.g., C<path/to/Maildir> or - for stdout).
>
> Fwiw, I didn't really like the term "mfolder" but that's what
> mairix users are used to...  Perhaps:
>
> 	=item -o MFOLDER, --output=MFOLDER
>
> Or is "mfolder" not that bad and we ditch --output instead?

Yeah, I don't like "mfolder" either.  In the --thread{,s} case,
consistency seemed worth it to me because either one sounded okay, and a
one character difference is easy to not notice.  But in this case, I
think --output is the better name, so if we're dropping one, I'd vote to
cast out --mfolder.

Using MFOLDER rather than PATH as the metavariable seems a bit confusing
to me because the target isn't a directory for --format values other
than "maildir".

^ permalink raw reply	[relevance 90%]

* Re: lei q --output vs --mfolder [was: [PATCH 1/2] doc: start manpages for lei commands]
  2021-02-07 20:33 90%     ` Kyle Meyer
@ 2021-02-07 20:59 90%       ` Eric Wong
  2021-02-07 21:47 90%         ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-07 20:59 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> > Kyle Meyer <kyle@kyleam.com> wrote:
> >> +++ b/Documentation/lei-q.pod
> >
> >> +=item -o PATH, --output=PATH, --mfolder=PATH
> >> +
> >> +Destination for results (e.g., C<path/to/Maildir> or - for stdout).
> >
> > Fwiw, I didn't really like the term "mfolder" but that's what
> > mairix users are used to...  Perhaps:
> >
> > 	=item -o MFOLDER, --output=MFOLDER
> >
> > Or is "mfolder" not that bad and we ditch --output instead?
> 
> Yeah, I don't like "mfolder" either.  In the --thread{,s} case,
> consistency seemed worth it to me because either one sounded okay, and a
> one character difference is easy to not notice.  But in this case, I
> think --output is the better name, so if we're dropping one, I'd vote to
> cast out --mfolder.

Alright, I'll think about dropping ...  Right now, it's still
supported, but masked out of --help output but with a "MFOLDER"
placeholder.

Perhaps a note in the man page noting its mairix analogue is
sufficient?

> Using MFOLDER rather than PATH as the metavariable seems a bit confusing
> to me because the target isn't a directory for --format values other
> than "maildir".

Fwiw, mairix uses "mfolder" for mbox and IMAP destinations, too;
(I've never used IMAP with mairix, but have every intention of
supporting IMAP in lei).

"LOCATION" may also be a suitable placeholder *shrug*

^ permalink raw reply	[relevance 90%]

* Re: lei q --output vs --mfolder [was: [PATCH 1/2] doc: start manpages for lei commands]
  2021-02-07 20:59 90%       ` Eric Wong
@ 2021-02-07 21:47 90%         ` Kyle Meyer
  2021-02-07 21:55 90%           ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-07 21:47 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Kyle Meyer <kyle@kyleam.com> wrote:
[...]
>> But in this case, I think --output is the better name, so if we're
>> dropping one, I'd vote to cast out --mfolder.
>
> Alright, I'll think about dropping ...  Right now, it's still
> supported but masked out of --help output, though --help uses
> "MFOLDER" as the placeholder.
>
> Perhaps a note in the man page noting its mairix analogue is
> sufficient?

As someone that has only looked into mairix a few times but not used it,
I'd find the pointer helpful, I think.

>> Using MFOLDER rather than PATH as the metavariable seems a bit confusing
>> to me because the target isn't a directory for --format values other
>> than "maildir".
>
> Fwiw, mairix uses "mfolder" for mbox and IMAP destinations, too;
> (I've never used IMAP with mairix, but have every intention of
> supporting IMAP in lei).
>
> "LOCATION" may also be a suitable placeholder *shrug*

Oh, IMAP destinations.  So PATH isn't a great choice then.  In that
case, switching to MFOLDER or LOCATION as the placeholder sounds fine to
me.  I guess MFOLDER would be good at this point for consistency with `q
--help'.  I've made a note to do that when updating the lei manpages
(hope to get to that tomorrow or the next day) assuming you haven't
handled it before then.

^ permalink raw reply	[relevance 90%]

* Re: lei q --output vs --mfolder [was: [PATCH 1/2] doc: start manpages for lei commands]
  2021-02-07 21:47 90%         ` Kyle Meyer
@ 2021-02-07 21:55 90%           ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07 21:55 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> > Fwiw, mairix uses "mfolder" for mbox and IMAP destinations, too;
> > (I've never used IMAP with mairix, but have every intention of
> > supporting IMAP in lei).
> >
> > "LOCATION" may also be a suitable placeholder *shrug*
> 
> Oh, IMAP destinations.  So PATH isn't a great choice then.  In that
> case, switching to MFOLDER or LOCATION as the placeholder sounds fine to
> me.  I guess MFOLDER would be good at this point for consistency with `q
> --help'.  I've made a note to do that when updating the lei manpages
> (hope to get to that tomorrow or the next day) assuming you haven't
> handled it before then.

Right, I started using LOCATION for external inboxes instead of
"URL_OR_PATH" in --help; so MFOLDER may help disambiguate it for
non-external destinations.

Thanks again for tackling the doc stuff; I'll leave the .pod
untouched for now to avoid conflicts.  I have an aversion to all
mark{up,down} languages, thus it's most natural for me to
document stuff in commit messages, --help text and emails :>

^ permalink raw reply	[relevance 90%]

* lei q --remote-if-local-missing ?
@ 2021-02-08  8:49 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  8:49 UTC (permalink / raw)
  To: meta

Right now, local and remote externals are searched in parallel
if they're both enabled.  Local requests are thrown into the
work queue first, but the difference isn't apparent when there are
enough worker processes to start all requests right away.

This still means local results show up first due to a lack of
latency compared to remote externals.

The same effect can be had by running the following commands
in sequence:

	lei q -o MFOLDER foo bar
	lei q -o MFOLDER --remote --no-local --augment foo bar

So I'm wondering about replacing the above two commands with one:

	lei q -o MFOLDER --remote-if-local-missing foo bar

It could be a bit of a pain to implement + test + support,
though.

^ permalink raw reply	[relevance 71%]

* [PATCH 00/13] lei approxidate, startup fix, --alert
@ 2021-02-08  9:05 63% Eric Wong
  2021-02-08  9:05 32% ` [PATCHv2 01/13] lei q: improve remote mboxrd UX + MUA Eric Wong
                   ` (6 more replies)
  0 siblings, 7 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

I've redone and squashed some changes into PATCH 1/13 which
was posted yesterday.

3/13 (SIGWINCH) is a rebase made necessary by 1/13,
4/13 (--alert=CMD) is a generalized take on 3/13.

12/13 is...

Eric Wong (13):
  lei q: improve remote mboxrd UX + MUA
  lei_xsearch: quiet Eml warnings from remote mboxrds
  lei q: SIGWINCH process group with the terminal
  lei q: support --alert=CMD for early MUA users
  tests: favor IPv6
  ds: improve add_timer usability
  lei: start_pager: drop COLUMNS default
  lei: avoid racing on unlink + bind + listen
  lei: drop BSD::Resource usage
  git: implement date_parse method
  lei q: use git approxidate with d:, dt: and rt: ranges
  search: use one git-rev-parse process for all dates
  spawnpp: raise exception on E2BIG errors

 lib/PublicInbox/DS.pm           |  10 ++--
 lib/PublicInbox/ExtSearchIdx.pm |   5 +-
 lib/PublicInbox/FakeInotify.pm  |   4 +-
 lib/PublicInbox/Git.pm          |  10 +++-
 lib/PublicInbox/IPC.pm          |   8 +--
 lib/PublicInbox/LEI.pm          | 100 ++++++++++++++++++++++----------
 lib/PublicInbox/LeiCurl.pm      |  11 +++-
 lib/PublicInbox/LeiMirror.pm    |   5 +-
 lib/PublicInbox/LeiOverview.pm  |   6 +-
 lib/PublicInbox/LeiQuery.pm     |  12 ++--
 lib/PublicInbox/LeiToMail.pm    |  24 ++++----
 lib/PublicInbox/LeiXSearch.pm   |  97 ++++++++++++++++++++-----------
 lib/PublicInbox/Search.pm       |  86 +++++++++++++++++++++++++++
 lib/PublicInbox/SpawnPP.pm      |  23 ++++++--
 lib/PublicInbox/TestCommon.pm   |  30 ++++++++--
 lib/PublicInbox/Watch.pm        |  19 +++---
 script/lei                      |  16 ++---
 t/extsearch.t                   |   2 +-
 t/git.t                         |  17 +++++-
 t/httpd-corner.psgi             |   2 +-
 t/httpd-corner.t                |  12 ++--
 t/httpd-https.t                 |   2 +-
 t/httpd-unix.t                  |   7 +--
 t/httpd.t                       |   8 +--
 t/imapd-tls.t                   |   4 +-
 t/imapd.t                       |   8 +--
 t/lei-mirror.t                  |   2 +-
 t/nntpd-tls.t                   |   4 +-
 t/nntpd.t                       |  11 ++--
 t/psgi_attach.t                 |   2 +-
 t/psgi_v2.t                     |   2 +-
 t/search.t                      |  51 ++++++++++++++++
 t/solver_git.t                  |   2 +-
 t/v2mirror.t                    |   3 +-
 t/v2writable.t                  |   3 +-
 t/www_altid.t                   |   2 +-
 t/www_listing.t                 |   3 +-
 xt/git-http-backend.t           |   4 +-
 xt/httpd-async-stream.t         |   2 +-
 xt/imapd-mbsync-oimap.t         |   4 +-
 xt/imapd-validate.t             |   4 +-
 xt/mem-imapd-tls.t              |   2 +-
 xt/nntpd-validate.t             |   3 +-
 xt/perf-nntpd.t                 |  16 ++---
 xt/solver.t                     |   3 +-
 45 files changed, 441 insertions(+), 210 deletions(-)


^ permalink raw reply	[relevance 63%]

* [PATCH 07/13] lei: start_pager: drop COLUMNS default
  2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-08  9:05 50% ` [PATCH 04/13] lei q: support --alert=CMD for early MUA users Eric Wong
@ 2021-02-08  9:05 71% ` Eric Wong
  2021-02-08  9:05 56% ` [PATCH 08/13] lei: avoid racing on unlink + bind + listen Eric Wong
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

It shouldn't be needed since none of our subcommands will care
or attempt to format output.  Once "lei show" is implemented,
we'll run "git show" directly on the result.
---
 lib/PublicInbox/LEI.pm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 7b2a3e6f..2f370f52 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -792,8 +792,7 @@ sub start_pager {
 	chomp(my $pager = <$fh> // '');
 	close($fh) or warn "`git var PAGER' error: \$?=$?";
 	return if $pager eq 'cat' || $pager eq '';
-	# TODO TIOCGWINSZ
-	my $new_env = { LESS => 'FRX', LV => '-c', COLUMNS => 80 };
+	my $new_env = { LESS => 'FRX', LV => '-c' };
 	$new_env->{MORE} = 'FRX' if $^O eq 'freebsd';
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };

^ permalink raw reply related	[relevance 71%]

* [PATCH 09/13] lei: drop BSD::Resource usage
  2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-08  9:05 56% ` [PATCH 08/13] lei: avoid racing on unlink + bind + listen Eric Wong
@ 2021-02-08  9:05 68% ` Eric Wong
  2021-02-08  9:05 42% ` [PATCH 11/13] lei q: use git approxidate with d:, dt: and rt: ranges Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

It's no longer necessary with the changes to stop doing
FD passing in our backend.

cf. commits 5180ed0a1cd65139 and 7d440bf3667b8ef5
    ("lei q: eliminate $not_done temporary git dir hack")
    ("lei q: reorder internals to reduce FD passing")
---
 lib/PublicInbox/LEI.pm | 5 -----
 script/lei             | 8 --------
 2 files changed, 13 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index cddb94e9..e2a945a4 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -941,11 +941,6 @@ sub lazy_start {
 		$! = $errno; # allow interpolation to stringify in die
 		die "connect($path): $!";
 	}
-	if (eval { require BSD::Resource }) {
-		my $NOFILE = BSD::Resource::RLIMIT_NOFILE();
-		my ($s, $h) = BSD::Resource::getrlimit($NOFILE);
-		BSD::Resource::setrlimit($NOFILE, $h, $h) if $s < $h;
-	}
 	umask(077) // die("umask(077): $!");
 	bind($listener, $addr) or die "bind($path): $!";
 	listen($listener, 1024) or die "listen: $!";
diff --git a/script/lei b/script/lei
index 0b0e2976..cb605e2e 100755
--- a/script/lei
+++ b/script/lei
@@ -82,14 +82,6 @@ Falling back to (slow) one-shot mode
 	while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 	$buf .= "\0\0";
 	my $n = $send_cmd->($sock, [0, 1, 2, fileno($dh)], $buf, MSG_EOR);
-	if (!$n && $!{ETOOMANYREFS} && eval { require BSD::Resource }) {
-		my $NOFILE = BSD::Resource::RLIMIT_NOFILE();
-		my ($s, $h) = BSD::Resource::getrlimit($NOFILE);
-		if ($s < $h && BSD::Resource::setrlimit($NOFILE, $h, $h)) {
-			$n = $send_cmd->($sock, [0, 1, 2, fileno($dh)],
-					$buf, MSG_EOR);
-		}
-	}
 	if (!$n) {
 		die "sendmsg: $! (check RLIMIT_NOFILE)\n" if $!{ETOOMANYREFS};
 		die "sendmsg: $!\n";

^ permalink raw reply related	[relevance 68%]

* [PATCH 03/13] lei q: SIGWINCH process group with the terminal
  2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
  2021-02-08  9:05 32% ` [PATCHv2 01/13] lei q: improve remote mboxrd UX + MUA Eric Wong
@ 2021-02-08  9:05 64% ` Eric Wong
  2021-02-08  9:05 50% ` [PATCH 04/13] lei q: support --alert=CMD for early MUA users Eric Wong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

While using utime on the destination Maildir is enough for mutt
to eventually notice new mail, "eventually" isn't good enough.

Send a SIGWINCH to wake mutt (and likely other MUAs)
immediately.  This is more portable than relying on MUAs to
support inotify or EVFILT_VNODE.
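
To illustrate the mechanism only (a minimal sketch, not the lei
code): SIGWINCH is ignored by default, so it is harmless to
processes that don't handle it, while a process that installs a
handler gets woken up.  Below, the script stands in for the MUA;
the $redraws counter is a name made up for this example.

```perl
use strict;
use warnings;

my $redraws = 0; # hypothetical counter standing in for an MUA redraw
$SIG{WINCH} = sub { $redraws++ };

# Signal only ourselves here.  The patch passes '-WINCH' (note the
# leading '-') so that kill() targets the whole process group.
kill('WINCH', $$) or die "kill: $!";
print "handler ran $redraws time(s)\n";
```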
---
 resent after rebasing due to 1/13 squashes

 lib/PublicInbox/LEI.pm        | 11 +++++++++++
 lib/PublicInbox/LeiXSearch.pm |  7 ++++++-
 script/lei                    |  8 +++++---
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index c3645698..e95a674b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -746,6 +746,17 @@ sub start_mua {
 	}
 }
 
+sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
+	my ($self) = @_;
+	return unless $self->{opt}->{mua} && -t $self->{1};
+	# hit the process group that started the MUA
+	if (my $s = $self->{sock}) {
+		send($s, '-WINCH', MSG_EOR);
+	} elsif ($self->{oneshot}) {
+		kill('-WINCH', $$);
+	}
+}
+
 # caller needs to "-t $self->{1}" to check if tty
 sub start_pager {
 	my ($self) = @_;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 588df3a4..10485220 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -317,7 +317,12 @@ Error closing $lei->{ovv}->{dst}: $!
 			}
 			$lei->{1} = $out;
 		}
-		$l2m->lock_free ? $l2m->poke_dst : $lei->start_mua;
+		if ($l2m->lock_free) {
+			$l2m->poke_dst;
+			$lei->poke_mua;
+		} else { # mbox users
+			$lei->start_mua;
+		}
 	}
 	$lei->{-progress} and
 		$lei->err('# ', $lei->{-mset_total} // 0, " matches");
diff --git a/script/lei b/script/lei
index b7f21f14..0b0e2976 100755
--- a/script/lei
+++ b/script/lei
@@ -105,13 +105,15 @@ Falling back to (slow) one-shot mode
 			die "recvmsg: $!";
 		}
 		last if $buf eq '';
-		if ($buf =~ /\Ax_it ([0-9]+)\z/) {
+		if ($buf =~ /\Aexec (.+)\z/) {
+			$exec_cmd->(\@fds, split(/\0/, $1));
+		} elsif ($buf eq '-WINCH') {
+			kill($buf, $$); # for MUA
+		} elsif ($buf =~ /\Ax_it ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
 			last;
 		} elsif ($buf =~ /\Achild_error ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
-		} elsif ($buf =~ /\Aexec (.+)\z/) {
-			$exec_cmd->(\@fds, split(/\0/, $1));
 		} else {
 			$sigchld->();
 			die $buf;

^ permalink raw reply related	[relevance 64%]

* [PATCH 08/13] lei: avoid racing on unlink + bind + listen
  2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-08  9:05 71% ` [PATCH 07/13] lei: start_pager: drop COLUMNS default Eric Wong
@ 2021-02-08  9:05 56% ` Eric Wong
  2021-02-08  9:05 68% ` [PATCH 09/13] lei: drop BSD::Resource usage Eric Wong
  2021-02-08  9:05 42% ` [PATCH 11/13] lei q: use git approxidate with d:, dt: and rt: ranges Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

When multiple lei(1) processes are starting in parallel without
lei-daemon already running, it's possible for them to trample
each other's socket paths trying to start lei-daemon.  Lock
errors.log before unlink/bind/listen.  We'll add an extra
connect(2) attempt to check if the starter lost the race.

Without this change, a stress script like the following could
easily cause problems:

	lei q -o ~/tmp/a foo ... &
	lei q -o ~/tmp/b bar ... &
	lei q -o ~/tmp/c quux ... &
	lei q -o ~/tmp/d baz ... &
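
The locking order described above can be sketched roughly as
follows.  This is a simplified illustration under assumptions, not
the code in this patch: listen_or_yield is a hypothetical helper,
it uses plain flock(2) rather than PublicInbox::Lock, and it uses
SOCK_STREAM instead of SOCK_SEQPACKET for portability.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);
use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);

# Hypothetical helper: returns a listening socket if we won the race,
# or undef if another process is already listening on $path.
sub listen_or_yield {
	my ($path, $lock_path) = @_;
	open(my $lk, '>>', $lock_path) or die "open($lock_path): $!";
	flock($lk, LOCK_EX) or die "flock: $!";
	my $addr = pack_sockaddr_un($path);

	# extra connect(2) attempt: did somebody else start first?
	socket(my $probe, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
	return undef if connect($probe, $addr); # $lk unlocks on close

	unlink($path) if -S $path; # stale socket from a dead process
	socket(my $l, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
	bind($l, $addr) or die "bind($path): $!";
	listen($l, 1024) or die "listen: $!";
	flock($lk, LOCK_UN) or die "flock(LOCK_UN): $!";
	$l;
}

my $dir = tempdir(CLEANUP => 1);
my $first = listen_or_yield("$dir/sock", "$dir/lock");
my $second = listen_or_yield("$dir/sock", "$dir/lock");
print defined($second) ? "raced\n" : "second caller yielded\n";
```

Holding the lock across connect/unlink/bind/listen is what keeps a
late starter from unlinking a socket that an earlier starter just
bound.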
---
 lib/PublicInbox/LEI.pm | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2f370f52..cddb94e9 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -22,7 +22,7 @@ use PublicInbox::Syscall qw(SFD_NONBLOCK EPOLLIN EPOLLET);
 use PublicInbox::Sigfd;
 use PublicInbox::DS qw(now dwaitpid);
 use PublicInbox::Spawn qw(spawn popen_rd);
-use PublicInbox::OnDestroy;
+use PublicInbox::Lock;
 use Time::HiRes qw(stat); # ctime comparisons for config cache
 use File::Path qw(mkpath);
 use File::Spec;
@@ -828,17 +828,19 @@ sub accept_dispatch { # Listener {post_accept} callback
 	vec(my $rvec = '', fileno($sock), 1) = 1;
 	select($rvec, undef, undef, 60) or
 		return send($sock, 'timed out waiting to recv FDs', MSG_EOR);
-	my @fds = $recv_cmd->($sock, my $buf, 4096 * 33); # >MAX_ARG_STRLEN
+	# (4096 * 33) >MAX_ARG_STRLEN
+	my @fds = $recv_cmd->($sock, my $buf, 4096 * 33) or return; # EOF
 	if (scalar(@fds) == 4) {
 		for my $i (0..3) {
 			my $fd = shift(@fds);
 			open($self->{$i}, '+<&=', $fd) and next;
 			send($sock, "open(+<&=$fd) (FD=$i): $!", MSG_EOR);
 		}
-	} else {
-		my $msg = "recv_cmd failed: $!";
-		warn $msg;
+	} elsif (!defined($fds[0])) {
+		warn(my $msg = "recv_cmd failed: $!");
 		return send($sock, $msg, MSG_EOR);
+	} else {
+		return;
 	}
 	$self->{2}->autoflush(1); # keep stdout buffered until x_it|DESTROY
 	# $ENV_STR = join('', map { "\0$_=$ENV{$_}" } keys %ENV);
@@ -923,9 +925,19 @@ sub dump_and_clear_log {
 # lei(1) calls this when it can't connect
 sub lazy_start {
 	my ($path, $errno, $narg) = @_;
-	if ($errno == ECONNREFUSED) {
-		unlink($path) or die "unlink($path): $!";
-	} elsif ($errno != ENOENT) {
+	local ($errors_log, $listener);
+	($errors_log) = ($path =~ m!\A(.+?/)[^/]+\z!);
+	$errors_log .= 'errors.log';
+	my $addr = pack_sockaddr_un($path);
+	my $lk = bless { lock_path => $errors_log }, 'PublicInbox::Lock';
+	$lk->lock_acquire;
+	socket($listener, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
+	if ($errno == ECONNREFUSED || $errno == ENOENT) {
+		return if connect($listener, $addr); # another process won
+		if ($errno == ECONNREFUSED && -S $path) {
+			unlink($path) or die "unlink($path): $!";
+		}
+	} else {
 		$! = $errno; # allow interpolation to stringify in die
 		die "connect($path): $!";
 	}
@@ -935,10 +947,10 @@ sub lazy_start {
 		BSD::Resource::setrlimit($NOFILE, $h, $h) if $s < $h;
 	}
 	umask(077) // die("umask(077): $!");
-	local $listener;
-	socket($listener, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
-	bind($listener, pack_sockaddr_un($path)) or die "bind($path): $!";
+	bind($listener, $addr) or die "bind($path): $!";
 	listen($listener, 1024) or die "listen: $!";
+	$lk->lock_release;
+	undef $lk;
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	local $oldset = PublicInbox::DS::block_signals();
@@ -956,9 +968,6 @@ sub lazy_start {
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
 	(-p STDOUT) or die "E: stdout must be a pipe\n";
-	local $errors_log;
-	($errors_log) = ($path =~ m!\A(.+?/)[^/]+\z!);
-	$errors_log .= 'errors.log';
 	open(STDIN, '+>>', $errors_log) or die "open($errors_log): $!";
 	STDIN->autoflush(1);
 	dump_and_clear_log("from previous daemon process:\n");

^ permalink raw reply related	[relevance 56%]

* [PATCH 04/13] lei q: support --alert=CMD for early MUA users
  2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
  2021-02-08  9:05 32% ` [PATCHv2 01/13] lei q: improve remote mboxrd UX + MUA Eric Wong
  2021-02-08  9:05 64% ` [PATCH 03/13] lei q: SIGWINCH process group with the terminal Eric Wong
@ 2021-02-08  9:05 50% ` Eric Wong
  2021-02-08  9:05 71% ` [PATCH 07/13] lei: start_pager: drop COLUMNS default Eric Wong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

For --mua users writing to lock-free -o MFOLDER destinations,
we'll keep -WINCH and send an ASCII terminal bell when results
are complete.  This is intended to let early MUA spawners know
when lei2mail is done writing results.

We'll also support running arbitrary commands.  It may be used
to run play(1) (from SoX), handle pipelines+redirects
(e.g. "/bin/sh -c 'echo search done | wall'") or other commands.
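
A rough sketch of how --alert values could be broken down, assuming
the same comma-splitting and shell-word rules as the diff below.
parse_alerts is a hypothetical helper that only classifies ops and
runs nothing; unlike the patch, it uses unshift so the expanded ops
keep their command-line order.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Hypothetical helper: turn --alert values into a list of ops.
# Each op is an arrayref: ['-WINCH'], ['-bell'] or ['cmd', @argv].
sub parse_alerts {
	my (@alerts) = @_;
	my @ops;
	while (defined(my $op = shift @alerts)) {
		if ($op eq '-WINCH' || $op eq '-bell') {
			push @ops, [ $op ];
		} elsif ($op =~ /(?<!\\),/) { # ',' without a backslash
			unshift @alerts, split(/(?<!\\),/, $op);
		} elsif ($op =~ m!\A[/a-z0-9A-Z].+!) { # arbitrary command
			push @ops, [ 'cmd', shellwords($op) ];
		} else {
			warn "W: unsupported --alert=$op\n"; # non-fatal
		}
	}
	@ops;
}

my @ops = parse_alerts('-WINCH,-bell', "/bin/sh -c 'echo search done'");
print join(' | ', map { join(' ', @$_) } @ops), "\n";
```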
---
 lib/PublicInbox/LEI.pm         | 54 ++++++++++++++++++++++++----------
 lib/PublicInbox/LeiOverview.pm |  5 +++-
 2 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e95a674b..7b2a3e6f 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -112,7 +112,7 @@ our %CMD = ( # sorted in order of importance/use:
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
-	mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
+	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
@@ -227,6 +227,11 @@ my %OPTDESC = (
 'show	threads|t' => 'display entire thread a message belongs to',
 'q	threads|t' =>
 	'return all messages in the same threads as the actual match(es)',
+'alert=s@' => ['CMD,-WINCH,-bell,<any command>',
+	'run command(s) or perform ops when done writing to output ' .
+	'(default: "-WINCH,-bell" with --mua and Maildir/IMAP output, ' .
+	'nothing otherwise)' ],
+
 'augment|a' => 'augment --output destination instead of clobbering',
 
 'output|mfolder|o=s' => [ 'MFOLDER',
@@ -739,21 +744,43 @@ sub start_mua {
 	if (my $sock = $self->{sock}) { # lei(1) client process runs it
 		send($sock, exec_buf(\@cmd, {}), MSG_EOR);
 	} elsif ($self->{oneshot}) {
-		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
+		$self->{"pid.$self.$$"}->{spawn(\@cmd)} = \@cmd;
 	}
 	if ($self->{lxs} && $self->{au_done}) { # kick wait_startq
 		syswrite($self->{au_done}, 'q' x ($self->{lxs}->{jobs} // 0));
 	}
+	$self->{opt}->{quiet} = 1;
+	delete $self->{-progress};
+	delete $self->{opt}->{verbose};
 }
 
 sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
 	my ($self) = @_;
-	return unless $self->{opt}->{mua} && -t $self->{1};
-	# hit the process group that started the MUA
-	if (my $s = $self->{sock}) {
-		send($s, '-WINCH', MSG_EOR);
-	} elsif ($self->{oneshot}) {
-		kill('-WINCH', $$);
+	my $alerts = $self->{opt}->{alert} // return;
+	while (my $op = shift(@$alerts)) {
+		if ($op eq '-WINCH') {
+			# hit the process group that started the MUA
+			if ($self->{sock}) {
+				send($self->{sock}, '-WINCH', MSG_EOR);
+			} elsif ($self->{oneshot}) {
+				kill('-WINCH', $$);
+			}
+		} elsif ($op eq '-bell') {
+			out($self, "\a");
+		} elsif ($op =~ /(?<!\\),/) { # bare ',' (not ',,')
+			push @$alerts, split(/(?<!\\),/, $op);
+		} elsif ($op =~ m!\A([/a-z0-9A-Z].+)!) {
+			my $cmd = $1; # run an arbitrary command
+			require Text::ParseWords;
+			$cmd = [ Text::ParseWords::shellwords($cmd) ];
+			if (my $s = $self->{sock}) {
+				send($s, exec_buf($cmd, {}), MSG_EOR);
+			} elsif ($self->{oneshot}) {
+				$self->{"pid.$self.$$"}->{spawn($cmd)} = $cmd;
+			}
+		} else {
+			err($self, "W: unsupported --alert=$op"); # non-fatal
+		}
 	}
 }
 
@@ -776,8 +803,8 @@ sub start_pager {
 		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
 		$send_cmd->($sock, $fds, exec_buf([$pager], $new_env), MSG_EOR);
 	} elsif ($self->{oneshot}) {
-		$pgr->[0] = spawn([$pager], $new_env, $rdr);
-		$pgr->[3] = $$; # ew'll reap it
+		my $cmd = [$pager];
+		$self->{"pid.$self.$$"}->{spawn($cmd, $new_env, $rdr)} = $cmd;
 	} else {
 		die 'BUG: start_pager w/o socket';
 	}
@@ -793,8 +820,6 @@ sub stop_pager {
 	$self->{2} = $pgr->[2];
 	# do not restore original stdout, just close it so we error out
 	close(delete($self->{1})) if $self->{1};
-	my $pid = $pgr->[0];
-	dwaitpid($pid) if $pid && ($pgr->[3] // 0) == $$;
 }
 
 sub accept_dispatch { # Listener {post_accept} callback
@@ -1044,9 +1069,8 @@ sub DESTROY {
 	my ($self) = @_;
 	$self->{1}->autoflush(1) if $self->{1};
 	stop_pager($self);
-	if (my $mua_pid = delete $self->{"mua.pid.$self.$$"}) {
-		waitpid($mua_pid, 0);
-	}
+	my $oneshot_pids = delete $self->{"pid.$self.$$"} or return;
+	waitpid($_, 0) for keys %$oneshot_pids;
 }
 
 1;
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index f0ac4684..98c89d12 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -98,7 +98,10 @@ sub new {
 		$opt->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
 		$lei->{l2m} = eval { PublicInbox::LeiToMail->new($lei) };
 		return $lei->fail($@) if $@;
-		$lei->{early_mua} = 1 if $opt->{mua} && $lei->{l2m}->lock_free;
+		if ($opt->{mua} && $lei->{l2m}->lock_free) {
+			$lei->{early_mua} = 1;
+			$opt->{alert} //= [ '-WINCH,-bell' ] if -t $lei->{1};
+		}
 	}
 	$self;
 }

^ permalink raw reply related	[relevance 50%]

* [PATCH 11/13] lei q: use git approxidate with d:, dt: and rt: ranges
  2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-08  9:05 68% ` [PATCH 09/13] lei: drop BSD::Resource usage Eric Wong
@ 2021-02-08  9:05 42% ` Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

Instead of having --(sent|received)-(before|after)=s
command-line switches, we'll just try to make sense of argv so
it's usable within parenthesized statements and such.

Given the negligible performance penalty with Inline::C
process spawning, we'll probably wire this up to the
WWW interface, too.

"d:" is for mairix compatibility.  I don't know if "dt:" and
"rt:" will be too useful, but they exist because of IMAP
(and JMAP).
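
For the single-day "d:" case, the expansion amounts to roughly the
following.  This is a sketch only: expand_day is a hypothetical
helper that accepts just YYYYMMDD or YYYY-MM-DD, whereas the patch
hands the string to git approxidate via $git->date_parse.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Hypothetical helper: "20101002" or "2010-10-02" => "20101002..20101003"
# Returns nothing if the input is not a plain numeric date.
sub expand_day {
	my ($ymd) = @_;
	my ($y, $m, $d) =
		($ymd =~ /\A([0-9]{4})-?([0-9]{2})-?([0-9]{2})\z/) or return;
	my $beg = timegm(0, 0, 0, $d, $m - 1, $y); # midnight UTC
	join('..', map {
		strftime('%Y%m%d', gmtime($_))
	} ($beg, $beg + 86400));
}

print 'd:', expand_day('20101002'), "\n";
```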
---
 lib/PublicInbox/LeiQuery.pm | 12 +++----
 lib/PublicInbox/Search.pm   | 67 +++++++++++++++++++++++++++++++++++++
 t/search.t                  | 44 ++++++++++++++++++++++++
 3 files changed, 115 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 9a6fa718..d637b1ae 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -34,9 +34,10 @@ sub lei_q {
 	my @only = @{$opt->{only} // []};
 	# --local is enabled by default unless --only is used
 	# we'll allow "--only $LOCATION --local"
+	my $sto = $self->_lei_store(1);
+	my $lse = $sto->search;
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
-		my $sto = $self->_lei_store(1);
-		$lxs->prepare_external($sto->search);
+		$lxs->prepare_external($lse);
 	}
 	if (@only) {
 		for my $loc (@only) {
@@ -107,12 +108,7 @@ no query allowed on command-line with --stdin
 		PublicInbox::InputPipe::consume($self->{0}, \&qstr_add, $self);
 		return;
 	}
-	# Consider spaces in argv to be for phrase search in Xapian.
-	# In other words, the users should need only care about
-	# normal shell quotes and not have to learn Xapian quoting.
-	$mset_opt{qstr} = join(' ', map {;
-		/\s/ ? (s/\A(\w+:)// ? qq{$1"$_"} : qq{"$_"}) : $_
-	} @argv);
+	$mset_opt{qstr} = $lse->query_argv_to_string($lse->git, \@argv);
 	$lxs->do_query($self);
 }
 
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index dbae3bc5..f42d70e3 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -8,6 +8,7 @@ use strict;
 use parent qw(Exporter);
 our @EXPORT_OK = qw(retry_reopen int_val get_pct xap_terms);
 use List::Util qw(max);
+use POSIX qw(strftime);
 
 # values for searching, changing the numeric value breaks
 # compatibility with old indices (so don't change them it)
@@ -259,6 +260,72 @@ sub reopen {
 	$self; # make chaining easier
 }
 
+# Convert git "approxidate" ranges to something usable with our
+# Xapian indices.  At the moment, Xapian only offers a C++-only API
+# and neither the SWIG nor XS bindings allow us to use custom code
+# to parse dates (and libgit2 doesn't expose git__date_parse, either,
+# so we're running git-rev-parse(1)).
+sub date_range {
+	my ($git, $pfx, $range) = @_;
+	# are we inside a parenthesized statement?
+	my $end = $range =~ s/([\)\s]*)\z// ? $1 : '';
+	my @r = split(/\.\./, $range, 2);
+
+	# expand "d:20101002" => "d:20101002..20101003" and like
+	# n.b. git doesn't do YYYYMMDD w/o '-', it needs YYYY-MM-DD
+	if ($pfx eq 'd') {
+		if (!defined($r[1])) {
+			$r[0] =~ s/\A([0-9]{4})([0-9]{2})([0-9]{2})\z/$1-$2-$3/;
+			$r[0] = $git->date_parse($r[0]);
+			$r[1] = $r[0] + 86400;
+			for my $x (@r) {
+				$x = strftime('%Y%m%d', gmtime($x));
+			}
+		} else {
+			for my $x (@r) {
+				next if $x eq '' || $x =~ /\A[0-9]{8}\z/;
+				$x = strftime('%Y%m%d',
+						gmtime($git->date_parse($x)));
+			}
+		}
+	} elsif ($pfx eq 'dt') {
+		if (!defined($r[1])) { # git needs gaps and not /\d{14}/
+			$r[0] =~ s/\A([0-9]{4})([0-9]{2})([0-9]{2})
+					([0-9]{2})([0-9]{2})([0-9]{2})\z
+				/$1-$2-$3 $4:$5:$6/x;
+			$r[0] = $git->date_parse($r[0]);
+			$r[1] = $r[0] + 86400;
+			for my $x (@r) {
+				$x = strftime('%Y%m%d%H%M%S', gmtime($x));
+			}
+		} else {
+			for my $x (@r) {
+				next if $x eq '' || $x =~ /\A[0-9]{14}\z/;
+				$x = strftime('%Y%m%d%H%M%S',
+						gmtime($git->date_parse($x)));
+			}
+		}
+	} else { # "rt", let git interpret "YYYY", deal with Y10K later :P
+		for my $x (@r) {
+			next if $x eq '' || $x =~ /\A[0-9]{5,}\z/;
+			$x = $git->date_parse($x);
+		}
+		$r[1] //= $r[0] + 86400;
+	}
+	"$pfx:".join('..', @r).$end;
+}
+
+sub query_argv_to_string {
+	my (undef, $git, $argv) = @_;
+	join(' ', map {;
+		if (s!\b(d|rt|dt):(.+)\z!date_range($git, $1, $2)!sge) {
+			$_;
+		} else {
+			/\s/ ? (s/\A(\w+:)// ? qq{$1"$_"} : qq{"$_"}) : $_
+		}
+	} @$argv);
+}
+
 # read-only
 sub mset {
 	my ($self, $query_string, $opts) = @_;
diff --git a/t/search.t b/t/search.t
index b2958c00..56c7db1c 100644
--- a/t/search.t
+++ b/t/search.t
@@ -9,6 +9,7 @@ require PublicInbox::SearchIdx;
 require PublicInbox::Inbox;
 require PublicInbox::InboxWritable;
 use PublicInbox::Eml;
+use POSIX qw(strftime);
 my ($tmpdir, $for_destroy) = tmpdir();
 my $git_dir = "$tmpdir/a.git";
 my $ibx = PublicInbox::Inbox->new({ inboxdir => $git_dir });
@@ -534,4 +535,47 @@ $ibx->with_umask(sub {
 		'Subject search reaches inside message/rfc822');
 });
 
+SKIP: {
+	local $ENV{TZ} = 'UTC';
+	my $now = strftime('%H:%M:%S', gmtime(time));
+	if ($now =~ /\A23:(?:59|60)/ || $now =~ /\A00:00:0[01]\z/) {
+		skip 'too close to midnight, time is tricky', 6;
+	}
+	my ($s, $g) = ($ibx->search, $ibx->git);
+	my $q = $s->query_argv_to_string($g, [qw(d:20101002 blah)]);
+	is($q, 'd:20101002..20101003 blah', 'YYYYMMDD expanded to range');
+	$q = $s->query_argv_to_string($g, [qw(d:2010-10-02)]);
+	is($q, 'd:20101002..20101003', 'YYYY-MM-DD expanded to range');
+	$q = $s->query_argv_to_string($g, [qw(rt:2010-10-02.. yy)]);
+	$q =~ /\Art:(\d+)\.\. yy/ or fail("rt: expansion failed: $q");
+	is(strftime('%Y-%m-%d', gmtime($1//0)), '2010-10-02', 'rt: beg expand');
+	$q = $s->query_argv_to_string($g, [qw(rt:..2010-10-02 zz)]);
+	$q =~ /\Art:\.\.(\d+) zz/ or fail("rt: expansion failed: $q");
+	is(strftime('%Y-%m-%d', gmtime($1//0)), '2010-10-02', 'rt: end expand');
+	$q = $s->query_argv_to_string($g, [qw(something dt:2010-10-02..)]);
+	like($q, qr/\Asomething dt:20101002\d{6}\.\./, 'dt: expansion');
+	$q = $s->query_argv_to_string($g, [qw(x d:yesterday.. y)]);
+	is($q, strftime('x d:%Y%m%d.. y', gmtime(time - 86400)),
+		'"yesterday" handled');
+	$q = $s->query_argv_to_string($g, [qw(x dt:20101002054123)]);
+	is($q, 'x dt:20101002054123..20101003054123', 'single dt: expanded');
+	$q = $s->query_argv_to_string($g, [qw(x dt:2010-10-02T05:41:23Z)]);
+	is($q, 'x dt:20101002054123..20101003054123', 'ISO8601 dt: expanded');
+	$q = $s->query_argv_to_string($g, [qw(rt:1970..1971)]);
+	$q =~ /\Art:(\d+)\.\.(\d+)\z/ or fail "YYYY rt: expansion: $q";
+	my ($beg, $end) = ($1, $2);
+	is(strftime('%Y', gmtime($beg)), 1970, 'rt: starts at 1970');
+	is(strftime('%Y', gmtime($end)), 1971, 'rt: ends at 1971');
+	$q = $s->query_argv_to_string($g, [qw(rt:1970-01-01)]);
+	$q =~ /\Art:(\d+)\.\.(\d+)\z/ or fail "YYYY-MM-DD rt: expansion: $q";
+	($beg, $end) = ($1, $2);
+	is(strftime('%Y-%m-%d', gmtime($beg)), '1970-01-01',
+			'rt: date-only w/o range');
+	is(strftime('%Y-%m-%d', gmtime($end)), '1970-01-02',
+			'rt: date-only auto-end');
+	$q = $s->query_argv_to_string($g, [qw{OR (rt:1993-10-02)}]);
+	like($q, qr/\AOR \(rt:749\d{6}\.\.749\d{6}\)\z/,
+		'trailing parentheses preserved');
+}
+
 done_testing();

^ permalink raw reply related	[relevance 42%]

* [PATCHv2 01/13] lei q: improve remote mboxrd UX + MUA
  2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
@ 2021-02-08  9:05 32% ` Eric Wong
  2021-02-08  9:05 64% ` [PATCH 03/13] lei q: SIGWINCH process group with the terminal Eric Wong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-08  9:05 UTC (permalink / raw)
  To: meta

For early MUA spawners using lock-free outputs, we need to
on the startq pipe to silence progress reporting.  For
--augment users, we can start the MUA even earlier by
creating Maildirs in the pre-augment phase.

To improve progress reporting for non-MUA (or late-MUA)
spawners, we'll no longer blindly append "--compressed" to the
curl(1) command when POST-ing for the gzipped mboxrd.
Furthermore, we'll overload stringify ('""') in LeiCurl to
ensure the empty -d '' string shows up properly.
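
The stringify overload can be pictured roughly like this.  It is a
sketch under assumptions: My::CurlCmd is a made-up stand-in and not
the actual PublicInbox::LeiCurl code, and the curl arguments are
only examples.

```perl
use strict;
use warnings;

package My::CurlCmd; # hypothetical stand-in, not PublicInbox::LeiCurl
use overload '""' => sub {
	my ($self) = @_;
	# show empty arguments as '' so they stay visible in messages
	join(' ', map { $_ eq '' ? "''" : $_ } @$self);
};

sub new {
	my ($class, @argv) = @_;
	bless [ @argv ], $class;
}

package main;

my $cmd = My::CurlCmd->new(qw(curl -Sf -d), '', 'https://example.com/');
print $cmd, "\n";
```

Without the overload, joining the argv with spaces would make the
empty -d argument vanish from whatever gets shown to the user.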

v2: fix startq waiting with --threads.
    mset_progress is never shown with early MUA spawning.
    The plan is to still show progress when augmenting and
    deduping.  This fixes all local search cases.
    A leftover debug bit is dropped, too.
---
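A rough single-process sketch of the startq gate, assuming the
reader treats EOF as "proceed normally" and a 'q' byte as
"proceed quietly", as wait_startq does in the diff below.
wait_gate is a hypothetical name, and a real setup would have
the reader in a separate worker process.

```perl
use strict;
use warnings;

# Hypothetical helper: block until the writer either closes the pipe
# (EOF, returns '') or sends a single byte (returned as-is).
sub wait_gate {
	my ($rd) = @_;
	while (1) {
		my $n = sysread($rd, my $byte, 1);
		return $n ? $byte : '' if defined $n;
		die "sysread: $!" unless $!{EINTR}; # retry if interrupted
	}
}

pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, 'q') // die "syswrite: $!"; # ask the reader to go quiet
close($wr) or die "close: $!";
my $first = wait_gate($rd);  # 'q'
my $second = wait_gate($rd); # '' since the write end is closed
print "first=$first second=", ($second eq '' ? 'EOF' : $second), "\n";
```

Writing one byte per worker and then closing the write end lets
every reader see either the byte or EOF, so none of them block
forever.
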
 lib/PublicInbox/IPC.pm         |  8 ++--
 lib/PublicInbox/LEI.pm         |  4 +-
 lib/PublicInbox/LeiCurl.pm     | 11 +++--
 lib/PublicInbox/LeiMirror.pm   |  5 +-
 lib/PublicInbox/LeiOverview.pm |  3 +-
 lib/PublicInbox/LeiToMail.pm   | 24 +++++-----
 lib/PublicInbox/LeiXSearch.pm  | 88 +++++++++++++++++++++-------------
 7 files changed, 86 insertions(+), 57 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index c8673e26..9331233a 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -109,7 +109,6 @@ sub ipc_worker_spawn {
 		$w_res->autoflush(1);
 		$SIG{$_} = 'IGNORE' for (qw(TERM INT QUIT));
 		local $0 = $ident;
-		PublicInbox::DS::sig_setmask($sigset);
 		# ensure we properly exit even if warn() dies:
 		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		eval {
@@ -117,6 +116,7 @@ sub ipc_worker_spawn {
 			local @$self{keys %$fields} = values(%$fields);
 			my $on_destroy = $self->ipc_atfork_child;
 			local %SIG = %SIG;
+			PublicInbox::DS::sig_setmask($sigset);
 			ipc_worker_loop($self, $r_req, $w_res);
 		};
 		warn "worker $ident PID:$$ died: $@\n" if $@;
@@ -293,7 +293,6 @@ sub _wq_worker_start ($$$) {
 		$SIG{$_} = 'IGNORE' for (qw(PIPE));
 		$SIG{$_} = 'DEFAULT' for (qw(TTOU TTIN TERM QUIT INT CHLD));
 		local $0 = $self->{-wq_ident};
-		PublicInbox::DS::sig_setmask($oldset);
 		# ensure we properly exit even if warn() dies:
 		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		eval {
@@ -301,6 +300,7 @@ sub _wq_worker_start ($$$) {
 			local @$self{keys %$fields} = values(%$fields);
 			my $on_destroy = $self->ipc_atfork_child;
 			local %SIG = %SIG;
+			PublicInbox::DS::sig_setmask($oldset);
 			wq_worker_loop($self);
 		};
 		warn "worker $self->{-wq_ident} PID:$$ died: $@" if $@;
@@ -395,9 +395,9 @@ sub wq_close {
 }
 
 sub wq_kill_old {
-	my ($self) = @_;
+	my ($self, $sig) = @_;
 	my $pids = $self->{"-wq_old_pids.$$"} or return;
-	kill 'TERM', @$pids;
+	kill($sig // 'TERM', @$pids);
 }
 
 sub wq_kill {
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index dce80762..c3645698 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -741,7 +741,9 @@ sub start_mua {
 	} elsif ($self->{oneshot}) {
 		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
 	}
-	delete $self->{-progress};
+	if ($self->{lxs} && $self->{au_done}) { # kick wait_startq
+		syswrite($self->{au_done}, 'q' x ($self->{lxs}->{jobs} // 0));
+	}
 }
 
 # caller needs to "-t $self->{1}" to check if tty
diff --git a/lib/PublicInbox/LeiCurl.pm b/lib/PublicInbox/LeiCurl.pm
index 38b17c78..f346a1b4 100644
--- a/lib/PublicInbox/LeiCurl.pm
+++ b/lib/PublicInbox/LeiCurl.pm
@@ -8,6 +8,12 @@ use v5.10.1;
 use PublicInbox::Spawn qw(which);
 use PublicInbox::Config;
 
+# Ensures empty strings are quoted, we don't need more
+# sophisticated quoting than for empty strings: curl -d ''
+use overload '""' => sub {
+	join(' ', map { $_ eq '' ?  "''" : $_ } @{$_[0]});
+};
+
 my %lei2curl = (
 	'curl-config=s@' => 'config|K=s@',
 );
@@ -63,10 +69,9 @@ EOM
 
 # completes the result of cmd() for $uri
 sub for_uri {
-	my ($self, $lei, $uri) = @_;
+	my ($self, $lei, $uri, @opt) = @_;
 	my $pfx = torsocks($self, $lei, $uri) or return; # error
-	[ @$pfx, @$self, substr($uri->path, -3) eq '.gz' ? () : '--compressed',
-		$uri->as_string ]
+	bless [ @$pfx, @$self, @opt, $uri->as_string ], ref($self);
 }
 
 1;
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 5ba69287..c5153148 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -31,7 +31,7 @@ sub try_scrape {
 	my $uri = URI->new($self->{src});
 	my $lei = $self->{lei};
 	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
-	my $cmd = $curl->for_uri($lei, $uri);
+	my $cmd = $curl->for_uri($lei, $uri, '--compressed');
 	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
 	my $fh = popen_rd($cmd, $lei->{env}, $opt);
 	my $html = do { local $/; <$fh> } // die "read(curl $uri): $!";
@@ -93,8 +93,7 @@ sub _try_config {
 	my $path = $uri->path;
 	chop($path) eq '/' or die "BUG: $uri not canonicalized";
 	$uri->path($path . '/_/text/config/raw');
-	my $cmd = $self->{curl}->for_uri($lei, $uri);
-	push @$cmd, '--compressed'; # curl decompresses for us
+	my $cmd = $self->{curl}->for_uri($lei, $uri, '--compressed');
 	my $ce = "$dst/inbox.config.example";
 	my $f = "$ce-$$.tmp";
 	open(my $fh, '+>', $f) or return $lei->err("open $f: $! (non-fatal)");
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index dcfb9cc7..f0ac4684 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -95,9 +95,10 @@ sub new {
 		$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
 	} else {
 		# default to the cheapest sort since MUA usually resorts
-		$lei->{opt}->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
+		$opt->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
 		$lei->{l2m} = eval { PublicInbox::LeiToMail->new($lei) };
 		return $lei->fail($@) if $@;
+		$lei->{early_mua} = 1 if $opt->{mua} && $lei->{l2m}->lock_free;
 	}
 	$self;
 }
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 4c5a5685..a5a196db 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -371,7 +371,17 @@ sub new {
 	$self;
 }
 
-sub _pre_augment_maildir {} # noop
+sub _pre_augment_maildir {
+	my ($self, $lei) = @_;
+	my $dst = $lei->{ovv}->{dst};
+	for my $x (qw(tmp new cur)) {
+		my $d = $dst.$x;
+		next if -d $d;
+		require File::Path;
+		File::Path::mkpath($d);
+		-d $d or die "$d is not a directory";
+	}
+}
 
 sub _do_augment_maildir {
 	my ($self, $lei) = @_;
@@ -388,17 +398,7 @@ sub _do_augment_maildir {
 	}
 }
 
-sub _post_augment_maildir {
-	my ($self, $lei) = @_;
-	my $dst = $lei->{ovv}->{dst};
-	for my $x (qw(tmp new cur)) {
-		my $d = $dst.$x;
-		next if -d $d;
-		require File::Path;
-		File::Path::mkpath($d);
-		-d $d or die "$d is not a directory";
-	}
-}
+sub _post_augment_maildir {} # noop
 
 sub _pre_augment_mbox {
 	my ($self, $lei) = @_;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 2794140a..db089a67 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -101,20 +101,34 @@ sub _mset_more ($$) {
 # $startq will EOF when query_prepare is done augmenting and allow
 # query_mset and query_thread_mset to proceed.
 sub wait_startq ($) {
-	my ($startq) = @_;
-	$_[0] = undef;
-	read($startq, my $query_prepare_done, 1);
+	my ($lei) = @_;
+	my $startq = delete $lei->{startq} or return;
+	while (1) {
+		my $n = sysread($startq, my $query_prepare_done, 1);
+		if (defined $n) {
+			return if $n == 0; # no MUA
+			if ($query_prepare_done eq 'q') {
+				$lei->{opt}->{quiet} = 1;
+				delete $lei->{opt}->{verbose};
+				delete $lei->{-progress};
+			} else {
+				$lei->fail("$$ WTF `$query_prepare_done'");
+			}
+			return;
+		}
+		return $lei->fail("$$ wait_startq: $!") unless $!{EINTR};
+	}
 }
 
 sub mset_progress {
 	my $lei = shift;
-	return unless $lei->{-progress};
+	return if $lei->{early_mua} || !$lei->{-progress};
 	if ($lei->{pkt_op_p}) {
 		pkt_do($lei->{pkt_op_p}, 'mset_progress', @_);
 	} else { # single lei-daemon consumer
 		my ($desc, $mset_size, $mset_total_est) = @_;
 		$lei->{-mset_total} += $mset_size;
-		$lei->err("# $desc $mset_size/$mset_total_est");
+		$lei->qerr("# $desc $mset_size/$mset_total_est");
 	}
 }
 
@@ -122,7 +136,6 @@ sub query_thread_mset { # for --threads
 	my ($self, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
 	my $lei = $self->{lei};
-	my $startq = delete $lei->{startq};
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
 	my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
 	return warn("$desc not indexed by Xapian\n") unless ($srch && $over);
@@ -140,7 +153,7 @@ sub query_thread_mset { # for --threads
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
-				wait_startq($startq) if $startq;
+				wait_startq($lei);
 				my $mitem = delete $n2item{$smsg->{num}};
 				$each_smsg->($smsg, $mitem);
 			}
@@ -155,7 +168,6 @@ sub query_mset { # non-parallel for non-"--threads" users
 	my ($self) = @_;
 	local $0 = "$0 query_mset";
 	my $lei = $self->{lei};
-	my $startq = delete $lei->{startq};
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	for my $loc (locals($self)) {
@@ -168,7 +180,7 @@ sub query_mset { # non-parallel for non-"--threads" users
 				$mset->size, $mset->get_matches_estimated);
 		for my $mitem ($mset->items) {
 			my $smsg = smsg_for($self, $mitem) or next;
-			wait_startq($startq) if $startq;
+			wait_startq($lei);
 			$each_smsg->($smsg, $mitem);
 		}
 	} while (_mset_more($mset, $mo));
@@ -183,7 +195,7 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$smsg->parse_references($eml, mids($eml));
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
 	delete @$smsg{qw(From Subject -ds -ts)};
-	if (my $startq = delete($lei->{startq})) { wait_startq($startq) }
+	wait_startq($lei);
 	if ($lei->{-progress}) {
 		++$lei->{-nr_remote_eml};
 		my $now = now();
@@ -210,7 +222,6 @@ sub query_remote_mboxrd {
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
 	fcntl($cerr, F_SETFL, O_APPEND|O_RDWR) or warn "set O_APPEND: $!";
 	my $rdr = { 2 => $cerr, pgid => 0 };
-	my $coff = 0;
 	my $sigint_reap = $lei->can('sigint_reap');
 	if ($verbose) {
 		# spawn a process to force line-buffering, otherwise curl
@@ -228,13 +239,14 @@ sub query_remote_mboxrd {
 		$lei->{-nr_remote_eml} = 0;
 		$uri->query_form(@qform);
 		my $cmd = $curl->for_uri($lei, $uri);
-		$lei->err("# @$cmd") if $verbose;
+		$lei->qerr("# $cmd");
 		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
 		$reap_curl = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 		$fh = IO::Uncompress::Gunzip->new($fh);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
 						$lei, $each_smsg);
-		my $err = waitpid($pid, 0) == $pid ? undef : "BUG: waitpid: $!";
+		my $err = waitpid($pid, 0) == $pid ? undef
+						: "BUG: waitpid($cmd): $!";
 		@$reap_curl = (); # cancel OnDestroy
 		die $err if $err;
 		if ($? == 0) {
@@ -242,16 +254,18 @@ sub query_remote_mboxrd {
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
 			next;
 		}
-		seek($cerr, $coff, SEEK_SET) or warn "seek(curl stderr): $!\n";
-		my $e = do { local $/; <$cerr> } //
-				die "read(curl stderr): $!\n";
-		$coff += length($e);
-		truncate($cerr, 0);
-		next if (($? >> 8) == 22 && $e =~ /\b404\b/);
-		$lei->child_error($?);
+		$err = '';
+		if (-s $cerr) {
+			seek($cerr, 0, SEEK_SET) or
+					$lei->err("seek($cmd stderr): $!");
+			$err = do { local $/; <$cerr> } //
+					"read($cmd stderr): $!";
+			truncate($cerr, 0) or
+					$lei->err("truncate($cmd stderr): $!");
+		}
+		next if (($? >> 8) == 22 && $err =~ /\b404\b/);
 		$uri->query_form(q => $lei->{mset_opt}->{qstr});
-		# --verbose already showed the error via tail(1)
-		$lei->err("E: $uri \$?=$?\n", $verbose ? () : $e);
+		$lei->child_error($?, "E: <$uri> $err");
 	}
 	undef $each_smsg;
 	$lei->{ovv}->ovv_atexit_child($lei);
@@ -311,15 +325,23 @@ Error closing $lei->{ovv}->{dst}: $!
 
 sub do_post_augment {
 	my ($lei) = @_;
-	eval { $lei->{l2m}->post_augment($lei) };
-	if (my $err = $@) {
-		if (my $lxs = delete $lei->{lxs}) {
-			$lxs->wq_kill;
-			$lxs->wq_close(0, undef, $lei);
+	my $l2m = $lei->{l2m};
+	my $err;
+	if ($l2m) {
+		eval { $l2m->post_augment($lei) };
+		$err = $@;
+		if ($err) {
+			if (my $lxs = delete $lei->{lxs}) {
+				$lxs->wq_kill;
+				$lxs->wq_close(0, undef, $lei);
+			}
+			$lei->fail("$err");
 		}
-		$lei->fail("$err");
 	}
-	close(delete $lei->{au_done}); # triggers wait_startq
+	if (!$err && delete $lei->{early_mua}) { # non-augment case
+		$lei->start_mua;
+	}
+	close(delete $lei->{au_done}); # triggers wait_startq in lei_xsearch
 }
 
 my $MAX_PER_HOST = 4;
@@ -334,9 +356,6 @@ sub concurrency {
 
 sub start_query { # always runs in main (lei-daemon) process
 	my ($self, $lei) = @_;
-	if (my $l2m = $lei->{l2m}) {
-		$lei->start_mua if $l2m->lock_free;
-	}
 	if ($lei->{opt}->{threads}) {
 		for my $ibxish (locals($self)) {
 			$self->wq_io_do('query_thread_mset', [], $ibxish);
@@ -387,6 +406,9 @@ sub do_query {
 	my $l2m = $lei->{l2m};
 	if ($l2m) {
 		$l2m->pre_augment($lei);
+		if ($lei->{opt}->{augment} && delete $lei->{early_mua}) {
+			$lei->start_mua;
+		}
 		$l2m->wq_workers_start('lei2mail', $l2m->{jobs},
 					$lei->oldset, { lei => $lei });
 		pipe($lei->{startq}, $lei->{au_done}) or die "pipe: $!";
@@ -404,7 +426,7 @@ sub do_query {
 	delete $lei->{pkt_op_p};
 	$l2m->wq_close(1) if $l2m;
 	$lei->event_step_init; # wait for shutdowns
-	$self->wq_io_do('query_prepare', []) if $l2m;
+	$self->wq_io_do('query_prepare', []) if $l2m; # for augment/dedupe
 	start_query($self, $lei);
 	$self->wq_close(1); # lei_xsearch workers stop when done
 	if ($lei->{oneshot}) {

^ permalink raw reply related	[relevance 32%]

* [PATCH 08/11] lei q: prefix --alert ops with ':' instead of '-'
    2021-02-09  8:09 36% ` [PATCH 05/11] lei: split out MdirReader package, lazy-require earlier Eric Wong
@ 2021-02-09  8:09 64% ` Eric Wong
  2021-02-09  8:09 67% ` [PATCH 10/11] lei: replace "I:"-prefixed info messages with "#" Eric Wong
  2021-02-09  8:09 81% ` [PATCH 11/11] tests|lei: fixes for TEST_RUN_MODE=0 and lei oneshot Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-09  8:09 UTC (permalink / raw)
  To: meta

Dash-prefixed keywords confuse the option parser unless "=" is
used (and bash completion doesn't yet work with "=").

So use ":" instead of "-" as the prefix for internal ops,
since ":" is just as unlikely to be the first character of
an executable file in a user's $PATH.
---
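A small sketch of how ':'-prefixed ops can be told apart from
commands, assuming the same unescaped-comma split used by
poke_mua in the diff below.  classify_alerts and the "echo
done" command are hypothetical and only for illustration.

```perl
use strict;
use warnings;

# Hypothetical classifier: ':'-prefixed words are internal ops,
# anything else is treated as a command to run.
sub classify_alerts {
	my (@alerts) = @_;
	my @out;
	while (defined(my $op = shift @alerts)) {
		if ($op =~ /(?<!\\),/) { # unescaped ',': split and requeue
			push @alerts, split(/(?<!\\),/, $op);
		} elsif ($op =~ /\A:(\w+)\z/) {
			push @out, [ op => $1 ];
		} else {
			push @out, [ cmd => $op ];
		}
	}
	@out;
}

my @res = classify_alerts(':WINCH,:bell', 'echo done');
print join(' ', map { "$_->[0]=$_->[1]" } @res), "\n";
```

Since split pieces are pushed to the end of the queue, ops from
a comma-separated argument are handled after any arguments that
followed it.
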
 lib/PublicInbox/LEI.pm         | 8 ++++----
 lib/PublicInbox/LeiOverview.pm | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e2a945a4..e29b13c3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -227,9 +227,9 @@ my %OPTDESC = (
 'show	threads|t' => 'display entire thread a message belongs to',
 'q	threads|t' =>
 	'return all messages in the same threads as the actual match(es)',
-'alert=s@' => ['CMD,-WINCH,-bell,<any command>',
+'alert=s@' => ['CMD,:WINCH,:bell,<any command>',
 	'run command(s) or perform ops when done writing to output ' .
-	'(default: "-WINCH,-bell" with --mua and Maildir/IMAP output, ' .
+	'(default: ":WINCH,:bell" with --mua and Maildir/IMAP output, ' .
 	'nothing otherwise)' ],
 
 'augment|a' => 'augment --output destination instead of clobbering',
@@ -758,14 +758,14 @@ sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
 	my ($self) = @_;
 	my $alerts = $self->{opt}->{alert} // return;
 	while (my $op = shift(@$alerts)) {
-		if ($op eq '-WINCH') {
+		if ($op eq ':WINCH') {
 			# hit the process group that started the MUA
 			if ($self->{sock}) {
 				send($self->{sock}, '-WINCH', MSG_EOR);
 			} elsif ($self->{oneshot}) {
 				kill('-WINCH', $$);
 			}
-		} elsif ($op eq '-bell') {
+		} elsif ($op eq ':bell') {
 			out($self, "\a");
 		} elsif ($op =~ /(?<!\\),/) { # bare ',' (not ',,')
 			push @$alerts, split(/(?<!\\),/, $op);
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 98c89d12..c820f0d7 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -100,7 +100,7 @@ sub new {
 		return $lei->fail($@) if $@;
 		if ($opt->{mua} && $lei->{l2m}->lock_free) {
 			$lei->{early_mua} = 1;
-			$opt->{alert} //= [ '-WINCH,-bell' ] if -t $lei->{1};
+			$opt->{alert} //= [ ':WINCH,:bell' ] if -t $lei->{1};
 		}
 	}
 	$self;

^ permalink raw reply related	[relevance 64%]

* [PATCH 05/11] lei: split out MdirReader package, lazy-require earlier
  @ 2021-02-09  8:09 36% ` Eric Wong
  2021-02-09  8:09 64% ` [PATCH 08/11] lei q: prefix --alert ops with ':' instead of '-' Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-09  8:09 UTC (permalink / raw)
  To: meta

We'll do more requires in the top-level lei-daemon process to
save work in workers.  We can also work towards aborting on
user errors in lei-daemon rather than worker processes.

"lei import -f mbox*" is finally tested inside t/lei_to_mail.t.
---
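A minimal sketch of the lazy-require pattern, using the core
File::Basename module as a stand-in for MdirReader.
lazy_basename is a hypothetical wrapper; the point is only the
"require once, cache the code ref" shape used for
$maildir_each_file in the diff below.

```perl
use strict;
use warnings;

my $basename; # resolved on first use

# Hypothetical wrapper: load File::Basename only when it is needed
# and cache the code ref so later calls skip the lookup.
sub lazy_basename {
	$basename //= do {
		require File::Basename;
		File::Basename->can('basename') // die 'BUG: no basename';
	};
	$basename->(@_);
}

my $last = lazy_basename('/tmp/maildir/cur');
print "$last\n"; # cur
```

Doing the require in the parent before forking means workers
inherit the loaded module instead of each loading it again.
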
 MANIFEST                      |  1 +
 lib/PublicInbox/LeiImport.pm  | 25 +++++++++++++++----------
 lib/PublicInbox/LeiToMail.pm  | 26 ++++++++++----------------
 lib/PublicInbox/MdirReader.pm | 21 +++++++++++++++++++++
 lib/PublicInbox/TestCommon.pm |  4 +++-
 t/lei-import.t                |  5 ++++-
 t/lei_to_mail.t               | 19 ++++++++++++++++---
 7 files changed, 70 insertions(+), 31 deletions(-)
 create mode 100644 lib/PublicInbox/MdirReader.pm

diff --git a/MANIFEST b/MANIFEST
index 7f417743..6b3fc812 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -199,6 +199,7 @@ lib/PublicInbox/ManifestJsGz.pm
 lib/PublicInbox/Mbox.pm
 lib/PublicInbox/MboxGz.pm
 lib/PublicInbox/MboxReader.pm
+lib/PublicInbox/MdirReader.pm
 lib/PublicInbox/MiscIdx.pm
 lib/PublicInbox/MiscSearch.pm
 lib/PublicInbox/MsgIter.pm
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index a63bfdfd..8358d9d4 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -6,7 +6,6 @@ package PublicInbox::LeiImport;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC);
-use PublicInbox::MboxReader;
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::PktOp;
@@ -37,8 +36,17 @@ sub call { # the main "lei import" method
 	$lei->{opt}->{kw} //= 1;
 	my $fmt = $lei->{opt}->{'format'};
 	my $self = $lei->{imp} = bless {}, $cls;
-	if (my @f = grep { -f } @argv && !$fmt) {
-		return $lei->fail("--format unset for regular files:\n@f");
+	my @f;
+	for my $x (@argv) {
+		if (-f $x) { push @f, $x }
+		elsif (-d _) { require PublicInbox::MdirReader }
+	}
+	(@f && !$fmt) and
+		return $lei->fail("--format unset for regular file(s):\n@f");
+	if (@f && $fmt ne 'eml') {
+		require PublicInbox::MboxReader;
+		PublicInbox::MboxReader->can($fmt) or
+			return $lei->fail( "--format=$fmt unrecognized\n");
 	}
 	$self->{0} = $lei->{0} if $lei->{opt}->{stdin};
 	my $ops = {
@@ -83,11 +91,9 @@ error reading $x: $!
 
 			my $eml = PublicInbox::Eml->new(\$buf);
 			_import_eml($eml, $lei->{sto}, $set_kw);
-		} else { # some mbox
-			my $cb = PublicInbox::MboxReader->can($fmt);
-			$cb or return $lei->child_error(1 >> 8, <<"");
---format $fmt unsupported for $x
-
+		} else { # some mbox (->can already checked in call);
+			my $cb = PublicInbox::MboxReader->can($fmt) //
+				die "BUG: bad fmt=$fmt";
 			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
 		}
 	};
@@ -109,8 +115,7 @@ unable to open $x: $!
 
 		_import_fh($lei, $fh, $x);
 	} elsif (-d _ && (-d "$x/cur" || -d "$x/new")) {
-		require PublicInbox::LeiToMail;
-		PublicInbox::LeiToMail::maildir_each_file($x,
+		PublicInbox::MdirReader::maildir_each_file($x,
 					\&_import_maildir,
 					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a5a196db..e3e512be 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -18,6 +18,7 @@ use Symbol qw(gensym);
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
 use Errno qw(EEXIST ESPIPE ENOENT EPIPE);
+my ($maildir_each_file);
 
 # struggles with short-lived repos, Gcf2Client makes little sense with lei;
 # but we may use in-process libgit2 in the future.
@@ -266,18 +267,6 @@ sub _mbox_write_cb ($$) {
 	}
 }
 
-sub maildir_each_file ($$;@) {
-	my ($dir, $cb, @arg) = @_;
-	$dir .= '/' unless substr($dir, -1) eq '/';
-	for my $d (qw(new/ cur/)) {
-		my $pfx = $dir.$d;
-		opendir my $dh, $pfx or next;
-		while (defined(my $fn = readdir($dh))) {
-			$cb->($pfx.$fn, @arg) if $fn =~ /:2,[A-Za-z]*\z/;
-		}
-	}
-}
-
 sub _augment_file { # maildir_each_file cb
 	my ($f, $lei) = @_;
 	my $eml = PublicInbox::InboxWritable::eml_from_path($f) or return;
@@ -354,11 +343,18 @@ sub new {
 	my $dst = $lei->{ovv}->{dst};
 	my $self = bless {}, $cls;
 	if ($fmt eq 'maildir') {
+		$maildir_each_file //= do {
+			require PublicInbox::MdirReader;
+			PublicInbox::MdirReader->can('maildir_each_file');
+		};
+		$lei->{opt}->{augment} and
+			require PublicInbox::InboxWritable; # eml_from_path
 		$self->{base_type} = 'maildir';
 		-e $dst && !-d _ and die
 				"$dst exists and is not a directory\n";
 		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
 	} elsif (substr($fmt, 0, 4) eq 'mbox') {
+		require PublicInbox::MboxReader if $lei->{opt}->{augment};
 		(-d $dst || (-e _ && !-w _)) and die
 			"$dst exists and is not a writable file\n";
 		$self->can("eml2$fmt") or die "bad mbox --format=$fmt\n";
@@ -389,12 +385,11 @@ sub _do_augment_maildir {
 	if ($lei->{opt}->{augment}) {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
-			require PublicInbox::InboxWritable; # eml_from_path
-			maildir_each_file($dst, \&_augment_file, $lei);
+			$maildir_each_file->($dst, \&_augment_file, $lei);
 			$dedupe->pause_dedupe;
 		}
 	} else { # clobber existing Maildir
-		maildir_each_file($dst, \&_unlink);
+		$maildir_each_file->($dst, \&_unlink);
 	}
 }
 
@@ -435,7 +430,6 @@ sub _do_augment_mbox {
 		my $rd = $zsfx ? decompress_src($out, $zsfx, $lei) :
 				dup_src($out);
 		my $fmt = $lei->{ovv}->{fmt};
-		require PublicInbox::MboxReader;
 		PublicInbox::MboxReader->$fmt($rd, \&_augment, $lei);
 	}
 	# maybe some systems don't honor O_APPEND, Perl does this:
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
new file mode 100644
index 00000000..c6a0e7a8
--- /dev/null
+++ b/lib/PublicInbox/MdirReader.pm
@@ -0,0 +1,21 @@
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Maildirs for now, MH eventually
+package PublicInbox::MdirReader;
+use strict;
+use v5.10.1;
+
+sub maildir_each_file ($$;@) {
+	my ($dir, $cb, @arg) = @_;
+	$dir .= '/' unless substr($dir, -1) eq '/';
+	for my $d (qw(new/ cur/)) {
+		my $pfx = $dir.$d;
+		opendir my $dh, $pfx or next;
+		while (defined(my $fn = readdir($dh))) {
+			$cb->($pfx.$fn, @arg) if $fn =~ /:2,[A-Za-z]*\z/;
+		}
+	}
+}
+
+1;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index ec9191b6..53f13437 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -14,7 +14,7 @@ BEGIN {
 	@EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
 		have_xapian_compact json_utf8 setup_public_inboxes
-		tcp_host_port test_lei $lei $lei_out $lei_err $lei_opt);
+		tcp_host_port test_lei lei $lei $lei_out $lei_err $lei_opt);
 	require Test::More;
 	my @methods = grep(!/\W/, @Test::More::EXPORT);
 	eval(join('', map { "*$_=\\&Test::More::$_;" } @methods));
@@ -457,6 +457,8 @@ our $lei = sub {
 	$res;
 };
 
+sub lei (@) { $lei->(@_) }
+
 sub json_utf8 () {
 	state $x = ref(PublicInbox::Config->json)->new->utf8->canonical;
 }
diff --git a/t/lei-import.t b/t/lei-import.t
index 709d89fa..b691798a 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -3,12 +3,14 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 test_lei(sub {
+ok(!$lei->(qw(import -f bogus), 't/plack-qp.eml'), 'fails with bogus format');
+like($lei_err, qr/\bbogus unrecognized/, 'gave error message');
 
 ok($lei->(qw(q s:boolean)), 'search miss before import');
 unlike($lei_out, qr/boolean/i, 'no results, yet');
 open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
 ok($lei->([qw(import -f eml -)], undef, { %$lei_opt, 0 => $fh }),
-	'import single file from stdin');
+	'import single file from stdin') or diag $lei_err;
 close $fh;
 ok($lei->(qw(q s:boolean)), 'search hit after import');
 ok($lei->(qw(import -f eml), 't/data/message_embed.eml'),
@@ -35,5 +37,6 @@ $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
 is_deeply($res->[0]->{kw}, [], 'no keywords set');
 
+# see t/lei_to_mail.t for "import -f mbox*"
 });
 done_testing;
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index a25795ca..77e9902e 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -10,6 +10,7 @@ use Fcntl qw(SEEK_SET);
 use PublicInbox::Spawn qw(popen_rd which);
 use List::Util qw(shuffle);
 require_mods(qw(DBD::SQLite));
+require PublicInbox::MdirReader;
 require PublicInbox::MboxReader;
 require PublicInbox::LeiOverview;
 require PublicInbox::LEI;
@@ -127,6 +128,17 @@ my $orig = do {
 	is(do { local $/; <$fh> }, $raw, 'jobs > 1');
 	$raw;
 };
+
+test_lei(sub {
+	ok(lei(qw(import -f), $mbox, $fn), 'imported mbox');
+	ok(lei(qw(q s:x)), 'lei q works') or diag $lei_err;
+	my $res = json_utf8->decode($lei_out);
+	my $x = $res->[0];
+	is($x->{'s'}, 'x', 'subject imported') or diag $lei_out;
+	is_deeply($x->{'kw'}, ['seen'], 'kw imported') or diag $lei_out;
+	is($res->[1], undef, 'only one result');
+});
+
 for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 	my $zsfx2cmd = PublicInbox::LeiToMail->can('zsfx2cmd');
 	SKIP: {
@@ -230,6 +242,7 @@ SKIP: { # FIFO support
 }
 
 { # Maildir support
+	my $each_file = PublicInbox::MdirReader->can('maildir_each_file');
 	my $md = "$tmpdir/maildir/";
 	my $wcb = $wcb_get->('maildir', $md);
 	is(ref($wcb), 'CODE', 'got Maildir callback');
@@ -237,7 +250,7 @@ SKIP: { # FIFO support
 	$wcb->(\(my $x = $buf), $b4dc0ffee);
 
 	my @f;
-	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @f, shift });
+	$each_file->($md, sub { push @f, shift });
 	open my $fh, $f[0] or BAIL_OUT $!;
 	is(do { local $/; <$fh> }, $buf, 'wrote to Maildir');
 
@@ -246,7 +259,7 @@ SKIP: { # FIFO support
 	$wcb->(\($x = $buf."\nx\n"), $deadcafe);
 
 	my @x = ();
-	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @x, shift });
+	$each_file->($md, sub { push @x, shift });
 	is(scalar(@x), 1, 'wrote one new file');
 	ok(!-f $f[0], 'old file clobbered');
 	open $fh, $x[0] or BAIL_OUT $!;
@@ -257,7 +270,7 @@ SKIP: { # FIFO support
 	$wcb->(\($x = $buf."\ny\n"), $deadcafe);
 	$wcb->(\($x = $buf."\ny\n"), $b4dc0ffee); # skipped by dedupe
 	@f = ();
-	PublicInbox::LeiToMail::maildir_each_file($md, sub { push @f, shift });
+	$each_file->($md, sub { push @f, shift });
 	is(scalar grep(/\A\Q$x[0]\E\z/, @f), 1, 'old file still there');
 	my @new = grep(!/\A\Q$x[0]\E\z/, @f);
 	is(scalar @new, 1, '1 new file written (b4dc0ffee skipped)');

^ permalink raw reply related	[relevance 36%]

* [PATCH 11/11] tests|lei: fixes for TEST_RUN_MODE=0 and lei oneshot
                     ` (2 preceding siblings ...)
  2021-02-09  8:09 67% ` [PATCH 10/11] lei: replace "I:"-prefixed info messages with "#" Eric Wong
@ 2021-02-09  8:09 81% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-09  8:09 UTC (permalink / raw)
  To: meta

DESTROY callbacks can clobber $?, so we must take care to
preserve it when exiting.  We'll also try to make an effort to
ensure better DESTROY ordering and delete as much as possible
before x_it finishes.

We also need to load PublicInbox::Config when setting up
public inboxes.
---
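A small sketch of preserving $? across a DESTROY callback,
following the save/restore shape in the diff below.  The Reaper
class is hypothetical, and system() merely stands in for
whatever waitpid-style cleanup a real destructor performs.

```perl
use strict;
use warnings;

package Reaper; # hypothetical class for illustration only

sub new { bless {}, shift }

sub DESTROY {
	my $err = $?;                # remember the pending exit status
	system($^X, '-e', 'exit 0'); # waiting on a child overwrites $?
	$? = $err if $err;           # put the earlier failure back
}

package main;

my $obj = Reaper->new;
$? = 3 << 8; # pretend an earlier child exited with status 3
undef $obj;  # DESTROY runs here
my $status = $? >> 8;
print "status after DESTROY: $status\n";
```

Restoring only when $err is non-zero keeps a failure from being
masked, while still letting cleanup report its own failure if
nothing had failed before.
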
 lib/PublicInbox/IPC.pm        | 2 ++
 lib/PublicInbox/LEI.pm        | 7 +++++--
 lib/PublicInbox/TestCommon.pm | 3 ++-
 t/lei-mirror.t                | 2 +-
 4 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 9331233a..efac4c4d 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -412,9 +412,11 @@ sub DESTROY {
 	my ($self) = @_;
 	my $ppid = $self->{-wq_ppid};
 	wq_kill($self) if $ppid && $ppid == $$;
+	my $err = $?;
 	wq_close($self);
 	wq_wait_old($self);
 	ipc_worker_stop($self);
+	$? = $err if $err;
 }
 
 sub detect_nproc () {
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 5f265087..dd831c54 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -336,8 +336,9 @@ sub x_it ($$) {
 			my $wq = delete $self->{$f} or next;
 			$wq->DESTROY;
 		}
-		# cleanup anything that has tempfiles
-		delete @$self{qw(ovv dedupe)};
+		# cleanup anything that has tempfiles or open file handles
+		%PATH2CFG = ();
+		delete @$self{qw(ovv dedupe sto cfg)};
 		if (my $signum = ($code & 127)) { # usually SIGPIPE (13)
 			$SIG{PIPE} = 'DEFAULT'; # $SIG{$signum} doesn't work
 			kill $signum, $$;
@@ -1072,8 +1073,10 @@ sub DESTROY {
 	my ($self) = @_;
 	$self->{1}->autoflush(1) if $self->{1};
 	stop_pager($self);
+	my $err = $?;
 	my $oneshot_pids = delete $self->{"pid.$self.$$"} or return;
 	waitpid($_, 0) for keys %$oneshot_pids;
+	$? = $err if $err; # preserve ->fail or ->x_it code
 }
 
 1;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 53f13437..63d45ac3 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -541,7 +541,6 @@ sub setup_public_inboxes () {
 	my $end = $lk->lock_for_scope;
 	return @ret if -f $stamp;
 
-	require PublicInbox::InboxWritable;
 	local $ENV{PI_CONFIG} = $pi_config;
 	for my $V (1, 2) {
 		run_script([qw(-init), "-V$V", "t$V",
@@ -549,6 +548,8 @@ sub setup_public_inboxes () {
 				"$test_home/t$V", "http://example.com/t$V",
 				"t$V\@example.com" ]) or BAIL_OUT "init v$V";
 	}
+	require PublicInbox::Config;
+	require PublicInbox::InboxWritable;
 	my $cfg = PublicInbox::Config->new;
 	my $seen = 0;
 	$cfg->each_inbox(sub {
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index e3707979..cbe300da 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -27,7 +27,7 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	like($lei_out, qr!\Q$t2\E!, 't2 added to ls-externals');
 
 	ok(!$lei->('add-external', $t2, '--mirror', "$http/t2/"),
-		'--mirror fails if reused');
+		'--mirror fails if reused') or diag "$lei_err.$lei_out = $?";
 
 	ok($lei->('ls-external'), 'ls-external');
 	like($lei_out, qr!\Q$t2\E!, 'still in ls-externals');

^ permalink raw reply related	[relevance 81%]

* [PATCH 10/11] lei: replace "I:"-prefixed info messages with "#"
    2021-02-09  8:09 36% ` [PATCH 05/11] lei: split out MdirReader package, lazy-require earlier Eric Wong
  2021-02-09  8:09 64% ` [PATCH 08/11] lei q: prefix --alert ops with ':' instead of '-' Eric Wong
@ 2021-02-09  8:09 67% ` Eric Wong
  2021-02-09  8:09 81% ` [PATCH 11/11] tests|lei: fixes for TEST_RUN_MODE=0 and lei oneshot Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-09  8:09 UTC (permalink / raw)
  To: meta

The "#" is what TAP <https://testanything.org/> uses,
which is also consistent with what our (and many other)
test suites emit.
---
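A short sketch of filtering "#"-prefixed informational lines
from captured stderr, mirroring the grep used in t/lei.t below.
non_info_lines and the sample messages are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return stderr lines that are not "#" comments
sub non_info_lines {
	my ($err) = @_;
	grep(!/^#/, split(/^/, $err));
}

my $err = "# /tmp/lei.config created\nE: something failed\n";
my @bad = non_info_lines($err);
print scalar(@bad), " unexpected line(s): @bad";
```
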
 lib/PublicInbox/LEI.pm | 6 +++---
 t/lei.t                | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e29b13c3..5f265087 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -559,7 +559,7 @@ sub _lei_cfg ($;$) {
 		open my $fh, '>>', $f or die "open($f): $!\n";
 		@st = stat($fh) or die "fstat($f): $!\n";
 		$cur_st = pack('dd', $st[10], $st[7]);
-		qerr($self, "I: $f created") if $self->{cmd} ne 'config';
+		qerr($self, "# $f created") if $self->{cmd} ne 'config';
 	}
 	my $cfg = PublicInbox::Config::git_config_dump($f);
 	$cfg->{-st} = $cur_st;
@@ -619,7 +619,7 @@ sub lei_init {
 	my @cur = stat($cur) if defined($cur);
 	$cur = File::Spec->canonpath($cur // $dir);
 	my @dir = stat($dir);
-	my $exists = "I: leistore.dir=$cur already initialized" if @dir;
+	my $exists = "# leistore.dir=$cur already initialized" if @dir;
 	if (@cur) {
 		if ($cur eq $dir) {
 			_lei_store($self, 1)->done;
@@ -638,7 +638,7 @@ E: leistore.dir=$cur already initialized and it is not $dir
 	}
 	lei_config($self, 'leistore.dir', $dir);
 	_lei_store($self, 1)->done;
-	$exists //= "I: leistore.dir=$dir newly initialized";
+	$exists //= "# leistore.dir=$dir newly initialized";
 	return qerr($self, $exists);
 }
 
diff --git a/t/lei.t b/t/lei.t
index 8e771eb5..4785acca 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -49,7 +49,7 @@ my $test_help = sub {
 
 my $ok_err_info = sub {
 	my ($msg) = @_;
-	is(grep(!/^I:/, split(/^/, $lei_err)), 0, $msg) or
+	is(grep(!/^#/, split(/^/, $lei_err)), 0, $msg) or
 		diag "$msg: err=$lei_err";
 };
 

^ permalink raw reply related	[relevance 67%]

* [PATCH 0/6] more lei stuffs
@ 2021-02-10  7:07 71% Eric Wong
  2021-02-10  7:07 47% ` [PATCH 1/6] lei *external: glob improvements, ls-external filtering Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-02-10  7:07 UTC (permalink / raw)
  To: meta

IMAP and NNTP lei-import support are coming.  Basically the stuff
in -watch, but the IMAP side will need to support flags (aka
keywords).

"lei ls-external" gains filtering + globbing support
(still needs shell completion support)

Eric Wong (6):
  lei *external: glob improvements, ls-external filtering
  lei_external: remove unnecessary Exporter use
  test_common: support lei-daemon only testing
  lei ls-external: support --local and --remote
  lei: note some TODO items (curl, externals)
  net_reader: new package split from -watch

 MANIFEST                       |   1 +
 lib/PublicInbox/LEI.pm         |   4 +-
 lib/PublicInbox/LeiCurl.pm     |   2 +
 lib/PublicInbox/LeiExternal.pm |  85 +++++++++----
 lib/PublicInbox/NetReader.pm   | 220 +++++++++++++++++++++++++++++++++
 lib/PublicInbox/TestCommon.pm  |  13 +-
 lib/PublicInbox/Watch.pm       | 204 +-----------------------------
 t/lei-externals.t              |  36 ++++--
 t/lei_external.t               |  20 ++-
 9 files changed, 343 insertions(+), 242 deletions(-)
 create mode 100644 lib/PublicInbox/NetReader.pm


^ permalink raw reply	[relevance 71%]

* [PATCH 1/6] lei *external: glob improvements, ls-external filtering
  2021-02-10  7:07 71% [PATCH 0/6] more lei stuffs Eric Wong
@ 2021-02-10  7:07 47% ` Eric Wong
  2021-02-10  7:07 71% ` [PATCH 3/6] test_common: support lei-daemon only testing Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-10  7:07 UTC (permalink / raw)
  To: meta

"lei ls-external" now accepts the same glob patterns used by
"lei q --{include,only,exclude}".  If no glob is detected, the
filter is treated as a literal substring match (like "grep -F").

Inverting matches is also supported ("grep -v").
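
The filtering described above can be sketched roughly as follows.
This is a simplified stand-in, not the glob2re() from this patch: it
only understands "*" and "?", always falls back to a plain substring
match, and the names simple_glob2re, filter_locations and the sample
locations are made up for illustration:

```perl
use strict;
use warnings;

# Simplified stand-in for the patch's glob2re(): only "*" and "?" are
# handled here (no [ranges] or {a,b} braces).  Returns undef when the
# input has no wildcard so the caller can fall back to a literal match.
sub simple_glob2re {
	my ($glob) = @_;
	return undef unless $glob =~ /[*?]/;
	join('', map {
		$_ eq '*' ? '[^/]*?' : $_ eq '?' ? '[^/]' : quotemeta($_)
	} split(//, $glob));
}

# Hypothetical filter: glob if possible, else "grep -F"-style substring;
# a true $invert flips the match like "grep -v".
sub filter_locations {
	my ($filter, $invert, @locs) = @_;
	my $re = simple_glob2re($filter) // quotemeta($filter);
	$invert ? grep(!/$re/, @locs) : grep(/$re/, @locs);
}

# made-up locations for illustration
my @ext = qw(/srv/inboxes/meta /srv/inboxes/git https://example.com/meta/);
print "$_\n" for filter_locations('m*a', 0, @ext);
print "--\n";
print "$_\n" for filter_locations('meta', 1, @ext);
```

Returning undef for "no glob found" keeps the fallback decision in the
caller, which is the same division of labor the patch uses.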
---
 lib/PublicInbox/LEI.pm         |  4 +-
 lib/PublicInbox/LeiExternal.pm | 74 ++++++++++++++++++++++++----------
 t/lei_external.t               | 20 +++++++--
 3 files changed, 71 insertions(+), 27 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index dd831c54..eb5a646e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -124,8 +124,8 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(boost=i c=s@ mirror=s no-torsocks torsocks=s inbox-version=i),
 	qw(quiet|q verbose|v+),
 	index_opt(), PublicInbox::LeiQuery::curl_opt() ],
-'ls-external' => [ '[FILTER...]', 'list publicinbox|extindex locations',
-	qw(format|f=s z|0 local remote quiet|q) ],
+'ls-external' => [ '[FILTER]', 'list publicinbox|extindex locations',
+	qw(format|f=s z|0 globoff|g invert-match|v local remote) ],
 'forget-external' => [ 'LOCATION...|--prune',
 	'exclude further results from a publicinbox|extindex',
 	qw(prune quiet|q) ],
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index b65dc87c..bac15226 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -22,22 +22,16 @@ sub externals_each {
 	# highest boost first, but stable for alphabetic tie break
 	use sort 'stable';
 	my @order = sort { $boost{$b} <=> $boost{$a} } sort keys %boost;
-	return @order if !$cb;
-	for my $loc (@order) {
-		$cb->(@arg, $loc, $boost{$loc});
+	if (ref($cb) eq 'CODE') {
+		for my $loc (@order) {
+			$cb->(@arg, $loc, $boost{$loc});
+		}
+	} elsif (ref($cb) eq 'HASH') {
+		%$cb = %boost;
 	}
 	@order; # scalar or array
 }
 
-sub lei_ls_external {
-	my ($self, @argv) = @_;
-	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
-	externals_each($self, sub {
-		my ($loc, $boost_val) = @_;
-		$self->out($loc, $OFS, 'boost=', $boost_val, $ORS);
-	});
-}
-
 sub ext_canonicalize {
 	my ($location) = @_;
 	if ($location !~ m!\Ahttps?://!) {
@@ -52,28 +46,47 @@ sub ext_canonicalize {
 	}
 }
 
-my %patmap = ('*' => '[^/]*?', '?' => '[^/]', '[' => '[', ']' => ']');
-sub glob2pat {
-	my ($glob) = @_;
-        $glob =~ s!(.)!$patmap{$1} || "\Q$1"!ge;
-        $glob;
+my %re_map = ( '*' => '[^/]*?', '?' => '[^/]',
+		'[' => '[', ']' => ']', ',' => ',' );
+
+sub glob2re {
+	my ($re) = @_;
+	my $p = '';
+	my $in_bracket = 0;
+	my $qm = 0;
+	my $changes = ($re =~ s!(.)!
+		$re_map{$p eq '\\' ? '' : do {
+			if ($1 eq '[') { ++$in_bracket }
+			elsif ($1 eq ']') { --$in_bracket }
+			$p = $1;
+		}} // do {
+			$p = $1;
+			($p eq '-' && $in_bracket) ? $p : (++$qm, "\Q$p")
+		}!sge);
+	# bashism (also supported by curl): {a,b,c} => (a|b|c)
+	$re =~ s/([^\\]*)\\\{([^,]*?,[^\\]*?)\\\}/
+		(my $in_braces = $2) =~ tr!,!|!;
+		$1."($in_braces)";
+		/sge;
+	($changes - $qm) ? $re : undef;
 }
 
+# get canonicalized externals list matching $loc
+# $is_exclude denotes it's for --exclude
+# otherwise it's for --only/--include is assumed
 sub get_externals {
-	my ($self, $loc, $exclude) = @_;
+	my ($self, $loc, $is_exclude) = @_;
 	return (ext_canonicalize($loc)) if -e $loc;
-
 	my @m;
 	my @cur = externals_each($self);
 	my $do_glob = !$self->{opt}->{globoff}; # glob by default
-	if ($do_glob && ($loc =~ /[\*\?]/s || $loc =~ /\[.*\]/s)) {
-		my $re = glob2pat($loc);
+	if ($do_glob && (my $re = glob2re($loc))) {
 		@m = grep(m!$re!, @cur);
 		return @m if scalar(@m);
 	} elsif (index($loc, '/') < 0) { # exact basename match:
 		@m = grep(m!/\Q$loc\E/?\z!, @cur);
 		return @m if scalar(@m) == 1;
-	} elsif ($exclude) { # URL, maybe:
+	} elsif ($is_exclude) { # URL, maybe:
 		my $canon = ext_canonicalize($loc);
 		@m = grep(m!\A\Q$canon\E\z!, @cur);
 		return @m if scalar(@m) == 1;
@@ -88,6 +101,23 @@ sub get_externals {
 	();
 }
 
+sub lei_ls_external {
+	my ($self, $filter) = @_;
+	my $do_glob = !$self->{opt}->{globoff}; # glob by default
+	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
+	$filter //= '*';
+	my $re = $do_glob ? glob2re($filter) : undef;
+	$re //= index($filter, '/') < 0 ?
+			qr!/\Q$filter\E/?\z! : # exact basename match
+			qr/\Q$filter\E/; # grep -F semantics
+	my @ext = externals_each($self, my $boost = {});
+	@ext = $self->{opt}->{'invert-match'} ? grep(!/$re/, @ext)
+					: grep(/$re/, @ext);
+	for my $loc (@ext) {
+		$self->out($loc, $OFS, 'boost=', $boost->{$loc}, $ORS);
+	}
+}
+
 sub add_external_finish {
 	my ($self, $location) = @_;
 	my $cfg = $self->_lei_cfg(1);
diff --git a/t/lei_external.t b/t/lei_external.t
index 587990db..0ef6633d 100644
--- a/t/lei_external.t
+++ b/t/lei_external.t
@@ -1,7 +1,8 @@
 #!perl -w
-use strict;
-use v5.10.1;
-use Test::More;
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# internal unit test, see t/lei-externals.t for functional tests
+use strict; use v5.10.1; use Test::More;
 my $cls = 'PublicInbox::LeiExternal';
 require_ok $cls;
 my $canon = $cls->can('ext_canonicalize');
@@ -15,4 +16,17 @@ is($canon->('/this/path/is/nonexistent/'), '/this/path/is/nonexistent',
 is($canon->('/this//path/'), '/this/path', 'extra slashes gone');
 is($canon->('/ALL/CAPS'), '/ALL/CAPS', 'caps preserved');
 
+my $glob2re = $cls->can('glob2re');
+is($glob2re->('foo'), undef, 'plain string unchanged');
+is_deeply($glob2re->('[f-o]'), '[f-o]' , 'range accepted');
+is_deeply($glob2re->('*'), '[^/]*?' , 'wildcard accepted');
+is_deeply($glob2re->('{a,b,c}'), '(a|b|c)' , 'braces');
+is_deeply($glob2re->('{,b,c}'), '(|b|c)' , 'brace with empty @ start');
+is_deeply($glob2re->('{a,b,}'), '(a|b|)' , 'brace with empty @ end');
+is_deeply($glob2re->('{a}'), undef, 'ungrouped brace');
+is_deeply($glob2re->('{a'), undef, 'open left brace');
+is_deeply($glob2re->('a}'), undef, 'open right brace');
+is_deeply($glob2re->('*.[ch]'), '[^/]*?\\.[ch]', 'suffix glob');
+is_deeply($glob2re->('{[a-z],9,}'), '([a-z]|9|)' , 'brace with range');
+
 done_testing;

^ permalink raw reply related	[relevance 47%]

* [PATCH 3/6] test_common: support lei-daemon only testing
  2021-02-10  7:07 71% [PATCH 0/6] more lei stuffs Eric Wong
  2021-02-10  7:07 47% ` [PATCH 1/6] lei *external: glob improvements, ls-external filtering Eric Wong
@ 2021-02-10  7:07 71% ` Eric Wong
  2021-02-10  7:07 49% ` [PATCH 4/6] lei ls-external: support --local and --remote Eric Wong
  2021-02-10  7:07 71% ` [PATCH 5/6] lei: note some TODO items (curl, externals) Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-10  7:07 UTC (permalink / raw)
  To: meta

Daemon-only tests can be significantly faster due to cached
configs; so give developers a chance to test only daemons to
improve productivity.

The differences between daemon and oneshot modes are minimal
at this point.
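
A minimal, self-contained sketch of the guard, assuming a plain
Test::More script rather than the real test_lei() wrapper.  The two
ok() calls are placeholders for the actual daemon and oneshot bodies:

```perl
use strict;
use warnings;
use Test::More;

# placeholder for the real daemon-mode test body
ok(1, 'daemon-mode checks ran');

SKIP: {
	# Same guard as the patch: leave the oneshot half early when the
	# developer sets TEST_LEI_DAEMON_ONLY in the environment.
	$ENV{TEST_LEI_DAEMON_ONLY} and
		skip 'TEST_LEI_DAEMON_ONLY set', 1;
	ok(1, 'oneshot-mode checks ran'); # placeholder for oneshot body
}

done_testing;
```

Running such a script with TEST_LEI_DAEMON_ONLY=1 in the environment
should report the second test as skipped instead of executing it.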
---
 lib/PublicInbox/TestCommon.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 63d45ac3..64fe0499 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -506,6 +506,8 @@ EOM
 		}
 	}; # SKIP for lei_daemon
 	unless ($test_opt->{daemon_only}) {
+		$ENV{TEST_LEI_DAEMON_ONLY} and
+			skip 'TEST_LEI_DAEMON_ONLY set', 1;
 		require_ok 'PublicInbox::LEI';
 		my $home = "$tmpdir/lei-oneshot";
 		mkdir($home, 0700) or BAIL_OUT "mkdir: $!";

^ permalink raw reply related	[relevance 71%]

* [PATCH 5/6] lei: note some TODO items (curl, externals)
  2021-02-10  7:07 71% [PATCH 0/6] more lei stuffs Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-10  7:07 49% ` [PATCH 4/6] lei ls-external: support --local and --remote Eric Wong
@ 2021-02-10  7:07 71% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-10  7:07 UTC (permalink / raw)
  To: meta

I don't know if it's worth it to use libcurl directly
(nor if it's worth the effort to support and maintain tests).
---
 lib/PublicInbox/LeiCurl.pm     | 2 ++
 lib/PublicInbox/LeiExternal.pm | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/lib/PublicInbox/LeiCurl.pm b/lib/PublicInbox/LeiCurl.pm
index f346a1b4..3a79fbf8 100644
--- a/lib/PublicInbox/LeiCurl.pm
+++ b/lib/PublicInbox/LeiCurl.pm
@@ -2,6 +2,8 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # common option and torsocks(1) wrapping for curl(1)
+# Eventually, we may support using libcurl via Inline::C and/or
+# WWW::Curl; but curl(1) is most prevalent and widely-installed.
 package PublicInbox::LeiCurl;
 use strict;
 use v5.10.1;
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index b402eed4..8a51afcb 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -44,6 +44,8 @@ sub ext_canonicalize {
 	}
 }
 
+# TODO: we will probably extract glob2re into a separate module for
+# PublicInbox::Filter::Base and maybe other places
 my %re_map = ( '*' => '[^/]*?', '?' => '[^/]',
 		'[' => '[', ']' => ']', ',' => ',' );
 
@@ -99,6 +101,7 @@ sub get_externals {
 	();
 }
 
+# TODO: does this need JSON output?
 sub lei_ls_external {
 	my ($self, $filter) = @_;
 	my $opt = $self->{opt};

^ permalink raw reply related	[relevance 71%]

* [PATCH 4/6] lei ls-external: support --local and --remote
  2021-02-10  7:07 71% [PATCH 0/6] more lei stuffs Eric Wong
  2021-02-10  7:07 47% ` [PATCH 1/6] lei *external: glob improvements, ls-external filtering Eric Wong
  2021-02-10  7:07 71% ` [PATCH 3/6] test_common: support lei-daemon only testing Eric Wong
@ 2021-02-10  7:07 49% ` Eric Wong
  2021-02-10  7:07 71% ` [PATCH 5/6] lei: note some TODO items (curl, externals) Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-10  7:07 UTC (permalink / raw)
  To: meta

Similar to "lei q", "--local" means only local and "--remote"
means remote only.  I can't think of a reason to have --no-*
variants for these switches.

There are also updates to TestCommon for more common lei
cases.
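
The switch handling can be sketched on its own like this.  The helper
name filter_by_kind and the sample locations are hypothetical; the
scheme regexp is the one used in the patch:

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the patch's rule: a location counts as
# remote when it starts with a URL scheme such as "https://".
sub filter_by_kind {
	my ($opt, @ext) = @_;
	if ($opt->{'local'} && !$opt->{remote}) {
		return grep(!m!\A[a-z\+]+://!, @ext);
	} elsif ($opt->{remote} && !$opt->{'local'}) {
		return grep(m!\A[a-z\+]+://!, @ext);
	}
	@ext; # neither or both switches: no filtering
}

# made-up locations for illustration
my @ext = qw(/srv/inboxes/meta https://example.com/ibx/);
print "local:  @{[ filter_by_kind({ 'local' => 1 }, @ext) ]}\n";
print "remote: @{[ filter_by_kind({ remote => 1 }, @ext) ]}\n";
print "both:   @{[ filter_by_kind({ 'local' => 1, remote => 1 }, @ext) ]}\n";
```

Giving both switches behaves like giving neither, which is why no
--no-* variants are needed.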
---
 lib/PublicInbox/LeiExternal.pm | 12 +++++++++---
 lib/PublicInbox/TestCommon.pm  | 11 ++++++++++-
 t/lei-externals.t              | 36 +++++++++++++++++++++++++---------
 3 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index b4e1918d..b402eed4 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -101,16 +101,22 @@ sub get_externals {
 
 sub lei_ls_external {
 	my ($self, $filter) = @_;
-	my $do_glob = !$self->{opt}->{globoff}; # glob by default
-	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
+	my $opt = $self->{opt};
+	my $do_glob = !$opt->{globoff}; # glob by default
+	my ($OFS, $ORS) = $opt->{z} ? ("\0", "\0\0") : (" ", "\n");
 	$filter //= '*';
 	my $re = $do_glob ? glob2re($filter) : undef;
 	$re //= index($filter, '/') < 0 ?
 			qr!/\Q$filter\E/?\z! : # exact basename match
 			qr/\Q$filter\E/; # grep -F semantics
 	my @ext = externals_each($self, my $boost = {});
-	@ext = $self->{opt}->{'invert-match'} ? grep(!/$re/, @ext)
+	@ext = $opt->{'invert-match'} ? grep(!/$re/, @ext)
 					: grep(/$re/, @ext);
+	if ($opt->{'local'} && !$opt->{remote}) {
+		@ext = grep(!m!\A[a-z\+]+://!, @ext);
+	} elsif ($opt->{remote} && !$opt->{'local'}) {
+		@ext = grep(m!\A[a-z\+]+://!, @ext);
+	}
 	for my $loc (@ext) {
 		$self->out($loc, $OFS, 'boost=', $boost->{$loc}, $ORS);
 	}
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 64fe0499..f5b3fae4 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -9,12 +9,14 @@ use v5.10.1;
 use Fcntl qw(FD_CLOEXEC F_SETFD F_GETFD :seek);
 use POSIX qw(dup2);
 use IO::Socket::INET;
+use File::Spec;
 our @EXPORT;
 BEGIN {
 	@EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
 		have_xapian_compact json_utf8 setup_public_inboxes
-		tcp_host_port test_lei lei $lei $lei_out $lei_err $lei_opt);
+		tcp_host_port test_lei lei lei_ok
+		$lei $lei_out $lei_err $lei_opt);
 	require Test::More;
 	my @methods = grep(!/\W/, @Test::More::EXPORT);
 	eval(join('', map { "*$_=\\&Test::More::$_;" } @methods));
@@ -459,6 +461,13 @@ our $lei = sub {
 
 sub lei (@) { $lei->(@_) }
 
+sub lei_ok (@) {
+	my $msg = ref($_[-1]) ? pop(@_) : undef;
+	# filter out anything that looks like a path name for consistent logs
+	my @msg = grep(!m!\A/!, @_);
+	ok($lei->(@_), "lei @msg". ($msg ? " ($$msg)" : ''));
+}
+
 sub json_utf8 () {
 	state $x = ref(PublicInbox::Config->json)->new->utf8->canonical;
 }
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 28c01174..9fc8bae9 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -35,20 +35,20 @@ test_lei(sub {
 	my $home = $ENV{HOME};
 	my $config_file = "$home/.config/lei/config";
 	my $store_dir = "$home/.local/share/lei";
-	ok($lei->('ls-external'), 'ls-external works');
+	lei_ok 'ls-external', \'ls-external on fresh install';
 	is($lei_out.$lei_err, '', 'ls-external no output, yet');
 	ok(!-e $config_file && !-e $store_dir,
 		'nothing created by ls-external');
 
-	ok(!$lei->('add-external', "$home/nonexistent"),
-		"fails on non-existent dir");
-	ok($lei->('ls-external'), 'ls-external works after add failure');
+	ok(!lei('add-external', "$home/nonexistent"),
+		"fails on non-existent dir");
+	lei_ok('ls-external', \'ls-external works after add failure');
 	is($lei_out.$lei_err, '', 'ls-external still has no output');
 	my $cfg = PublicInbox::Config->new($cfg_path);
 	$cfg->each_inbox(sub {
 		my ($ibx) = @_;
-		ok($lei->(qw(add-external -q), $ibx->{inboxdir}),
-			'added external');
+		lei_ok(qw(add-external -q), $ibx->{inboxdir},
+				\'added external');
 		is($lei_out.$lei_err, '', 'no output');
 	});
 	ok(-s $config_file && -e $store_dir,
@@ -59,12 +59,30 @@ test_lei(sub {
 		is($lcfg->{"external.$ibx->{inboxdir}.boost"}, 0,
 			"configured boost on $ibx->{name}");
 	});
-	$lei->('ls-external');
+	lei_ok 'ls-external';
 	like($lei_out, qr/boost=0\n/s, 'ls-external has output');
-	ok($lei->(qw(add-external -q https://EXAMPLE.com/ibx)), 'add remote');
+	lei_ok qw(add-external -q https://EXAMPLE.com/ibx), \'add remote';
 	is($lei_err, '', 'no warnings after add-external');
 
-	ok($lei->(qw(_complete lei forget-external)), 'complete for externals');
+	{
+		lei_ok qw(ls-external --remote);
+		my $r_only = +{ map { $_ => 1 } split(/^/m, $lei_out) };
+		lei_ok qw(ls-external --local);
+		my $l_only = +{ map { $_ => 1 } split(/^/m, $lei_out) };
+		lei_ok 'ls-external';
+		is_deeply([grep { $l_only->{$_} } keys %$r_only], [],
+			'no locals in --remote');
+		is_deeply([grep { $r_only->{$_} } keys %$l_only], [],
+			'no remotes in --local');
+		my $all = +{ map { $_ => 1 } split(/^/m, $lei_out) };
+		is_deeply($all, { %$r_only, %$l_only },
+				'default output combines remote + local');
+		lei_ok qw(ls-external --remote --local);
+		my $both = +{ map { $_ => 1 } split(/^/m, $lei_out) };
+		is_deeply($all, $both, '--remote --local == no args');
+	}
+
+	lei_ok qw(_complete lei forget-external), \'complete for externals';
 	my %comp = map { $_ => 1 } split(/\s+/, $lei_out);
 	ok($comp{'https://example.com/ibx/'}, 'forget external completion');
 	$cfg->each_inbox(sub {

^ permalink raw reply related	[relevance 49%]

* [PATCH 0/2] WWW + "lei q --stdin": support git approxidate
@ 2021-02-10 19:57 70% Eric Wong
  2021-02-10 19:57 37% ` [PATCH 1/2] search: use git approxidate in WWW and "lei q --stdin" Eric Wong
  2021-02-12  4:34 71% ` [PATCH 0/2] WWW + "lei q --stdin": support git approxidate Kyle Meyer
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-02-10 19:57 UTC (permalink / raw)
  To: meta

1/2 is something I've wanted since 2015.  It could be done in a
less janky way if we didn't have to spawn git-rev-parse(1)
(libgit2 doesn't expose git__date_parse) AND if we didn't need
to support both XS and SWIG Xapian bindings.

But it's stable enough performance-wise for now with a single
git(1) process that I don't worry about making it public-facing
in WWW.

I'm not completely sure about 2/2, but I figure lei maintaining
consistency with WWW is slightly more important than consistency
with git (and git doesn't use mairix-like prefixes :P).
I also don't want to muck around too much with how Xapian does
quoted phrases.

Eric Wong (2):
  search: use git approxidate in WWW and "lei q --stdin"
  search: disallow spaces in argv approxidate queries

 lib/PublicInbox/Isearch.pm    |  1 +
 lib/PublicInbox/LeiQuery.pm   |  8 +++++++-
 lib/PublicInbox/Mbox.pm       |  1 +
 lib/PublicInbox/Search.pm     | 37 ++++++++++++++++++++++++++---------
 lib/PublicInbox/SearchView.pm |  3 ++-
 t/lei-externals.t             |  2 +-
 t/psgi_search.t               | 37 ++++++++++++++++++++---------------
 t/search.t                    | 25 +++++++++++++++++++++++
 8 files changed, 86 insertions(+), 28 deletions(-)

^ permalink raw reply	[relevance 70%]

* [PATCH 1/2] search: use git approxidate in WWW and "lei q --stdin"
  2021-02-10 19:57 70% [PATCH 0/2] WWW + "lei q --stdin": support git approxidate Eric Wong
@ 2021-02-10 19:57 37% ` Eric Wong
  2021-02-12  4:34 71% ` [PATCH 0/2] WWW + "lei q --stdin": support git approxidate Kyle Meyer
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-02-10 19:57 UTC (permalink / raw)
  To: meta

This greatly improves the usability of d:, dt:, and rt: search
prefixes for users already familiar with git's "approxidate" feature.

That is, users familiar with the --(since|after|until|before)=
options in git-log(1) and similar commands will be able to use
those dates in the WWW UI.
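
A rough sketch of the idea: rewrite date terms in a query string while
leaving quoted phrases alone.  To stay self-contained it does not call
git at all; fake_date is a made-up stand-in that only understands
YYYY-MM-DD, and only the "d:" prefix and ASCII double quotes are
handled.  Unlike the real code, a single date is not expanded into a
range:

```perl
use strict;
use warnings;

# Stand-in for git approxidate: only understands YYYY-MM-DD and turns it
# into YYYYMMDD; anything else is passed through untouched.
sub fake_date {
	my ($s) = @_;
	$s =~ /\A(\d{4})-(\d{2})-(\d{2})\z/ ? "$1$2$3" : $s;
}

# Rewrite "d:" terms in $qs, but leave anything inside "double quotes"
# alone since those are phrases.
sub rewrite_dates {
	my ($qs) = @_;
	$qs =~ s{([^"]*)("[^"]*")?}{
		my ($terms, $phrase) = ($1, $2);
		$terms =~ s!\bd:(\S+)!
			'd:'.join('..', map { fake_date($_) } split(/\.\./, $1, -1))
		!ge;
		$terms.($phrase // '');
	}ge;
	$qs;
}

print rewrite_dates('f:bob "hello world" d:1993-10-02..2010-10-02'), "\n";
print rewrite_dates('f:bob "d:1993-10-02..2010-10-02"'), "\n";
```

The first query has its trailing date range rewritten; the second is
returned unchanged because the date term sits inside a phrase.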
---
 lib/PublicInbox/Isearch.pm    |  1 +
 lib/PublicInbox/LeiQuery.pm   |  8 +++++++-
 lib/PublicInbox/Mbox.pm       |  1 +
 lib/PublicInbox/Search.pm     | 35 +++++++++++++++++++++++++--------
 lib/PublicInbox/SearchView.pm |  3 ++-
 t/lei-externals.t             |  2 +-
 t/psgi_search.t               | 37 ++++++++++++++++++++---------------
 t/search.t                    | 25 +++++++++++++++++++++++
 8 files changed, 85 insertions(+), 27 deletions(-)

diff --git a/lib/PublicInbox/Isearch.pm b/lib/PublicInbox/Isearch.pm
index 342d7913..9ed2d9e5 100644
--- a/lib/PublicInbox/Isearch.pm
+++ b/lib/PublicInbox/Isearch.pm
@@ -25,6 +25,7 @@ SELECT ibx_id FROM inboxes WHERE eidx_key = ? LIMIT 1
 		die "E: `$self->{eidx_key}' not in $self->{es}->{topdir}\n";
 }
 
+sub query_approxidate { $_[0]->{es}->query_approxidate($_[1], $_[2]) }
 
 sub mset {
 	my ($self, $str, $opt) = @_;
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index d637b1ae..f71beae6 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -14,7 +14,12 @@ sub prep_ext { # externals_each callback
 sub qstr_add { # for --stdin
 	my ($self) = @_; # $_[1] = $rbuf
 	if (defined($_[1])) {
-		return eval { $self->{lxs}->do_query($self) } if $_[1] eq '';
+		$_[1] eq '' and return eval {
+			my $lse = delete $self->{lse};
+			$lse->query_approxidate($lse->git,
+						$self->{mset_opt}->{qstr});
+			$self->{lxs}->do_query($self);
+		};
 		$self->{mset_opt}->{qstr} .= $_[1];
 	} else {
 		$self->fail("error reading stdin: $!");
@@ -105,6 +110,7 @@ sub lei_q {
 no query allowed on command-line with --stdin
 
 		require PublicInbox::InputPipe;
+		$self->{lse} = $lse; # for query_approxidate
 		PublicInbox::InputPipe::consume($self->{0}, \&qstr_add, $self);
 		return;
 	}
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 94f733bc..844099aa 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -237,6 +237,7 @@ sub mbox_all {
 
 	my $qopts = $ctx->{qopts} = { relevance => -2 }; # ORDER BY docid DESC
 	$qopts->{threads} = 1 if $q->{t};
+	$srch->query_approxidate($ctx->{ibx}->git, $q_string);
 	my $mset = $srch->mset($q_string, $qopts);
 	$qopts->{offset} = $mset->size or
 			return [404, [qw(Content-Type text/plain)],
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index b3fd532d..8e4cce33 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -321,6 +321,16 @@ sub date_parse_prepare {
 	"$pfx:".join('..', @r).$end;
 }
 
+sub date_parse_finalize {
+	my ($git, $to_parse) = @_;
+	# git-rev-parse can handle any number of args up to system
+	# limits (around (4096*32) bytes on Linux).
+	my @r = $git->date_parse(@$to_parse);
+	my $i;
+	$_[2] =~ s/\0(%[%YmdHMSs]+)([0-9\+]+)\0/strftime($1,
+		gmtime($2 eq '+' ? ($r[$i]+86400) : $r[$i=$2+0]))/sge;
+}
+
 # n.b. argv never has NUL, though we'll need to filter it out
 # if this $argv isn't from a command execution
 sub query_argv_to_string {
@@ -336,17 +346,26 @@ sub query_argv_to_string {
 			$_
 		}
 	} @$argv);
-	# git-rev-parse can handle any number of args up to system
-	# limits (around (4096*32) bytes on Linux).
-	if ($to_parse) {
-		my @r = $git->date_parse(@$to_parse);
-		my $i;
-		$tmp =~ s/\0(%[%YmdHMSs]+)([0-9\+]+)\0/strftime($1,
-			gmtime($2 eq '+' ? ($r[$i]+86400) : $r[$i=$2+0]))/sge;
-	}
+	date_parse_finalize($git, $to_parse, $tmp) if $to_parse;
 	$tmp
 }
 
+# this is for the WWW "q=" query parameter and "lei q --stdin"
+# it can't do d:"5 days ago", but it will do d:5.days.ago
+sub query_approxidate {
+	my (undef, $git) = @_; # $_[2] = $query_string (modified in-place)
+	my $DQ = qq<"\x{201c}\x{201d}>; # Xapian can use curly quotes
+	$_[2] =~ tr/\x00/ /; # Xapian doesn't do NUL, we use it as a placeholder
+	my ($terms, $phrase, $to_parse);
+	$_[2] =~ s{([^$DQ]*)([${DQ}][^\"]*[$DQ])?}{
+		($terms, $phrase) = ($1, $2);
+		$terms =~ s!\b(d|rt|dt):(\S+)!
+			date_parse_prepare($to_parse //= [], $1, $2)!sge;
+		$terms.($phrase // '');
+		}sge;
+	date_parse_finalize($git, $to_parse, $_[2]) if $to_parse;
+}
+
 # read-only
 sub mset {
 	my ($self, $query_string, $opts) = @_;
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index 08c77f35..2d0b8e13 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -34,7 +34,6 @@ sub sres_top_html {
 		return PublicInbox::WWW::need($ctx, 'Search');
 	my $q = PublicInbox::SearchQuery->new($ctx->{qp});
 	my $x = $q->{x};
-	my $query = $q->{'q'};
 	my $o = $q->{o};
 	my $asc;
 	if ($o < 0) {
@@ -54,6 +53,8 @@ sub sres_top_html {
 	my ($mset, $total, $err, $html);
 retry:
 	eval {
+		my $query = $q->{'q'};
+		$srch->query_approxidate($ctx->{ibx}->git, $query);
 		$mset = $srch->mset($query, $opts);
 		$total = $mset->get_matches_estimated;
 	};
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 9fc8bae9..f61b7e52 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -151,7 +151,7 @@ SKIP: {
 	{
 		open my $fh, '+>', undef or BAIL_OUT $!;
 		$fh->autoflush(1);
-		print $fh 's:use' or BAIL_OUT $!;
+		print $fh 's:use d:..5.days.from.now' or BAIL_OUT $!;
 		seek($fh, 0, SEEK_SET) or BAIL_OUT $!;
 		ok($lei->([qw(q -q --stdin)], undef, { %$lei_opt, 0 => $fh }),
 				'--stdin on regular file works');
diff --git a/t/psgi_search.t b/t/psgi_search.t
index 8ba431bc..514df005 100644
--- a/t/psgi_search.t
+++ b/t/psgi_search.t
@@ -74,20 +74,25 @@ EOF
 my $www = PublicInbox::WWW->new($cfg);
 test_psgi(sub { $www->call(@_) }, sub {
 	my ($cb) = @_;
-	my $res;
-	$res = $cb->(GET('/test/?q=%C3%86var'));
-	my $html = $res->content;
-	like($html, qr/<title>&#198;var - /, 'HTML escaped in title');
-	my @res = ($html =~ m/\?q=(.+var)\b/g);
-	ok(scalar(@res), 'saw query strings');
-	my %uniq = map { $_ => 1 } @res;
-	is(1, scalar keys %uniq, 'all query values identical in HTML');
-	is('%C3%86var', (keys %uniq)[0], 'matches original query');
-	ok(index($html, 'by &#198;var Arnfj&#246;r&#240; Bjarmason') >= 0,
-		"displayed Ævar's name properly in HTML");
-
-	like($html, qr/download mbox\.gz: .*?"full threads"/s,
-		'"full threads" download option shown');
+	my ($html, $res);
+	my $approxidate = '1.hour.from.now';
+	for my $req ('/test/?q=%C3%86var', '/test/?q=%25C3%2586var') {
+		$res = $cb->(GET($req."+d:..$approxidate"));
+		$html = $res->content;
+		like($html, qr/<title>&#198;var d:\.\.\Q$approxidate\E/,
+			'HTML escaped in title, "d:..$APPROXIDATE" preserved');
+		my @res = ($html =~ m/\?q=(.+var)\+d:\.\.\Q$approxidate\E/g);
+		ok(scalar(@res), 'saw query strings');
+		my %uniq = map { $_ => 1 } @res;
+		is(1, scalar keys %uniq, 'all query values identical in HTML');
+		is('%C3%86var', (keys %uniq)[0], 'matches original query');
+		ok(index($html, 'by &#198;var Arnfj&#246;r&#240; Bjarmason')
+			>= 0, "displayed Ævar's name properly in HTML");
+		like($html, qr/download mbox\.gz: .*?"full threads"/s,
+			'"full threads" download option shown');
+	}
+	like($html, qr/Initial query\b.*?returned no.results, used:.*instead/s,
+		'noted retry on double-escaped query {-uxs_retried}');
 
 	my $warn = [];
 	local $SIG{__WARN__} = sub { push @$warn, @_ };
@@ -130,7 +135,7 @@ test_psgi(sub { $www->call(@_) }, sub {
 		qr/filename=no-subject\.mbox\.gz/);
 
 	# "full threads" mbox.gz download
-	$res = $cb->(POST('/test/?q=s:test&x=m&t'));
+	$res = $cb->(POST('/test/?q=s:test+d:..1.hour.from.now&x=m&t'));
 	is($res->code, 200, 'successful mbox download with threads');
 	gunzip(\($res->content) => \(my $before));
 	is_deeply([ "Message-ID: <$mid>\n", "Message-ID: <reply\@asdf>\n" ],
@@ -151,7 +156,7 @@ test_psgi(sub { $www->call(@_) }, sub {
 		'"full threads" download option not shown w/o has_threadid');
 
 	# in case somebody uses curl to bypass <form>
-	$res = $cb->(POST('/test/?q=s:test&x=m&t'));
+	$res = $cb->(POST("/test/?q=s:test+d:..$approxidate&x=m&t"));
 	is($res->code, 200, 'successful mbox download w/ threads');
 	gunzip(\($res->content) => \(my $after));
 	isnt($before, $after);
diff --git a/t/search.t b/t/search.t
index bcfe91f5..77081231 100644
--- a/t/search.t
+++ b/t/search.t
@@ -583,6 +583,31 @@ SKIP: {
 	$q = $s->query_argv_to_string($g, [qw{OR (rt:1993-10-02)}]);
 	like($q, qr/\AOR \(rt:749\d{6}\.\.749\d{6}\)\z/,
 		'trailing parentheses preserved');
+
+	my $qs = qq[f:bob rt:1993-10-02..2010-10-02];
+	$s->query_approxidate($g, $qs);
+	like($qs, qr/\Af:bob rt:749\d{6}\.\.1286\d{6}\z/,
+		'no phrases, no problem');
+
+	my $orig = $qs = qq[f:bob "d:1993-10-02..2010-10-02"];
+	$s->query_approxidate($g, $qs);
+	is($qs, $orig, 'phrase preserved');
+
+	$orig = $qs = qq[f:bob "d:1993-10-02..2010-10-02 "] .
+			qq["dt:1993-10-02..2010-10-02 " \x{201c}];
+	$s->query_approxidate($g, $qs);
+	is($qs, $orig, 'phrase preserved even with escaped ""');
+
+	$orig = $qs = qq[f:bob "hello world" d:1993-10-02..2010-10-02];
+	$s->query_approxidate($g, $qs);
+	is($qs, qq[f:bob "hello world" d:19931002..20101002],
+		'post-phrase date corrected');
+
+	my $x_days_ago = strftime('%Y%m%d', gmtime(time - (5 * 86400)));
+	$orig = $qs = qq[broken d:5.days.ago..];
+	$s->query_approxidate($g, $qs);
+	is($qs, qq[broken d:$x_days_ago..], 'date.phrase.with.dots');
+
 	$ENV{TEST_EXPENSIVE} or
 		skip 'TEST_EXPENSIVE not set for argv overflow check', 1;
 	my @w;

^ permalink raw reply related	[relevance 37%]

* [PATCH 0/4] doc: lei manpages, round 2
@ 2021-02-11  4:04 71% Kyle Meyer
  2021-02-11  4:04 71% ` [PATCH 1/4] doc: lei q: use 'mfolder' as --output placeholder Kyle Meyer
                   ` (4 more replies)
  0 siblings, 5 replies; 200+ results
From: Kyle Meyer @ 2021-02-11  4:04 UTC (permalink / raw)
  To: meta

This series updates the lei manpages, continuing from
<20210201055704.26683-1-kyle@kyleam.com>.  It covers changes up to the
current tip of master (e49cf9c629c..7a1fe192b9f).

  [1/4] doc: lei q: use 'mfolder' as --output placeholder
  [2/4] doc: lei: prefer 'location' and 'dirname'
  [3/4] doc: add lei-import(1)
  [4/4] doc: lei: update manpages

 Documentation/lei-add-external.pod    | 66 +++++++++++++++++++++++++--
 Documentation/lei-forget-external.pod |  2 +-
 Documentation/lei-import.pod          | 54 ++++++++++++++++++++++
 Documentation/lei-init.pod            |  4 +-
 Documentation/lei-ls-external.pod     | 17 ++++++-
 Documentation/lei-overview.pod        | 17 ++++++-
 Documentation/lei-q.pod               | 46 ++++++++++++++++++-
 Documentation/lei.pod                 |  4 +-
 Documentation/txt2pre                 |  1 +
 MANIFEST                              |  1 +
 Makefile.PL                           |  2 +-
 11 files changed, 199 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/lei-import.pod


base-commit: 7a1fe192b9f63f057a21cb60c5e0e85b2ca34d50
-- 
2.30.0


^ permalink raw reply	[relevance 71%]

* [PATCH 1/4] doc: lei q: use 'mfolder' as --output placeholder
  2021-02-11  4:04 71% [PATCH 0/4] doc: lei manpages, round 2 Kyle Meyer
@ 2021-02-11  4:04 71% ` Kyle Meyer
  2021-02-11  4:04 65% ` [PATCH 2/4] doc: lei: prefer 'location' and 'dirname' Kyle Meyer
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-11  4:04 UTC (permalink / raw)
  To: meta

'mfolder' is familiar to mairix users, and 'path' isn't a good choice
because support will be added for IMAP.

Link: https://public-inbox.org/meta/YCBh62OqkYnr5cqw@dcvr
---
 Documentation/lei-q.pod | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 8f053a55..405cf48f 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -16,7 +16,7 @@ TODO: Give common prefixes, or at least a description/reference.
 
 =over
 
-=item -o PATH, --output=PATH, --mfolder=PATH
+=item -o MFOLDER, --output=MFOLDER, --mfolder=MFOLDER
 
 Destination for results (e.g., C<path/to/Maildir> or - for stdout).
 
-- 
2.30.0


^ permalink raw reply related	[relevance 71%]

* [PATCH 2/4] doc: lei: prefer 'location' and 'dirname'
  2021-02-11  4:04 71% [PATCH 0/4] doc: lei manpages, round 2 Kyle Meyer
  2021-02-11  4:04 71% ` [PATCH 1/4] doc: lei q: use 'mfolder' as --output placeholder Kyle Meyer
@ 2021-02-11  4:04 65% ` Kyle Meyer
  2021-02-11  4:04 52% ` [PATCH 3/4] doc: add lei-import(1) Kyle Meyer
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-11  4:04 UTC (permalink / raw)
  To: meta

This follows the help output change in 52342875 (lei help: split out
into separate file, 2021-02-06).
---
 Documentation/lei-add-external.pod    | 4 ++--
 Documentation/lei-forget-external.pod | 2 +-
 Documentation/lei-init.pod            | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/lei-add-external.pod b/Documentation/lei-add-external.pod
index dd87be62..ebefb4cf 100644
--- a/Documentation/lei-add-external.pod
+++ b/Documentation/lei-add-external.pod
@@ -4,12 +4,12 @@ lei-add-external - add inbox or external index
 
 =head1 SYNOPSIS
 
-lei add-external [OPTIONS] URL_OR_PATHNAME
+lei add-external [OPTIONS] LOCATION
 
 =head1 DESCRIPTION
 
 Configure lei to search against an external (an inbox or external
-index).  When C<URL_OR_PATHNAME> is a local path, it should point to a
+index).  When C<LOCATION> is a local path, it should point to a
 directory that is a C<public.<name>.inboxdir> or
 C<extindex.<name>.topdir> value in ~/.public-inbox/config.
 
diff --git a/Documentation/lei-forget-external.pod b/Documentation/lei-forget-external.pod
index 40287bd3..3ad6bd45 100644
--- a/Documentation/lei-forget-external.pod
+++ b/Documentation/lei-forget-external.pod
@@ -4,7 +4,7 @@ lei-forget-external - forget external locations
 
 =head1 SYNOPSIS
 
-lei forget-external [OPTIONS] URL_OR_PATHNAME [URL_OR_PATHNAME...]
+lei forget-external [OPTIONS] LOCATION [LOCATION...]
 
 =head1 DESCRIPTION
 
diff --git a/Documentation/lei-init.pod b/Documentation/lei-init.pod
index 8a8022fb..bc687f72 100644
--- a/Documentation/lei-init.pod
+++ b/Documentation/lei-init.pod
@@ -4,11 +4,11 @@ lei-init - initialize storage
 
 =head1 SYNOPSIS
 
-lei init [OPTIONS] [PATHNAME]
+lei init [OPTIONS] [DIRNAME]
 
 =head1 DESCRIPTION
 
-Initialize local writable storage for L<lei(1)>.  If C<PATHNAME> is
+Initialize local writable storage for L<lei(1)>.  If C<DIRNAME> is
 unspecified, the storage is created at C<$XDG_DATA_HOME/lei/store>.
 C<leistore.dir> in C<$XDG_CONFIG_HOME/lei/config> records this
 location.
-- 
2.30.0


^ permalink raw reply related	[relevance 65%]

* [PATCH 3/4] doc: add lei-import(1)
  2021-02-11  4:04 71% [PATCH 0/4] doc: lei manpages, round 2 Kyle Meyer
  2021-02-11  4:04 71% ` [PATCH 1/4] doc: lei q: use 'mfolder' as --output placeholder Kyle Meyer
  2021-02-11  4:04 65% ` [PATCH 2/4] doc: lei: prefer 'location' and 'dirname' Kyle Meyer
@ 2021-02-11  4:04 52% ` Kyle Meyer
  2021-02-11  4:04 44% ` [PATCH 4/4] doc: lei: update manpages Kyle Meyer
  2021-02-11  5:08 71% ` [PATCH 0/4] doc: lei manpages, round 2 Eric Wong
  4 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-11  4:04 UTC (permalink / raw)
  To: meta

---
 Documentation/lei-add-external.pod |  2 +-
 Documentation/lei-import.pod       | 54 ++++++++++++++++++++++++++++++
 Documentation/lei-overview.pod     | 10 +++++-
 Documentation/lei.pod              |  4 +--
 Documentation/txt2pre              |  1 +
 MANIFEST                           |  1 +
 Makefile.PL                        |  2 +-
 7 files changed, 69 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/lei-import.pod

diff --git a/Documentation/lei-add-external.pod b/Documentation/lei-add-external.pod
index ebefb4cf..1be3f905 100644
--- a/Documentation/lei-add-external.pod
+++ b/Documentation/lei-add-external.pod
@@ -44,6 +44,6 @@ License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
 
 =head1 SEE ALSO
 
-L<lei-forget-external(1)>, L<lei-ls-external(1)>,
+L<lei-forget-external(1)>, L<lei-ls-external(1)>, L<lei-import(1)>,
 L<public-inbox-index(1)>, L<public-inbox-extindex(1)>,
 L<public-inbox-extindex-format(5)>
diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
new file mode 100644
index 00000000..14ca2d45
--- /dev/null
+++ b/Documentation/lei-import.pod
@@ -0,0 +1,54 @@
+=head1 NAME
+
+lei-import - one-time import of messages into local store
+
+=head1 SYNOPSIS
+
+lei import [OPTIONS] LOCATION [LOCATION...]
+
+lei import [OPTIONS] --stdin
+
+=head1 DESCRIPTION
+
+Import messages into the local storage of L<lei(1)>.  C<LOCATION> is a
+source of messages: a directory (Maildir) or a file (whose format is
+specified via C<--format>).
+
+TODO: Update when URL support is added.
+
+=head1 OPTIONS
+
+=over
+
+=item -f MAIL_FORMAT, --format=MAIL_FORMAT
+
+Message input format: C<eml>, C<mboxrd>, C<mboxcl2>, C<mboxcl>,
+C<mboxo>.
+
+=item --stdin
+
+Read messages from stdin.
+
+=item --no-kw, --no-keywords, --no-flags
+
+Don't import message keywords (or "flags" in IMAP terminology).
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+
+=head1 SEE ALSO
+
+L<lei-add-external(1)>
diff --git a/Documentation/lei-overview.pod b/Documentation/lei-overview.pod
index d1903045..33ddb528 100644
--- a/Documentation/lei-overview.pod
+++ b/Documentation/lei-overview.pod
@@ -12,7 +12,15 @@ provides some basic examples.
 L<lei-init(1)> initializes writable local storage based on
 L<public-inbox-v2-format(5)>.
 
-TODO: Extend when lei-import and friends are added.
+=head2 EXAMPLES
+
+=over
+
+=item $ lei import --format=mboxrd t.mbox
+
+Import the messages from an mbox into the local storage.
+
+=back
 
 =head1 EXTERNALS
 
diff --git a/Documentation/lei.pod b/Documentation/lei.pod
index e12a157d..9ce9e9a4 100644
--- a/Documentation/lei.pod
+++ b/Documentation/lei.pod
@@ -27,9 +27,9 @@ Subcommands for initializing and managing local, writable storage:
 
 =item * L<lei-init(1)>
 
-=back
+=item * L<lei-import(1)>
 
-TODO: Add commands like lei-import once they're implemented.
+=back
 
 The following subcommands can be used to manage and inspect external
 locations:
diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index 604490ef..8421cad7 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -16,6 +16,7 @@ for (qw[lei(1)
 	lei-daemon-kill(1)
 	lei-daemon-pid(1)
 	lei-forget-external(1)
+	lei-import(1)
 	lei-init(1)
 	lei-ls-external(1)
 	lei-overview(7)
diff --git a/MANIFEST b/MANIFEST
index 92226d5a..1794d930 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -26,6 +26,7 @@ Documentation/lei-config.pod
 Documentation/lei-daemon-kill.pod
 Documentation/lei-daemon-pid.pod
 Documentation/lei-forget-external.pod
+Documentation/lei-import.pod
 Documentation/lei-init.pod
 Documentation/lei-ls-external.pod
 Documentation/lei-overview.pod
diff --git a/Makefile.PL b/Makefile.PL
index 6fb0d560..89f1774e 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -45,7 +45,7 @@ $v->{-m1} = [ map {
 	} @EXE_FILES,
 	qw(
 	lei-add-external lei-config lei-daemon-kill lei-daemon-pid
-	lei-forget-external lei-init lei-ls-external lei-q)];
+	lei-forget-external lei-import lei-init lei-ls-external lei-q)];
 $v->{-m5} = [ qw(public-inbox-config public-inbox-v1-format
 		public-inbox-v2-format public-inbox-extindex-format) ];
 $v->{-m7} = [ qw(lei-overview public-inbox-overview public-inbox-tuning) ];
-- 
2.30.0


^ permalink raw reply related	[relevance 52%]

* [PATCH 4/4] doc: lei: update manpages
  2021-02-11  4:04 71% [PATCH 0/4] doc: lei manpages, round 2 Kyle Meyer
                   ` (2 preceding siblings ...)
  2021-02-11  4:04 52% ` [PATCH 3/4] doc: add lei-import(1) Kyle Meyer
@ 2021-02-11  4:04 44% ` Kyle Meyer
  2021-02-11  5:08 71% ` [PATCH 0/4] doc: lei manpages, round 2 Eric Wong
  4 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-11  4:04 UTC (permalink / raw)
  To: meta

Catch up with recent developments.
---
 Documentation/lei-add-external.pod | 62 +++++++++++++++++++++++++++++-
 Documentation/lei-ls-external.pod  | 17 +++++++-
 Documentation/lei-overview.pod     |  7 +++-
 Documentation/lei-q.pod            | 44 +++++++++++++++++++++
 4 files changed, 125 insertions(+), 5 deletions(-)

diff --git a/Documentation/lei-add-external.pod b/Documentation/lei-add-external.pod
index 1be3f905..3bc0ba83 100644
--- a/Documentation/lei-add-external.pod
+++ b/Documentation/lei-add-external.pod
@@ -9,12 +9,14 @@ lei add-external [OPTIONS] LOCATION
 =head1 DESCRIPTION
 
 Configure lei to search against an external (an inbox or external
-index).  When C<LOCATION> is a local path, it should point to a
-directory that is a C<public.<name>.inboxdir> or
+index).  When C<LOCATION> is an existing local path, it should point
+to a directory that is a C<public.<name>.inboxdir> or
 C<extindex.<name>.topdir> value in ~/.public-inbox/config.
 
 =head1 OPTIONS
 
+TODO: mention curl options?
+
 =over
 
 =item --boost=NUMBER
@@ -23,6 +25,62 @@ Set priority of a new or existing location.
 
 Default: 0
 
+=item --mirror=URL
+
+Create C<LOCATION> by mirroring the public-inbox at C<URL>.
+
+=item -v, --verbose
+
+Provide more feedback on stderr.
+
+=item -q, --quiet
+
+Suppress feedback messages.
+
+=back
+
+=head2 MIRRORING
+
+=over
+
+=item --torsocks=auto|no|yes, --no-torsocks
+
+Whether to wrap L<git(1)> and L<curl(1)> commands with torsocks.
+
+Default: C<auto>
+
+=item --inbox-version=NUM
+
+Force a public-inbox version (must be C<1> or C<2>).
+
+=back
+
+The following options are passed to L<public-inbox-init(1)>:
+
+=over
+
+=item -j JOBS, --jobs=JOBS
+
+=item -L LEVEL, --indexlevel=LEVEL
+
+=back
+
+The following options are passed to L<public-inbox-index(1)>:
+
+=over
+
+=item --batch-size=SIZE
+
+=item --compact
+
+=item -j JOBS, --jobs=JOBS
+
+=item --max-size=SIZE
+
+=item --sequential-shard
+
+=item --skip-docdata
+
 =back
 
 =head1 FILES
diff --git a/Documentation/lei-ls-external.pod b/Documentation/lei-ls-external.pod
index 1735faa9..85d951f0 100644
--- a/Documentation/lei-ls-external.pod
+++ b/Documentation/lei-ls-external.pod
@@ -4,16 +4,29 @@ lei-ls-external - list inbox and external index locations
 
 =head1 SYNOPSIS
 
-lei ls-external [OPTIONS]
+lei ls-external [OPTIONS] [FILTER]
 
 =head1 DESCRIPTION
 
-List configured externals.
+List configured externals.  If C<FILTER> is given, restrict the output
+to matching entries.
 
 =head1 OPTIONS
 
 =over
 
+=item -g, --globoff
+
+Do not match C<FILTER> using C<*?> wildcards and C<[]> ranges.
+
+=item --local
+
+Limit operations to the local filesystem.
+
+=item --remote
+
+Limit operations to those requiring network access.
+
 =item -z, -0
 
 Use C<\0> (NUL) instead of newline (CR) to delimit lines.
diff --git a/Documentation/lei-overview.pod b/Documentation/lei-overview.pod
index 33ddb528..840d011b 100644
--- a/Documentation/lei-overview.pod
+++ b/Documentation/lei-overview.pod
@@ -27,7 +27,7 @@ Import the messages from an mbox into the local storage.
 In addition to the above store, lei can make read-only queries to
 "externals": inboxes and external indices.  An external can be
 registered by passing a URL or local path to L<lei-add-external(1)>.
-For local paths, the external needs to be indexed with
+For existing local paths, the external needs to be indexed with
 L<public-inbox-index(1)> (in the case of a regular inbox) or
 L<public-inbox-extindex(1)> (in the case of an external index).
 
@@ -39,6 +39,11 @@ L<public-inbox-extindex(1)> (in the case of an external index).
 
 Add a remote external for public-inbox's inbox.
 
+=item $ lei add-external --mirror https://public-inbox.org/meta/ path
+
+Clone L<https://public-inbox.org/meta/> to C<path>, index it with
+L<public-inbox-index(1)>, and add it as a local external.
+
 =back
 
 =head1 SEARCHING
diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 405cf48f..c8df6fc7 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -6,6 +6,8 @@ lei-q - search for messages matching terms
 
 lei q [OPTIONS] TERM [TERM...]
 
+lei q [OPTIONS] --stdin
+
 =head1 DESCRIPTION
 
 Search for messages across the lei store and externals.
@@ -14,8 +16,14 @@ TODO: Give common prefixes, or at least a description/reference.
 
 =head1 OPTIONS
 
+TODO: mention curl options?
+
 =over
 
+=item --stdin
+
+Read search terms from stdin.
+
 =item -o MFOLDER, --output=MFOLDER, --mfolder=MFOLDER
 
 Destination for results (e.g., C<path/to/Maildir> or - for stdout).
@@ -43,6 +51,18 @@ For a subset of MUAs known to accept a mailbox via C<-f>, COMMAND can
 be abbreviated to the name of the program: C<mutt>, C<mailx>, C<mail>,
 or C<neomutt>.
 
+=item --alert=COMMAND[,COMMAND...]
+
+Run C<COMMAND> after writing to output.  C<:WINCH> indicates to send
+C<SIGWINCH> to the C<--mua> process.  C<:bell> indicates to print a
+bell code.  Any other value is interpreted as a command to execute as
+is.
+
+This option may be given multiple times.
+
+Default: C<:WINCH,:bell> when C<--mua> is specified and C<--output>
+doesn't point to stdout, nothing otherwise.
+
 =item -a, --augment
 
 Augment output destination instead of clobbering it.
@@ -74,6 +94,26 @@ Limit operations to those requiring network access.
 
 Don't include results from externals.
 
+=item -I LOCATION, --include=LOCATION
+
+Include specified external in search.  This option may be given
+multiple times.
+
+=item --exclude=LOCATION
+
+Exclude specified external from search.  This option may be given
+multiple times.
+
+=item --only=LOCATION
+
+Use only the specified external for search.  This option may be given
+multiple times, in which case the search uses only the specified set.
+
+=item -g, --globoff
+
+Do not match locations using C<*?> wildcards and C<[]> ranges.  This
+option applies to C<--include>, C<--exclude>, and C<--only>.
+
 =item -NUMBER, -n NUMBER, --limit=NUMBER
 
 Limit the number of matches.
@@ -101,6 +141,10 @@ Default: C<received>
 
 Provide more feedback on stderr.
 
+=item -q, --quiet
+
+Suppress feedback messages.
+
 =item --torsocks=auto|no|yes, --no-torsocks
 
 Whether to wrap L<git(1)> and L<curl(1)> commands with torsocks.
-- 
2.30.0


^ permalink raw reply related	[relevance 44%]

* Re: [PATCH 0/4] doc: lei manpages, round 2
  2021-02-11  4:04 71% [PATCH 0/4] doc: lei manpages, round 2 Kyle Meyer
                   ` (3 preceding siblings ...)
  2021-02-11  4:04 44% ` [PATCH 4/4] doc: lei: update manpages Kyle Meyer
@ 2021-02-11  5:08 71% ` Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-11  5:08 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Thanks, pushed as f310a5054fb8e215885f0b48afac44ff32ca1d56
to https://80x24.org/public-inbox.git

^ permalink raw reply	[relevance 71%]

* Re: [PATCH 0/2] WWW + "lei q --stdin": support git approxidate
  2021-02-10 19:57 70% [PATCH 0/2] WWW + "lei q --stdin": support git approxidate Eric Wong
  2021-02-10 19:57 37% ` [PATCH 1/2] search: use git approxidate in WWW and "lei q --stdin" Eric Wong
@ 2021-02-12  4:34 71% ` Kyle Meyer
  1 sibling, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-12  4:34 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> 1/2 is something I've wanted since 2015.  It could be done in a
> less janky way if we didn't have to spawn git-rev-parse(1)
> (libgit2 doesn't expose git__date_parse) AND if we didn't need
> to support both XS and SWIG Xapian bindings.
>
> But it's stable enough performance-wise for now with a single
> git(1) process that I don't worry about making it public-facing
> in WWW.

Very neat :)  I bet I'll end up using this a lot.

^ permalink raw reply	[relevance 71%]

* [PATCH] lei: fail_handler: use correct exit code
@ 2021-02-15  7:43 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-15  7:43 UTC (permalink / raw)
  To: meta

We were shifting in the wrong direction :x
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index eb5a646e..aa14ca6f 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -369,7 +369,7 @@ sub fail_handler ($;$$) {
 		$wq->wq_wait_old(undef, $lei) if $wq->wq_kill_old; # lei-daemon
 	}
 	close($io) if $io; # needed to avoid warnings on SIGPIPE
-	$lei->x_it($code // (1 >> 8));
+	x_it($lei, $code // (1 << 8));
 }
 
 sub sigpipe_handler { # handles SIGPIPE from @WQ_KEYS workers

^ permalink raw reply related	[relevance 71%]
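
For readers following the fix above, here is a minimal sketch of why the
shift direction matters.  It assumes the code handed to x_it uses the
same layout as Perl's $? after wait/system, where the exit status sits
in the high byte; to_wait_status and exit_status are hypothetical
helpers for illustration and are not part of LEI.pm.

```perl
use strict;
use warnings;

# Hypothetical helpers (not from LEI.pm): pack and unpack an exit
# status using the same layout as Perl's $? after wait/system.
sub to_wait_status { my ($exit) = @_; ($exit & 0xff) << 8 }
sub exit_status { my ($status) = @_; $status >> 8 }

my $wrong = 1 >> 8;            # shifts the only set bit away
my $right = to_wait_status(1); # exit status 1 in the high byte

printf "wrong=%d right=%d decoded=%d\n",
	$wrong, $right, exit_status($right);
# prints "wrong=0 right=256 decoded=1"
```

With the right shift, the fallback silently became 0, which a caller
decoding the status would read as success.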

* does "lei q" --format/-f need to exist?
@ 2021-02-17  4:40 71% Eric Wong
  2021-02-18  5:28 71% ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-17  4:40 UTC (permalink / raw)
  To: meta

"maildir:/path/to/dir" has been supported by public-inbox-watch
for years, now.

The following all work today:

	lei q -o mboxrd:/tmp/foo.mboxrd ...
	lei q -o mboxcl2:/tmp/foo.mboxcl2 ...
	lei q -o maildir:/tmp/foo/ ...

So -f/--format seems redundant.  I'm working on
"lei import" for multiple sources, so being able to specify the
type/URL-scheme on a per-source basis seems like the way to go:

	lei import mboxrd:/tmp/foo.mboxrd maildir:/tmp/md/ ...
	lei import imaps://$host/INBOX.foo nntps://$host/news.group ...

And there's also "lei convert" in the wings (mainly for
testing/development purposes, but could be useful stand-alone):

	lei convert mboxrd:/tmp/foo.mboxrd maildir:/tmp/md/

^ permalink raw reply	[relevance 71%]
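
The per-source "FORMAT:LOCATION" idea in the message above can be
sketched roughly as follows.  This is only an illustration under
assumptions: split_location is a hypothetical helper, the list of
accepted prefixes is taken from the formats named in this thread, and
it is not how lei itself parses its arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: split "FORMAT:LOCATION" into its two parts.
# URLs (scheme://...) are returned whole, with the scheme as the format.
sub split_location {
	my ($arg) = @_;
	if ($arg =~ m!\A([a-z][a-z0-9+.\-]*)://!i) {
		return (lc $1, $arg); # e.g. imaps://host/INBOX
	}
	if ($arg =~ m!\A(mbox(?:rd|cl2|cl|o)|maildir|eml):(.+)\z!is) {
		return (lc $1, $2);
	}
	(undef, $arg); # bare path: the format must come from elsewhere
}

for my $arg (qw(mboxrd:/tmp/foo.mboxrd maildir:/tmp/md/
		imaps://example.com/INBOX.foo /tmp/plain)) {
	my ($fmt, $loc) = split_location($arg);
	printf "%-8s %s\n", $fmt // '(none)', $loc;
}
```

A bare path yields no format here, which is the case where something
like -f/--format (or auto-detection) would still be needed.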

* [PATCH 05/11] lei import: move check_input_format to lei
  2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
  2021-02-17 10:06 71% ` [PATCH 01/11] lei: bless config Eric Wong
  2021-02-17 10:07 55% ` [PATCH 04/11] lei import: start rearranging code for IMAP support Eric Wong
@ 2021-02-17 10:07 85% ` Eric Wong
  2021-02-17 10:07 18% ` [PATCH 08/11] lei convert: mail format conversion sub-command Eric Wong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-17 10:07 UTC (permalink / raw)
  To: meta

We'll be supporting "lei convert" in a future change, so it
makes sense to share a common internal API for common error
messages.
---
 lib/PublicInbox/LEI.pm       | 14 ++++++++++++++
 lib/PublicInbox/LeiImport.pm | 17 ++---------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 12deedd8..1fa9f751 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -391,6 +391,20 @@ sub fail ($$;$) {
 	undef;
 }
 
+sub check_input_format ($;$) {
+	my ($self, $files) = @_;
+	my $fmt = $self->{opt}->{'format'};
+	if (!$fmt) {
+		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
+		return fail($self, "--format unset for $err");
+	}
+	return 1 if $fmt eq 'eml';
+	# XXX: should this handle {gz,bz2,xz}? that's currently in LeiToMail
+	require PublicInbox::MboxReader;
+	PublicInbox::MboxReader->can($fmt) ||
+				fail($self, "--format=$fmt unrecognized");
+}
+
 sub out ($;@) {
 	my $self = shift;
 	return if print { $self->{1} // return } @_; # likely
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index b25d7e97..32f3a467 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -29,19 +29,6 @@ sub import_done { # EOF callback for main daemon
 	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
-sub check_fmt ($;$) {
-	my ($lei, $f) = @_;
-	my $fmt = $lei->{opt}->{'format'};
-	if (!$fmt) {
-		my $err = $f ? "regular file(s):\n@$f" : '--stdin';
-		return $lei->fail("--format unset for $err");
-	}
-	return 1 if $fmt eq 'eml';
-	require PublicInbox::MboxReader;
-	PublicInbox::MboxReader->can($fmt) ||
-				$lei->fail( "--format=$fmt unrecognized\n");
-}
-
 sub do_import {
 	my ($lei) = @_;
 	my $ops = {
@@ -82,7 +69,7 @@ sub call { # the main "lei import" method
 	if ($lei->{opt}->{stdin}) {
 		@argv and return
 			$lei->fail("--stdin and locations (@argv) do not mix");
-		check_fmt($lei) or return;
+		$lei->check_input_format or return;
 		$self->{0} = $lei->{0};
 	} else {
 		my @f;
@@ -95,7 +82,7 @@ sub call { # the main "lei import" method
 				$lei->{nrd}->add_url($x);
 			}
 		}
-		if (@f) { check_fmt($lei, \@f) or return }
+		if (@f) { $lei->check_input_format(\@f) or return }
 		if ($lei->{nrd} && (my @err = $lei->{nrd}->errors)) {
 			return $lei->fail(@err);
 		}

^ permalink raw reply related	[relevance 85%]
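
The patch above validates --format by asking a class whether it
implements a method of that name.  A minimal, self-contained sketch of
that pattern follows; My::Reader and reader_for are hypothetical
stand-ins, not the real PublicInbox::MboxReader API.

```perl
use strict;
use warnings;

package My::Reader; # hypothetical stand-in for a reader class
sub mboxrd { 'reading mboxrd' }
sub mboxo { 'reading mboxo' }

package main;

# Returns a code ref when the class implements $fmt, undef otherwise.
sub reader_for {
	my ($fmt) = @_;
	return unless defined($fmt) && $fmt =~ /\A\w+\z/;
	My::Reader->can($fmt);
}

for my $fmt (qw(mboxrd mboxo bogus)) {
	my $cb = reader_for($fmt);
	print "$fmt: ", ($cb ? $cb->() : 'unrecognized'), "\n";
}
```

One caveat of this approach: can() also reports methods inherited from
UNIVERSAL (such as "isa" or "can" itself), so a stricter allow-list may
be preferable when the name comes straight from user input.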

* [PATCH 01/11] lei: bless config
  2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
@ 2021-02-17 10:06 71% ` Eric Wong
  2021-02-17 10:07 55% ` [PATCH 04/11] lei import: start rearranging code for IMAP support Eric Wong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-17 10:06 UTC (permalink / raw)
  To: meta

We'll be needing ->url_match from PublicInbox::Config
---
 lib/PublicInbox/LEI.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index aa14ca6f..12deedd8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -563,6 +563,7 @@ sub _lei_cfg ($;$) {
 		qerr($self, "# $f created") if $self->{cmd} ne 'config';
 	}
 	my $cfg = PublicInbox::Config::git_config_dump($f);
+	bless $cfg, 'PublicInbox::Config';
 	$cfg->{-st} = $cur_st;
 	$cfg->{'-f'} = $f;
 	$self->{cfg} = $PATH2CFG{$f} = $cfg;

^ permalink raw reply related	[relevance 71%]
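
As a small illustration of what blessing buys in the patch above: a
plain hash ref keeps its data but gains the methods of the named class.
My::Config and its get method are hypothetical and only stand in for a
config class; they are not the PublicInbox::Config interface.

```perl
use strict;
use warnings;

package My::Config; # hypothetical, stands in for a config class
sub get { my ($self, $key) = @_; $self->{$key} }

package main;

my $cfg = { 'leistore.dir' => '/tmp/store' }; # plain hash ref
print ref($cfg), "\n";                  # prints "HASH"
bless $cfg, 'My::Config';               # same data, now with methods
print ref($cfg), "\n";                  # prints "My::Config"
print $cfg->get('leistore.dir'), "\n";  # prints "/tmp/store"
```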

* [PATCH 00/11] lei IMAP read support
@ 2021-02-17 10:06 64% Eric Wong
  2021-02-17 10:06 71% ` [PATCH 01/11] lei: bless config Eric Wong
                   ` (6 more replies)
  0 siblings, 7 replies; 200+ results
From: Eric Wong @ 2021-02-17 10:06 UTC (permalink / raw)
  To: meta

IMAP write support for search results is planned, but testing
could get tricky...

Still unsure about some UI bits w.r.t --format/-f:
	https://public-inbox.org/meta/20210217044032.GA17934@dcvr/

convert and import should support parallel network xfers,
NNTP reads, and eventually JMAP...

convert and import don't support compressed mboxes, yet.

Eric Wong (11):
  lei: bless config
  watch: move imap_common_init to NetReader
  watch: connect to NNTP and IMAP in config order
  lei import: start rearranging code for IMAP support
  lei import: move check_input_format to lei
  tests: setup_public_inboxes: use IMAP-friendly newsgroups
  t/lei_to_mail: remove unnecessary arg passing
  lei convert: mail format conversion sub-command
  lei import: add IMAP, (maildir|mbox*):$PATHNAME support
  lei: consolidate the bulk of the IPC code
  lei: check for IMAP auth errors

 MANIFEST                         |  11 +-
 lib/PublicInbox/GitCredential.pm |  18 ++-
 lib/PublicInbox/LEI.pm           |  62 +++++++-
 lib/PublicInbox/LeiAuth.pm       |  70 +++++++++
 lib/PublicInbox/LeiConvert.pm    | 137 +++++++++++++++++
 lib/PublicInbox/LeiDedupe.pm     |   2 +-
 lib/PublicInbox/LeiImport.pm     | 156 +++++++++++++-------
 lib/PublicInbox/LeiMirror.pm     |  19 +--
 lib/PublicInbox/LeiOverview.pm   |   7 +-
 lib/PublicInbox/LeiToMail.pm     |   5 +-
 lib/PublicInbox/MdirReader.pm    |  26 ++++
 lib/PublicInbox/NetReader.pm     | 242 +++++++++++++++++++++++++++++--
 lib/PublicInbox/TestCommon.pm    |  15 +-
 lib/PublicInbox/Watch.pm         |  82 ++---------
 t/{home1 => home2}/.gitignore    |   0
 t/{home1 => home2}/Makefile      |   0
 t/{home1 => home2}/README        |   0
 t/lei-convert.t                  |  36 +++++
 t/lei-import-imap.t              |  28 ++++
 t/lei-import-maildir.t           |   4 +-
 t/lei_to_mail.t                  |  14 +-
 t/net_reader-imap.t              |  40 +++++
 xt/lei-auth-fail.t               |  20 +++
 23 files changed, 820 insertions(+), 174 deletions(-)
 create mode 100644 lib/PublicInbox/LeiAuth.pm
 create mode 100644 lib/PublicInbox/LeiConvert.pm
 rename t/{home1 => home2}/.gitignore (100%)
 rename t/{home1 => home2}/Makefile (100%)
 rename t/{home1 => home2}/README (100%)
 create mode 100644 t/lei-convert.t
 create mode 100644 t/lei-import-imap.t
 create mode 100644 t/net_reader-imap.t
 create mode 100644 xt/lei-auth-fail.t


^ permalink raw reply	[relevance 64%]

* [PATCH 04/11] lei import: start rearranging code for IMAP support
  2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
  2021-02-17 10:06 71% ` [PATCH 01/11] lei: bless config Eric Wong
@ 2021-02-17 10:07 55% ` Eric Wong
  2021-02-17 10:07 85% ` [PATCH 05/11] lei import: move check_input_format to lei Eric Wong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-17 10:07 UTC (permalink / raw)
  To: meta

More to come in a later commit; some error handling and failure
modes will be trickier with IMAP due to authentication.
---
 lib/PublicInbox/LeiImport.pm | 74 +++++++++++++++++++++++++-----------
 lib/PublicInbox/NetReader.pm | 19 +++++++++
 2 files changed, 71 insertions(+), 22 deletions(-)

diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 8358d9d4..b25d7e97 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -29,26 +29,21 @@ sub import_done { # EOF callback for main daemon
 	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
-sub call { # the main "lei import" method
-	my ($cls, $lei, @argv) = @_;
-	my $sto = $lei->_lei_store(1);
-	$sto->write_prepare($lei);
-	$lei->{opt}->{kw} //= 1;
+sub check_fmt ($;$) {
+	my ($lei, $f) = @_;
 	my $fmt = $lei->{opt}->{'format'};
-	my $self = $lei->{imp} = bless {}, $cls;
-	my @f;
-	for my $x (@argv) {
-		if (-f $x) { push @f, $x }
-		elsif (-d _) { require PublicInbox::MdirReader }
-	}
-	(@f && !$fmt) and
-		return $lei->fail("--format unset for regular file(s):\n@f");
-	if (@f && $fmt ne 'eml') {
-		require PublicInbox::MboxReader;
-		PublicInbox::MboxReader->can($fmt) or
-			return $lei->fail( "--format=$fmt unrecognized\n");
+	if (!$fmt) {
+		my $err = $f ? "regular file(s):\n@$f" : '--stdin';
+		return $lei->fail("--format unset for $err");
 	}
-	$self->{0} = $lei->{0} if $lei->{opt}->{stdin};
+	return 1 if $fmt eq 'eml';
+	require PublicInbox::MboxReader;
+	PublicInbox::MboxReader->can($fmt) ||
+				$lei->fail( "--format=$fmt unrecognized\n");
+}
+
+sub do_import {
+	my ($lei) = @_;
 	my $ops = {
 		'!' => [ $lei->can('fail_handler'), $lei ],
 		'x_it' => [ $lei->can('x_it'), $lei ],
@@ -56,14 +51,19 @@ sub call { # the main "lei import" method
 		'' => [ \&import_done, $lei ],
 	};
 	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
-	my $j = $lei->{opt}->{jobs} // scalar(@argv) || 1;
-	my $nproc = $self->detect_nproc;
-	$j = $nproc if $j > $nproc;
+	my $self = $lei->{imp};
+	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{argv}}) || 1;
+	if (my $nrd = $lei->{nrd}) {
+		# $j = $nrd->net_concurrency($j); TODO
+	} else {
+		my $nproc = $self->detect_nproc;
+		$j = $nproc if $j > $nproc;
+	}
 	$self->wq_workers_start('lei_import', $j, $lei->oldset, {lei => $lei});
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
 	$self->wq_io_do('import_stdin', []) if $self->{0};
-	for my $x (@argv) {
+	for my $x (@{$self->{argv}}) {
 		$self->wq_io_do('import_path_url', [], $x);
 	}
 	$self->wq_close(1);
@@ -73,6 +73,36 @@ sub call { # the main "lei import" method
 	}
 }
 
+sub call { # the main "lei import" method
+	my ($cls, $lei, @argv) = @_;
+	my $sto = $lei->_lei_store(1);
+	$sto->write_prepare($lei);
+	$lei->{opt}->{kw} //= 1;
+	my $self = $lei->{imp} = bless { argv => \@argv }, $cls;
+	if ($lei->{opt}->{stdin}) {
+		@argv and return
+			$lei->fail("--stdin and locations (@argv) do not mix");
+		check_fmt($lei) or return;
+		$self->{0} = $lei->{0};
+	} else {
+		my @f;
+		for my $x (@argv) {
+			if (-f $x) { push @f, $x }
+			elsif (-d _) { require PublicInbox::MdirReader }
+			else {
+				require PublicInbox::NetReader;
+				$lei->{nrd} //= PublicInbox::NetReader->new;
+				$lei->{nrd}->add_url($x);
+			}
+		}
+		if (@f) { check_fmt($lei, \@f) or return }
+		if ($lei->{nrd} && (my @err = $lei->{nrd}->errors)) {
+			return $lei->fail(@err);
+		}
+	}
+	do_import($lei);
+}
+
 sub ipc_atfork_child {
 	my ($self) = @_;
 	$self->{lei}->lei_atfork_child;
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index fa337bcd..1d053425 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -280,4 +280,23 @@ sub imap_common_init ($) {
 	$mics;
 }
 
+sub add_url {
+	my ($self, $arg) = @_;
+	if (my $url = imap_url($arg)) {
+		push @{$self->{imap_order}}, $url;
+	} else {
+		push @{$self->{unsupported_url}}, $arg;
+	}
+}
+
+sub errors {
+	my ($self) = @_;
+	if (my $u = $self->{unsupported_url}) {
+		return "Unsupported URL(s): @$u";
+	}
+	undef;
+}
+
+sub new { bless {}, shift };
+
 1;

^ permalink raw reply related	[relevance 55%]

* [PATCH 11/11] lei: check for IMAP auth errors
  2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-17 10:07 47% ` [PATCH 10/11] lei: consolidate the bulk of the IPC code Eric Wong
@ 2021-02-17 10:07 62% ` Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-17 10:07 UTC (permalink / raw)
  To: meta

We need to ensure authentication failures and error codes get
propagated to the parent process(es) properly.  For now, this
will just be a maintainer test which hits a read/write IMAP
server on public-inbox.org on a non-standard port with invalid
credentials.
---
 lib/PublicInbox/LeiAuth.pm   |  1 +
 lib/PublicInbox/NetReader.pm |  3 +++
 xt/lei-auth-fail.t           | 20 ++++++++++++++++++++
 3 files changed, 24 insertions(+)
 create mode 100644 xt/lei-auth-fail.t

diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 7210af99..7acb9900 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -42,6 +42,7 @@ sub auth_eof {
 
 sub auth_start {
 	my ($self, $lei, $post_auth_cb, @args) = @_;
+	$lei->_lei_cfg(1); # workers may need to read config
 	my $op = $lei->workers_start($self, 'auth', 1, {
 		'nrd_merge' => [ \&nrd_merge, $lei ],
 		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index ad8c18d0..61ea538b 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -89,6 +89,9 @@ sub mic_for { # mic = Mail::IMAPClient
 		$self->{mic_arg}->{uri_section($uri)} = $mic_arg;
 	} else {
 		$err = "E: <$url> LOGIN: $@\n";
+		if ($cred && defined($cred->{password})) {
+			$err =~ s/\Q$cred->{password}\E/*******/g;
+		}
 		$mic = undef;
 	}
 	$cred->run($mic ? 'approve' : 'reject') if $cred;
diff --git a/xt/lei-auth-fail.t b/xt/lei-auth-fail.t
new file mode 100644
index 00000000..5308d0f9
--- /dev/null
+++ b/xt/lei-auth-fail.t
@@ -0,0 +1,20 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+
+# TODO: mock IMAP server which fails at authentication so we don't
+# have to make external connections to test this:
+my $imap_fail = $ENV{TEST_LEI_IMAP_FAIL_URL} //
+	'imaps://AzureDiamond:Hunter2@public-inbox.org:994/INBOX';
+test_lei(sub {
+	ok(!lei(qw(convert -o mboxrd:/dev/stdout), $imap_fail),
+		'IMAP auth failure on convert');
+	like($lei_err, qr!\bE:.*?imaps://.*?!sm, 'error shown');
+	unlike($lei_err, qr!Hunter2!s, 'password not shown');
+	is($lei_out, '', 'nothing output');
+	ok(!lei(qw(import), $imap_fail), 'IMAP auth failure on import');
+	like($lei_err, qr!\bE:.*?imaps://.*?!sm, 'error shown');
+	unlike($lei_err, qr!Hunter2!s, 'password not shown');
+});
+done_testing;
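
The NetReader hunk above scrubs the password out of the LOGIN error
before it is shown.  A minimal standalone sketch of that idea follows;
`redact' is a hypothetical helper name (not part of public-inbox), and
the sample strings are made up.  The \Q...\E quoting matters because a
password may contain regex metacharacters:

```perl
use strict;
use warnings;

# Hypothetical helper: replace every literal occurrence of $secret
# in $msg with a fixed mask.  \Q...\E quotes regex metacharacters so
# the secret is matched as a plain string, not as a pattern.
sub redact {
	my ($msg, $secret) = @_;
	return $msg unless defined($secret) && length($secret);
	$msg =~ s/\Q$secret\E/*******/g;
	$msg;
}

# made-up error text and password containing metacharacters
my $err = "E: <imaps://example.com/INBOX> LOGIN: bad password a.b+c\n";
print redact($err, 'a.b+c');
```

Without the quoting, a password such as "a.b+c" would be compiled as a
pattern and would fail to match its own literal text, leaving it
visible in the error.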


* [PATCH 10/11] lei: consolidate the bulk of the IPC code
  2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-17 10:07 37% ` [PATCH 09/11] lei import: add IMAP, (maildir|mbox*):$PATHNAME support Eric Wong
@ 2021-02-17 10:07 47% ` Eric Wong
  2021-02-17 10:07 62% ` [PATCH 11/11] lei: check for IMAP auth errors Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-17 10:07 UTC (permalink / raw)
  To: meta

The backends for "lei add-external --mirror", "lei convert", and
"lei import" all share a similar pattern for spawning background
workers.  Hoist out the common parts to slim down our code base
a bit.

The LeiXSearch and LeiToMail workers for "lei q" remain the
odd ducks due to the deep pipelining and parallelization.
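
As a rough illustration of the hoisting described here, the shared
helper builds one table of default handlers and lets each caller add
or override entries.  The sketch below is standalone and uses plain
strings in place of the real code refs; `merge_ops' is a hypothetical
name used only for this example:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the op table built by the shared helper:
# defaults are listed first, so caller-supplied keys win on conflict.
sub merge_ops {
	my ($defaults, $ops) = @_;
	return { %$defaults, %$ops };
}

my $defaults = {
	'!' => [ 'fail_handler' ],
	'|' => [ 'sigpipe_handler' ],
	'x_it' => [ 'x_it' ],
	'child_error' => [ 'child_error' ],
};

# a caller only needs to name what is specific to it (here, EOF):
my $ops = merge_ops($defaults, { '' => [ 'import_done' ] });
print scalar(keys %$ops), " handlers\n";
```

One visible side effect of sharing the defaults is that every caller
gets the same set of handlers, including ones it did not list itself
before.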
---
 lib/PublicInbox/LEI.pm        | 19 +++++++++++++++++++
 lib/PublicInbox/LeiAuth.pm    | 17 +++--------------
 lib/PublicInbox/LeiConvert.pm | 22 +++++-----------------
 lib/PublicInbox/LeiImport.pm  | 19 ++++---------------
 lib/PublicInbox/LeiMirror.pm  | 19 ++++---------------
 5 files changed, 35 insertions(+), 61 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1e4c36d0..0b4bc20e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -468,6 +468,25 @@ sub lei_atfork_child {
 	$current_lei = $persist ? undef : $self; # for SIG{__WARN__}
 }
 
+sub workers_start {
+	my ($lei, $wq, $ident, $jobs, $ops) = @_;
+	$ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		%$ops
+	};
+	require PublicInbox::PktOp;
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$wq->wq_workers_start($ident, $jobs, $lei->oldset, { lei => $lei });
+	delete $lei->{pkt_op_p};
+	my $op = delete $lei->{pkt_op_c};
+	$lei->event_step_init;
+	# oneshot needs $op, daemon-mode uses DS->EventLoop to handle $op
+	$lei->{oneshot} ? $op : undef;
+}
+
 sub _help {
 	require PublicInbox::LeiHelp;
 	PublicInbox::LeiHelp::call($_[0], $_[1], \%CMD, \%OPTDESC);
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 88310874..7210af99 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -42,24 +42,13 @@ sub auth_eof {
 
 sub auth_start {
 	my ($self, $lei, $post_auth_cb, @args) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'|' => [ $lei->can('sigpipe_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
+	my $op = $lei->workers_start($self, 'auth', 1, {
 		'nrd_merge' => [ \&nrd_merge, $lei ],
 		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
-	$self->wq_workers_start('lei_auth', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	});
 	$self->wq_io_do('do_auth', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub ipc_atfork_child {
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 44d5131b..6dd137bc 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::PktOp;
 use PublicInbox::LeiStore;
 use PublicInbox::LeiOverview;
 
@@ -59,26 +58,15 @@ sub do_convert { # via wq_do
 	delete $self->{wcb}; # commit
 }
 
-sub convert_start {
+sub convert_start { # LeiAuth->auth_start callback
 	my ($lei) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'|' => [ $lei->can('sigpipe_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ $lei->can('dclose'), $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{cnv};
-	$self->wq_workers_start('lei_convert', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_convert', 1, {
+		'' => [ $lei->can('dclose'), $lei ]
+	});
 	$self->wq_io_do('do_convert', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub call { # the main "lei convert" method
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 4d225262..a0d79282 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::PktOp;
 
 sub _import_eml { # MboxReader callback
 	my ($eml, $sto, $set_kw) = @_;
@@ -31,13 +30,6 @@ sub import_done { # EOF callback for main daemon
 
 sub import_start {
 	my ($lei) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ \&import_done, $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{imp};
 	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $nrd = $lei->{nrd}) {
@@ -46,18 +38,15 @@ sub import_start {
 		my $nproc = $self->detect_nproc;
 		$j = $nproc if $j > $nproc;
 	}
-	$self->wq_workers_start('lei_import', $j, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_import', $j, {
+		'' => [ \&import_done, $lei ],
+	});
 	$self->wq_io_do('import_stdin', []) if $self->{0};
 	for my $input (@{$self->{inputs}}) {
 		$self->wq_io_do('import_path_url', [], $input);
 	}
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub call { # the main "lei import" method
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index c5153148..f8ca1ee5 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
 use PublicInbox::Spawn qw(popen_rd spawn);
-use PublicInbox::PktOp;
 
 sub do_finish_mirror { # dwaitpid callback
 	my ($arg, $pid) = @_;
@@ -279,22 +278,12 @@ sub start {
 	require PublicInbox::Inbox;
 	require PublicInbox::Admin;
 	require PublicInbox::InboxWritable;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ \&mirror_done, $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
-	$self->wq_workers_start('lei_mirror', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_mirror', 1, {
+		'' => [ \&mirror_done, $lei ]
+	});
 	$self->wq_io_do('do_mirror', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub ipc_atfork_child {


* [PATCH 09/11] lei import: add IMAP, (maildir|mbox*):$PATHNAME support
  2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-17 10:07 18% ` [PATCH 08/11] lei convert: mail format conversion sub-command Eric Wong
@ 2021-02-17 10:07 37% ` Eric Wong
  2021-02-17 10:07 47% ` [PATCH 10/11] lei: consolidate the bulk of the IPC code Eric Wong
  2021-02-17 10:07 62% ` [PATCH 11/11] lei: check for IMAP auth errors Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-17 10:07 UTC (permalink / raw)
  To: meta

This makes "lei import" more similar to "lei convert" and
allows importing from disparate sources simultaneously.
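
To make the accepted input forms concrete, here is a small standalone
sketch of how a location could be split into a kind and a path.  It
mirrors the regexps used in the patch, but `classify_input' and its
return values are assumptions made for this example only:

```perl
use strict;
use warnings;

# Hypothetical classifier: returns (kind, location).
# - imap(s):// and nntp(s):// URLs are left intact for the net reader
# - "FORMAT:PATH" is split, with FORMAT lower-cased
# - anything else is treated as a bare filesystem path
sub classify_input {
	my ($input) = @_;
	return ('net', $input) if $input =~ m!\A(?:imap|nntp)s?://!i;
	return (lc $1, $2) if $input =~ m/\A([a-z0-9]+):(.*)\z/is;
	('path', $input);
}

for my $in (qw(imaps://example.com/INBOX Maildir:/home/user/Mail/
		mboxrd:/tmp/x.mbox /tmp/x.eml)) {
	my ($kind, $loc) = classify_input($in);
	print "$kind\t$loc\n";
}
```

The URL check has to come first, otherwise "imaps:" would be taken
for a format prefix.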

We'll also fix some ->child_error usage errors and make
the style of the code more similar to the "lei convert"
code.
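
The ->child_error fix is the change from "1 >> 8" to "1 << 8".  The
value follows the wait(2) status convention, where the exit code sits
in the high byte and a terminating signal in the low bits.  A short
sketch, with `exit_code' and `term_signal' being hypothetical helper
names for this example:

```perl
use strict;
use warnings;

# wait(2)-style status: exit code in the high byte,
# terminating signal number in the low 7 bits.
sub exit_code { my ($status) = @_; $status >> 8 }
sub term_signal { my ($status) = @_; $status & 127 }

my $intended = 1 << 8; # 256: "exited with code 1"
my $mistake = 1 >> 8;  # 0: indistinguishable from success

printf "1 << 8 = %d, exit code %d\n", $intended, exit_code($intended);
printf "1 >> 8 = %d, exit code %d\n", $mistake, exit_code($mistake);
```

With the right shift, the error status collapses to 0, so a failure
would be reported as success.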
---
 MANIFEST                     |   1 +
 lib/PublicInbox/LeiImport.pm | 126 ++++++++++++++++++++++++-----------
 t/lei-import-imap.t          |  28 ++++++++
 t/lei-import-maildir.t       |   4 +-
 t/lei_to_mail.t              |  10 +++
 5 files changed, 127 insertions(+), 42 deletions(-)
 create mode 100644 t/lei-import-imap.t

diff --git a/MANIFEST b/MANIFEST
index 4f146771..19f73356 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -365,6 +365,7 @@ t/kqnotify.t
 t/lei-convert.t
 t/lei-daemon.t
 t/lei-externals.t
+t/lei-import-imap.t
 t/lei-import-maildir.t
 t/lei-import.t
 t/lei-mirror.t
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 32f3a467..4d225262 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -29,7 +29,7 @@ sub import_done { # EOF callback for main daemon
 	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
-sub do_import {
+sub import_start {
 	my ($lei) = @_;
 	my $ops = {
 		'!' => [ $lei->can('fail_handler'), $lei ],
@@ -39,7 +39,7 @@ sub do_import {
 	};
 	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{imp};
-	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{argv}}) || 1;
+	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $nrd = $lei->{nrd}) {
 		# $j = $nrd->net_concurrency($j); TODO
 	} else {
@@ -50,8 +50,8 @@ sub do_import {
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
 	$self->wq_io_do('import_stdin', []) if $self->{0};
-	for my $x (@{$self->{argv}}) {
-		$self->wq_io_do('import_path_url', [], $x);
+	for my $input (@{$self->{inputs}}) {
+		$self->wq_io_do('import_path_url', [], $input);
 	}
 	$self->wq_close(1);
 	$lei->event_step_init; # wait for shutdowns
@@ -61,60 +61,88 @@ sub do_import {
 }
 
 sub call { # the main "lei import" method
-	my ($cls, $lei, @argv) = @_;
+	my ($cls, $lei, @inputs) = @_;
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
+	my ($nrd, @f, @d);
 	$lei->{opt}->{kw} //= 1;
-	my $self = $lei->{imp} = bless { argv => \@argv }, $cls;
+	my $self = $lei->{imp} = bless { inputs => \@inputs }, $cls;
 	if ($lei->{opt}->{stdin}) {
-		@argv and return
-			$lei->fail("--stdin and locations (@argv) do not mix");
+		@inputs and return $lei->fail("--stdin and @inputs do not mix");
 		$lei->check_input_format or return;
 		$self->{0} = $lei->{0};
-	} else {
-		my @f;
-		for my $x (@argv) {
-			if (-f $x) { push @f, $x }
-			elsif (-d _) { require PublicInbox::MdirReader }
-			else {
-				require PublicInbox::NetReader;
-				$lei->{nrd} //= PublicInbox::NetReader->new;
-				$lei->{nrd}->add_url($x);
+	}
+
+	# TODO: do we need --format for non-stdin?
+	my $fmt = $lei->{opt}->{'format'};
+	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
+	for my $input (@inputs) {
+		my $input_path = $input;
+		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+			require PublicInbox::NetReader;
+			$nrd //= PublicInbox::NetReader->new;
+			$nrd->add_url($input);
+		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
+			my $ifmt = lc $1;
+			if (($fmt // $ifmt) ne $ifmt) {
+				return $lei->fail(<<"");
+--format=$fmt and `$ifmt:' conflict
+
 			}
-		}
-		if (@f) { $lei->check_input_format(\@f) or return }
-		if ($lei->{nrd} && (my @err = $lei->{nrd}->errors)) {
-			return $lei->fail(@err);
-		}
+			if (-f $input_path) {
+				require PublicInbox::MboxReader;
+				PublicInbox::MboxReader->can($ifmt) or return
+					$lei->fail("$ifmt not supported");
+			} elsif (-d _) {
+				$ifmt eq 'maildir' or return
+					$lei->fail("$ifmt not supported");
+			} else { return $lei->fail("Unable to handle $input_path") }
+		} elsif (-f $input) { push @f, $input
+		} elsif (-d _) { push @d, $input
+		} else { return $lei->fail("Unable to handle $input") }
+	}
+	if (@f) { $lei->check_input_format(\@f) or return }
+	if (@d) { # TODO: check for MH vs Maildir, here
+		require PublicInbox::MdirReader;
 	}
-	do_import($lei);
+	$self->{inputs} = \@inputs;
+	return import_start($lei) if !$nrd;
+
+	if (my $err = $nrd->errors) {
+		return $lei->fail($err);
+	}
+	$nrd->{quiet} = $lei->{opt}->{quiet};
+	$lei->{nrd} = $nrd;
+	require PublicInbox::LeiAuth;
+	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
+	$auth->auth_start($lei, \&import_start, $lei);
 }
 
 sub ipc_atfork_child {
 	my ($self) = @_;
+	delete $self->{lei}->{imp}; # drop circular ref
 	$self->{lei}->lei_atfork_child;
 	$self->SUPER::ipc_atfork_child;
 }
 
 sub _import_fh {
-	my ($lei, $fh, $x) = @_;
+	my ($lei, $fh, $input, $ifmt) = @_;
 	my $set_kw = $lei->{opt}->{kw};
-	my $fmt = $lei->{opt}->{'format'};
 	eval {
-		if ($fmt eq 'eml') {
+		if ($ifmt eq 'eml') {
 			my $buf = do { local $/; <$fh> } //
-				return $lei->child_error(1 >> 8, <<"");
-error reading $x: $!
+				return $lei->child_error(1 << 8, <<"");
+error reading $input: $!
 
 			my $eml = PublicInbox::Eml->new(\$buf);
 			_import_eml($eml, $lei->{sto}, $set_kw);
 		} else { # some mbox (->can already checked in call);
-			my $cb = PublicInbox::MboxReader->can($fmt) //
-				die "BUG: bad fmt=$fmt";
+			my $cb = PublicInbox::MboxReader->can($ifmt) //
+				die "BUG: bad fmt=$ifmt";
 			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
 		}
 	};
-	$lei->child_error(1 >> 8, "<stdin>: $@") if $@;
+	$lei->child_error(1 << 8, "<stdin>: $@") if $@;
 }
 
 sub _import_maildir { # maildir_each_file cb
@@ -122,27 +150,45 @@ sub _import_maildir { # maildir_each_file cb
 	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
 }
 
+sub _import_imap { # imap_each cb
+	my ($url, $uid, $kw, $eml, $sto, $set_kw) = @_;
+	warn "$url $uid";
+	$sto->ipc_do('set_eml', $eml, $set_kw ? @$kw : ());
+}
+
 sub import_path_url {
-	my ($self, $x) = @_;
+	my ($self, $input) = @_;
 	my $lei = $self->{lei};
+	my $ifmt = lc($lei->{opt}->{'format'} // '');
 	# TODO auto-detect?
-	if (-f $x) {
-		open my $fh, '<', $x or return $lei->child_error(1 >> 8, <<"");
-unable to open $x: $!
+	if ($input =~ m!\A(imap|nntp)s?://!i) {
+		$lei->{nrd}->imap_each($input, \&_import_imap, $lei->{sto},
+					$lei->{opt}->{kw});
+		return;
+	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
+		$ifmt = lc $1;
+	}
+	if (-f $input) {
+		open my $fh, '<', $input or return $lei->child_error(1 << 8, <<"");
+unable to open $input: $!
 
-		_import_fh($lei, $fh, $x);
-	} elsif (-d _ && (-d "$x/cur" || -d "$x/new")) {
-		PublicInbox::MdirReader::maildir_each_file($x,
+		_import_fh($lei, $fh, $input, $ifmt);
+	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
+		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
+$input appears to be a maildir, not $ifmt
+EOM
+		PublicInbox::MdirReader::maildir_each_file($input,
 					\&_import_maildir,
 					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
-		$lei->fail("$x unsupported (TODO)");
+		$lei->fail("$input unsupported (TODO)");
 	}
 }
 
 sub import_stdin {
 	my ($self) = @_;
-	_import_fh($self->{lei}, $self->{0}, '<stdin>');
+	my $lei = $self->{lei};
+	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'format'});
 }
 
 1;
diff --git a/t/lei-import-imap.t b/t/lei-import-imap.t
new file mode 100644
index 00000000..ee308723
--- /dev/null
+++ b/t/lei-import-imap.t
@@ -0,0 +1,28 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my ($tmpdir, $for_destroy) = tmpdir;
+my $sock = tcp_server;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
+my $host_port = tcp_host_port($sock);
+undef $sock;
+test_lei({ tmpdir => $tmpdir }, sub {
+	lei_ok(qw(q bytes:1..));
+	my $out = json_utf8->decode($lei_out);
+	is_deeply($out, [ undef ], 'nothing imported, yet');
+	lei_ok('import', "imap://$host_port/t.v2.0");
+	lei_ok(qw(q bytes:1..));
+	$out = json_utf8->decode($lei_out);
+	ok(scalar(@$out) > 1, 'got imported messages');
+	is(pop @$out, undef, 'trailing JSON null element was null');
+	my %r;
+	for (@$out) { $r{ref($_)}++ }
+	is_deeply(\%r, { 'HASH' => scalar(@$out) }, 'all hashes');
+});
+done_testing;
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index 5842e19e..d2b059ad 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -23,8 +23,8 @@ test_lei(sub {
 	is_deeply($r2, $res, 'idempotent import');
 
 	rename("$md/cur/x:2,S", "$md/cur/x:2,SR") or BAIL_OUT "rename: $!";
-	ok($lei->(qw(import), $md), 'import Maildir after +answered');
-	ok($lei->(qw(q -d none s:boolean)), 'lei q after +answered');
+	lei_ok('import', "maildir:$md", \'import Maildir after +answered');
+	lei_ok(qw(q -d none s:boolean), \'lei q after +answered');
 	$res = json_utf8->decode($lei_out);
 	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
 	is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index 6a571660..72b90700 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -139,6 +139,16 @@ test_lei(sub {
 	is($res->[1], undef, 'only one result');
 });
 
+test_lei(sub {
+	lei_ok('import', "$mbox:$fn", \'imported mbox:/path') or diag $lei_err;
+	lei_ok(qw(q s:x), \'lei q works') or diag $lei_err;
+	my $res = json_utf8->decode($lei_out);
+	my $x = $res->[0];
+	is($x->{'s'}, 'x', 'subject imported') or diag $lei_out;
+	is_deeply($x->{'kw'}, ['seen'], 'kw imported') or diag $lei_out;
+	is($res->[1], undef, 'only one result');
+});
+
 for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 	my $zsfx2cmd = PublicInbox::LeiToMail->can('zsfx2cmd');
 	SKIP: {


* [PATCH 08/11] lei convert: mail format conversion sub-command
  2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-17 10:07 85% ` [PATCH 05/11] lei import: move check_input_format to lei Eric Wong
@ 2021-02-17 10:07 18% ` Eric Wong
  2021-02-17 10:53 71%   ` Eric Wong
  2021-02-17 10:07 37% ` [PATCH 09/11] lei import: add IMAP, (maildir|mbox*):$PATHNAME support Eric Wong
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-17 10:07 UTC (permalink / raw)
  To: meta

This will make testing IMAP support for other commands easier, as
it doesn't write to lei/store at all.  Like the pager and MUA,
"git credential" is always spawned by script/lei (and not
lei-daemon) so it has a controlling terminal for password
prompts.
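
For readers unfamiliar with "git credential", it speaks a simple
line-based key=value format on stdin and stdout, terminated by a blank
line or EOF.  The sketch below only formats a request and parses a
reply; it does not spawn git, and both sub names are hypothetical (the
real code lives in PublicInbox::GitCredential):

```perl
use strict;
use warnings;

# Hypothetical: build the text written to "git credential fill".
# One key=value per line, ended by a blank line.
sub credential_request {
	my (%attr) = @_;
	my $buf = '';
	for my $k (qw(protocol host path username password)) {
		my $v = $attr{$k};
		next unless defined $v;
		$buf .= "$k=$v\n";
	}
	$buf . "\n";
}

# Hypothetical: parse key=value lines from the reply into a hashref.
# The split limit keeps '=' characters inside values intact.
sub parse_credential_reply {
	my ($buf) = @_;
	my %out;
	for my $line (split /\n/, $buf) {
		last if $line eq '';
		my ($k, $v) = split /=/, $line, 2;
		$out{$k} = $v;
	}
	\%out;
}

print credential_request(protocol => 'imaps', host => 'example.com');
```

Since a missing password makes git prompt on the terminal, the child
needs a controlling terminal, which is why script/lei spawns it rather
than lei-daemon.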
---
 MANIFEST                         |   4 +
 lib/PublicInbox/GitCredential.pm |  18 ++--
 lib/PublicInbox/LEI.pm           |  38 +++++--
 lib/PublicInbox/LeiAuth.pm       |  80 +++++++++++++++
 lib/PublicInbox/LeiConvert.pm    | 149 ++++++++++++++++++++++++++++
 lib/PublicInbox/LeiDedupe.pm     |   2 +-
 lib/PublicInbox/LeiOverview.pm   |   7 +-
 lib/PublicInbox/LeiToMail.pm     |   5 +-
 lib/PublicInbox/MdirReader.pm    |  26 +++++
 lib/PublicInbox/NetReader.pm     | 163 ++++++++++++++++++++++++++++---
 lib/PublicInbox/TestCommon.pm    |  11 ++-
 t/lei-convert.t                  |  36 +++++++
 t/net_reader-imap.t              |  40 ++++++++
 13 files changed, 542 insertions(+), 37 deletions(-)
 create mode 100644 lib/PublicInbox/LeiAuth.pm
 create mode 100644 lib/PublicInbox/LeiConvert.pm
 create mode 100644 t/lei-convert.t
 create mode 100644 t/net_reader-imap.t

diff --git a/MANIFEST b/MANIFEST
index 82068900..4f146771 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -178,6 +178,8 @@ lib/PublicInbox/InputPipe.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
+lib/PublicInbox/LeiAuth.pm
+lib/PublicInbox/LeiConvert.pm
 lib/PublicInbox/LeiCurl.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
@@ -360,6 +362,7 @@ t/init.t
 t/ipc.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei-convert.t
 t/lei-daemon.t
 t/lei-externals.t
 t/lei-import-maildir.t
@@ -388,6 +391,7 @@ t/msg_iter.t
 t/msgmap.t
 t/msgtime.t
 t/multi-mid.t
+t/net_reader-imap.t
 t/nntp.t
 t/nntpd-tls.t
 t/nntpd-v2.t
diff --git a/lib/PublicInbox/GitCredential.pm b/lib/PublicInbox/GitCredential.pm
index 9e193029..2d81817c 100644
--- a/lib/PublicInbox/GitCredential.pm
+++ b/lib/PublicInbox/GitCredential.pm
@@ -4,11 +4,17 @@ package PublicInbox::GitCredential;
 use strict;
 use PublicInbox::Spawn qw(popen_rd);
 
-sub run ($$) {
-	my ($self, $op) = @_;
-	my ($in_r, $in_w);
+sub run ($$;$) {
+	my ($self, $op, $lei) = @_;
+	my ($in_r, $in_w, $out_r);
+	my $cmd = [ qw(git credential), $op ];
 	pipe($in_r, $in_w) or die "pipe: $!";
-	my $out_r = popen_rd([qw(git credential), $op], undef, { 0 => $in_r });
+	if ($lei && !$lei->{oneshot}) { # we'll die if disconnected:
+		pipe($out_r, my $out_w) or die "pipe: $!";
+		$lei->send_exec_cmd([ $in_r, $out_w ], $cmd, {});
+	} else {
+		$out_r = popen_rd($cmd, undef, { 0 => $in_r });
+	}
 	close $in_r or die "close in_r: $!";
 
 	my $out = '';
@@ -41,8 +47,8 @@ sub check_netrc ($) {
 }
 
 sub fill {
-	my ($self) = @_;
-	my $out_r = run($self, 'fill');
+	my ($self, $lei) = @_;
+	my $out_r = run($self, 'fill', $lei);
 	while (<$out_r>) {
 		chomp;
 		return if $_ eq '';
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1fa9f751..1e4c36d0 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -173,7 +173,11 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
 	format|f=s kw|keywords|flags!),
 	],
-
+'convert' => [ 'LOCATION...|--stdin',
+	'one-time conversion from URL or filesystem to another format',
+	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s quiet|q
+	kw|keywords|flags!),
+	],
 'config' => [ '[...]', sub {
 		'git-config(1) wrapper for '._config_path($_[0]);
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
@@ -320,7 +324,7 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
-my @WQ_KEYS = qw(lxs l2m imp mrr); # internal workers
+my @WQ_KEYS = qw(lxs l2m imp mrr cnv auth); # internal workers
 
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
@@ -391,18 +395,19 @@ sub fail ($$;$) {
 	undef;
 }
 
-sub check_input_format ($;$) {
-	my ($self, $files) = @_;
-	my $fmt = $self->{opt}->{'format'};
+sub check_input_format ($;$$) {
+	my ($self, $files, $opt_key) = @_;
+	$opt_key //= 'format';
+	my $fmt = $self->{opt}->{$opt_key};
 	if (!$fmt) {
 		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
-		return fail($self, "--format unset for $err");
+		return fail($self, "--$opt_key unset for $err");
 	}
 	return 1 if $fmt eq 'eml';
 	# XXX: should this handle {gz,bz2,xz}? that's currently in LeiToMail
 	require PublicInbox::MboxReader;
 	PublicInbox::MboxReader->can($fmt) ||
-				fail($self, "--format=$fmt unrecognized");
+				fail($self, "--$opt_key=$fmt unrecognized");
 }
 
 sub out ($;@) {
@@ -445,6 +450,7 @@ sub lei_atfork_child {
 	} else {
 		delete $self->{0};
 	}
+	delete @$self{qw(cnv)};
 	for (delete @$self{qw(3 sock old_1 au_done)}) {
 		close($_) if defined($_);
 	}
@@ -626,6 +632,11 @@ sub lei_import {
 	PublicInbox::LeiImport->call(@_);
 }
 
+sub lei_convert {
+	require PublicInbox::LeiConvert;
+	PublicInbox::LeiConvert->call(@_);
+}
+
 sub lei_init {
 	my ($self, $dir) = @_;
 	my $cfg = _lei_cfg($self, 1);
@@ -770,6 +781,13 @@ sub start_mua {
 	delete $self->{opt}->{verbose};
 }
 
+sub send_exec_cmd { # tell script/lei to execute a command
+	my ($self, $io, $cmd, $env) = @_;
+	my $sock = $self->{sock} // die 'lei client gone';
+	my $fds = [ map { fileno($_) } @$io ];
+	$send_cmd->($sock, $fds, exec_buf($cmd, $env), MSG_EOR);
+}
+
 sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
 	my ($self) = @_;
 	my $alerts = $self->{opt}->{alert} // return;
@@ -813,10 +831,9 @@ sub start_pager {
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
 	my $pgr = [ undef, @$rdr{1, 2} ];
-	if (my $sock = $self->{sock}) { # lei(1) process runs it
+	if ($self->{sock}) { # lei(1) process runs it
 		delete @$new_env{keys %$env}; # only set iff unset
-		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
-		$send_cmd->($sock, $fds, exec_buf([$pager], $new_env), MSG_EOR);
+		send_exec_cmd($self, [ @$rdr{0..2} ], [$pager], $new_env);
 	} elsif ($self->{oneshot}) {
 		my $cmd = [$pager];
 		$self->{"pid.$self.$$"}->{spawn($cmd, $new_env, $rdr)} = $cmd;
@@ -920,6 +937,7 @@ sub event_step {
 
 sub event_step_init {
 	my ($self) = @_;
+	return if $self->{-event_init_done}++;
 	if (my $sock = $self->{sock}) { # using DS->EventLoop
 		$self->SUPER::new($sock, EPOLLIN|EPOLLET);
 	}
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
new file mode 100644
index 00000000..88310874
--- /dev/null
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -0,0 +1,80 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Authentication worker for anything that needs auth for read/write IMAP
+# (eventually for read-only NNTP access)
+package PublicInbox::LeiAuth;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::PktOp qw(pkt_do);
+use PublicInbox::NetReader;
+
+sub nrd_merge {
+	my ($lei, $nrd_new) = @_;
+	if ($lei->{pkt_op_p}) { # from lei_convert worker
+		pkt_do($lei->{pkt_op_p}, 'nrd_merge', $nrd_new);
+	} else { # single lei-daemon consumer
+		my $self = $lei->{auth} or return; # client disconnected
+		my $nrd = $self->{nrd};
+		%$nrd = (%$nrd, %$nrd_new);
+	}
+}
+
+sub do_auth { # called via wq_io_do
+	my ($self) = @_;
+	my ($lei, $nrd) = @$self{qw(lei nrd)};
+	$nrd->imap_common_init($lei);
+	nrd_merge($lei, $nrd); # tell lei-daemon updated auth info
+}
+
+sub do_finish_auth { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($self, $lei, $post_auth_cb, @args) = @$arg;
+	$? ? $lei->dclose : $post_auth_cb->(@args);
+}
+
+sub auth_eof {
+	my ($lei, $post_auth_cb, @args) = @_;
+	my $self = delete $lei->{auth} or return;
+	$self->wq_wait_old(\&do_finish_auth, $lei, $post_auth_cb, @args);
+}
+
+sub auth_start {
+	my ($self, $lei, $post_auth_cb, @args) = @_;
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'nrd_merge' => [ \&nrd_merge, $lei ],
+		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$self->wq_workers_start('lei_auth', 1, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_io_do('do_auth', []);
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	# prevent {sock} from being closed in lei_atfork_child:
+	my $s = delete $self->{lei}->{sock};
+	delete $self->{lei}->{auth}; # drop circular ref
+	$self->{lei}->lei_atfork_child;
+	$self->{lei}->{sock} = $s if $s;
+	$self->SUPER::ipc_atfork_child;
+}
+
+sub new {
+	my ($cls, $nrd) = @_;
+	bless { nrd => $nrd }, $cls;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
new file mode 100644
index 00000000..44d5131b
--- /dev/null
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -0,0 +1,149 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# front-end for the "lei convert" sub-command
+package PublicInbox::LeiConvert;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::Eml;
+use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::PktOp;
+use PublicInbox::LeiStore;
+use PublicInbox::LeiOverview;
+
+sub mbox_cb {
+	my ($eml, $self) = @_;
+	my @kw = PublicInbox::LeiStore::mbox_keywords($eml);
+	$eml->header_set($_) for qw(Status X-Status);
+	$self->{wcb}->(undef, { kw => \@kw }, $eml);
+}
+
+sub imap_cb { # ->imap_each
+	my ($url, $uid, $kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
+sub mdir_cb {
+	my ($kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
+sub do_convert { # via wq_do
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	my $in_fmt = $lei->{opt}->{'in-format'};
+	if (my $stdin = delete $self->{0}) {
+		PublicInbox::MboxReader->$in_fmt($stdin, \&mbox_cb, $self);
+	}
+	for my $input (@{$self->{inputs}}) {
+		my $ifmt = lc($in_fmt // '');
+		if ($input =~ m!\A(?:imap|nntp)s?://!) { # TODO: nntp
+			$lei->{nrd}->imap_each($input, \&imap_cb, $self);
+			next;
+		} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
+			$ifmt = lc $1;
+		}
+		if (-f $input) {
+			open my $fh, '<', $input or
+					return $lei->fail("open $input: $!");
+			PublicInbox::MboxReader->$ifmt($fh, \&mbox_cb, $self);
+		} elsif (-d _) {
+			PublicInbox::MdirReader::maildir_each_eml($input,
+							\&mdir_cb, $self);
+		} else {
+			die "BUG: $input unhandled"; # should've failed earlier
+		}
+	}
+	delete $lei->{1};
+	delete $self->{wcb}; # commit
+}
+
+sub convert_start {
+	my ($lei) = @_;
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'' => [ $lei->can('dclose'), $lei ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	my $self = $lei->{cnv};
+	$self->wq_workers_start('lei_convert', 1, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_io_do('do_convert', []);
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub call { # the main "lei convert" method
+	my ($cls, $lei, @inputs) = @_;
+	my $opt = $lei->{opt};
+	$opt->{kw} //= 1;
+	my $self = $lei->{cnv} = bless {}, $cls;
+	my $in_fmt = $opt->{'in-format'};
+	my ($nrd, @f, @d);
+	$opt->{dedupe} //= 'none';
+	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
+	$lei->{l2m} or return
+		$lei->fail("output not specified or is not a mail destination");
+	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
+	if ($opt->{stdin}) {
+		@inputs and return $lei->fail("--stdin and @inputs do not mix");
+		$lei->check_input_format(undef, 'in-format') or return;
+		$self->{0} = $lei->{0};
+	}
+	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
+	for my $input (@inputs) {
+		my $input_path = $input;
+		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+			require PublicInbox::NetReader;
+			$nrd //= PublicInbox::NetReader->new;
+			$nrd->add_url($input);
+		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
+			my $ifmt = lc $1;
+			if (($in_fmt // $ifmt) ne $ifmt) {
+				return $lei->fail(<<"");
+--in-format=$in_fmt and `$ifmt:' conflict
+
+			}
+		} elsif (-f $input) { push @f, $input }
+		elsif (-d _) { push @d, $input }
+		else { return $lei->fail("Unable to handle $input") }
+	}
+	if (@f) { $lei->check_input_format(\@f, 'in-format') or return }
+	if (@d) { # TODO: check for MH vs Maildir, here
+		require PublicInbox::MdirReader;
+	}
+	$self->{inputs} = \@inputs;
+	return convert_start($lei) if !$nrd;
+
+	if (my $err = $nrd->errors) {
+		return $lei->fail($err);
+	}
+	$nrd->{quiet} = $opt->{quiet};
+	$lei->{nrd} = $nrd;
+	require PublicInbox::LeiAuth;
+	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
+	$auth->auth_start($lei, \&convert_start, $lei);
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	$lei->lei_atfork_child;
+	my $l2m = delete $lei->{l2m};
+	$l2m->pre_augment($lei);
+	$l2m->do_augment($lei);
+	$l2m->post_augment($lei);
+	$self->{wcb} = $l2m->write_cb($lei);
+	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
+	$self->SUPER::ipc_atfork_child;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index 2114c0e8..5fec9384 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -127,7 +127,7 @@ sub prepare_dedupe {
 
 sub pause_dedupe {
 	my ($self) = @_;
-	my $skv = $self->[0];
+	my $skv = $self->[0] or return;
 	$skv->dbh_release;
 	delete($skv->{dbh}) if $skv;
 }
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index c820f0d7..3169bae6 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -51,18 +51,19 @@ sub detect_fmt ($$) {
 }
 
 sub new {
-	my ($class, $lei) = @_;
+	my ($class, $lei, $ofmt_key) = @_;
 	my $opt = $lei->{opt};
 	my $dst = $opt->{output} // '-';
 	$dst = '/dev/stdout' if $dst eq '-';
+	$ofmt_key //= 'format';
 
-	my $fmt = $opt->{'format'};
+	my $fmt = $opt->{$ofmt_key};
 	$fmt = lc($fmt) if defined $fmt;
 	if ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. Maildir:/home/user/Mail/
 		my $ofmt = lc $1;
 		$fmt //= $ofmt;
 		return $lei->fail(<<"") if $fmt ne $ofmt;
---format=$fmt and --output=$ofmt conflict
+--$ofmt_key=$fmt and --output=$ofmt conflict
 
 	}
 	$fmt //= 'json' if $dst eq '/dev/stdout';
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index e3e512be..f0adc44f 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -437,7 +437,7 @@ sub _do_augment_mbox {
 	$dedupe->pause_dedupe if $dedupe;
 }
 
-sub pre_augment { # fast (1 disk seek), runs in main daemon
+sub pre_augment { # fast (1 disk seek), runs in same process as post_augment
 	my ($self, $lei) = @_;
 	# _pre_augment_maildir, _pre_augment_mbox
 	my $m = "_pre_augment_$self->{base_type}";
@@ -451,7 +451,8 @@ sub do_augment { # slow, runs in wq worker
 	$self->$m($lei);
 }
 
-sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
+# fast (spawn compressor or mkdir), runs in same process as pre_augment
+sub post_augment {
 	my ($self, $lei, @args) = @_;
 	# _post_augment_maildir, _post_augment_mbox
 	my $m = "_post_augment_$self->{base_type}";
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index e0ff676d..5fa534f5 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -7,6 +7,7 @@
 package PublicInbox::MdirReader;
 use strict;
 use v5.10.1;
+use PublicInbox::InboxWritable qw(eml_from_path);
 
 # returns Maildir flags from a basename ('' for no flags, undef for invalid)
 sub maildir_basename_flags {
@@ -36,4 +37,29 @@ sub maildir_each_file ($$;@) {
 	}
 }
 
+my %c2kw = ('D' => 'draft', F => 'flagged', R => 'answered', S => 'seen');
+
+sub maildir_each_eml ($$;@) {
+	my ($dir, $cb, @arg) = @_;
+	$dir .= '/' unless substr($dir, -1) eq '/';
+	my $pfx = "$dir/new/";
+	if (opendir(my $dh, $pfx)) {
+		while (defined(my $bn = readdir($dh))) {
+			next if substr($bn, 0, 1) eq '.';
+			my @f = split(/:/, $bn, -1);
+			next if scalar(@f) != 1;
+			my $eml = eml_from_path($pfx.$bn) or next;
+			$cb->([], $eml, @arg);
+		}
+	}
+	$pfx = "$dir/cur/";
+	opendir my $dh, $pfx or return;
+	while (defined(my $bn = readdir($dh))) {
+		my $fl = maildir_basename_flags($bn) // next;
+		my $eml = eml_from_path($pfx.$bn) or next;
+		my @kw = sort(map { $c2kw{$_} // () } split(//, $fl));
+		$cb->(\@kw, $eml, @arg);
+	}
+}
+
 1;
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 1d053425..ad8c18d0 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -5,7 +5,8 @@
 package PublicInbox::NetReader;
 use strict;
 use v5.10.1;
-use parent qw(Exporter);
+use parent qw(Exporter PublicInbox::IPC);
+use PublicInbox::Eml;
 
 # TODO: trim this down, this is huge
 our @EXPORT = qw(uri_new uri_scheme uri_section
@@ -33,7 +34,7 @@ sub uri_section ($) {
 sub auth_anon_cb { '' }; # for Mail::IMAPClient::Authcallback
 
 sub mic_for { # mic = Mail::IMAPClient
-	my ($self, $url, $mic_args) = @_;
+	my ($self, $url, $mic_args, $lei) = @_;
 	require PublicInbox::URIimap;
 	my $uri = PublicInbox::URIimap->new($url);
 	require PublicInbox::GitCredential;
@@ -74,21 +75,26 @@ sub mic_for { # mic = Mail::IMAPClient
 	}
 	if ($cred) {
 		$cred->check_netrc unless defined $cred->{password};
-		$cred->fill; # may prompt user here
+		$cred->fill($lei); # may prompt user here
 		$mic->User($mic_arg->{User} = $cred->{username});
 		$mic->Password($mic_arg->{Password} = $cred->{password});
 	} else { # AUTH=ANONYMOUS
 		$mic->Authmechanism($mic_arg->{Authmechanism} = 'ANONYMOUS');
-		$mic->Authcallback($mic_arg->{Authcallback} = \&auth_anon_cb);
+		$mic_arg->{Authcallback} = 'auth_anon_cb';
+		$mic->Authcallback(\&auth_anon_cb);
 	}
+	my $err;
 	if ($mic->login && $mic->IsAuthenticated) {
 		# success! keep IMAPClient->new arg in case we get disconnected
 		$self->{mic_arg}->{uri_section($uri)} = $mic_arg;
 	} else {
-		warn "E: <$url> LOGIN: $@\n";
+		$err = "E: <$url> LOGIN: $@\n";
 		$mic = undef;
 	}
 	$cred->run($mic ? 'approve' : 'reject') if $cred;
+	if ($err) {
+		$lei ? $lei->fail($err) : warn($err);
+	}
 	$mic;
 }
 
@@ -139,8 +145,8 @@ E: <$url> STARTTLS requested and failed
 	$nn;
 }
 
-sub nn_for ($$$) { # nn = Net::NNTP
-	my ($self, $url, $nn_args) = @_;
+sub nn_for ($$$;$) { # nn = Net::NNTP
+	my ($self, $url, $nn_args, $lei) = @_;
 	my $uri = uri_new($url);
 	my $sec = uri_section($uri);
 	my $nntp_opt = $self->{nntp_opt}->{$sec} //= {};
@@ -170,7 +176,7 @@ sub nn_for ($$$) { # nn = Net::NNTP
 	my $nn = nn_new($nn_arg, $nntp_opt, $url);
 
 	if ($cred) {
-		$cred->fill; # may prompt user here
+		$cred->fill($lei); # may prompt user here
 		if ($nn->authinfo($u, $p)) {
 			push @{$nntp_opt->{-postconn}}, [ 'authinfo', $u, $p ];
 		} else {
@@ -240,14 +246,15 @@ sub cfg_bool ($$$) {
 }
 
 # flesh out common IMAP-specific data structures
-sub imap_common_init ($) {
-	my ($self) = @_;
+sub imap_common_init ($;$) {
+	my ($self, $lei) = @_;
+	$self->{quiet} = 1 if $lei && $lei->{opt}->{quiet};
 	eval { require PublicInbox::IMAPClient } or
 		die "Mail::IMAPClient is required for IMAP:\n$@\n";
 	eval { require PublicInbox::IMAPTracker } or
 		die "DBD::SQLite is required for IMAP\n:$@\n";
 	require PublicInbox::URIimap;
-	my $cfg = $self->{pi_cfg};
+	my $cfg = $self->{pi_cfg} // $lei->_lei_cfg;
 	my $mic_args = {}; # scheme://authority => Mail:IMAPClient arg
 	for my $url (@{$self->{imap_order}}) {
 		my $uri = PublicInbox::URIimap->new($url);
@@ -275,7 +282,8 @@ sub imap_common_init ($) {
 	my $mics = {}; # schema://authority => IMAPClient obj
 	for my $url (@{$self->{imap_order}}) {
 		my $uri = PublicInbox::URIimap->new($url);
-		$mics->{uri_section($uri)} //= mic_for($self, $url, $mic_args);
+		my $sec = uri_section($uri);
+		$mics->{$sec} //= mic_for($self, $url, $mic_args, $lei);
 	}
 	$mics;
 }
@@ -294,9 +302,140 @@ sub errors {
 	if (my $u = $self->{unsupported_url}) {
 		return "Unsupported URL(s): @$u";
 	}
+	if ($self->{imap_order}) {
+		eval { require PublicInbox::IMAPClient } or
+			die "Mail::IMAPClient is required for IMAP:\n$@\n";
+	}
 	undef;
 }
 
+my %IMAPflags2kw = (
+	'\Seen' => 'seen',
+	'\Answered' => 'answered',
+	'\Flagged' => 'flagged',
+	'\Draft' => 'draft',
+);
+
+sub _imap_do_msg ($$$$$) {
+	my ($self, $url, $uid, $raw, $flags) = @_;
+	# our target audience expects LF-only, save storage
+	$$raw =~ s/\r\n/\n/sg;
+	my $kw = [];
+	for my $f (split(/ /, $flags)) {
+		my $k = $IMAPflags2kw{$f} // next; # TODO: X-Label?
+		push @$kw, $k;
+	}
+	my ($eml_cb, @args) = @{$self->{eml_each}};
+	$eml_cb->($url, $uid, $kw, PublicInbox::Eml->new($raw), @args);
+}
+
+sub _imap_fetch_all ($$$) {
+	my ($self, $mic, $url) = @_;
+	my $uri = PublicInbox::URIimap->new($url);
+	my $sec = uri_section($uri);
+	my $mbx = $uri->mailbox;
+	$mic->Clear(1); # trim results history
+	$mic->examine($mbx) or return "E: EXAMINE $mbx ($sec) failed: $!";
+	my ($r_uidval, $r_uidnext);
+	for ($mic->Results) {
+		/^\* OK \[UIDVALIDITY ([0-9]+)\].*/ and $r_uidval = $1;
+		/^\* OK \[UIDNEXT ([0-9]+)\].*/ and $r_uidnext = $1;
+		last if $r_uidval && $r_uidnext;
+	}
+	$r_uidval //= $mic->uidvalidity($mbx) //
+		return "E: $url cannot get UIDVALIDITY";
+	$r_uidnext //= $mic->uidnext($mbx) //
+		return "E: $url cannot get UIDNEXT";
+	my $itrk = $self->{incremental} ?
+			PublicInbox::IMAPTracker->new($url) : 0;
+	my ($l_uidval, $l_uid) = $itrk ? $itrk->get_last : ();
+	$l_uidval //= $r_uidval; # first time
+	$l_uid //= 1;
+	if ($l_uidval != $r_uidval) {
+		return "E: $url UIDVALIDITY mismatch\n".
+			"E: local=$l_uidval != remote=$r_uidval";
+	}
+	my $r_uid = $r_uidnext - 1;
+	if ($l_uid != 1 && $l_uid > $r_uid) {
+		return "E: $url local UID exceeds remote ($l_uid > $r_uid)\n".
+			"E: $url strangely, UIDVALIDITY matches ($l_uidval)\n";
+	}
+	return if $l_uid >= $r_uid; # nothing to do
+
+	warn "# $url fetching UID $l_uid:$r_uid\n" unless $self->{quiet};
+	$mic->Uid(1); # the default, we hope
+	my $bs = $self->{imap_opt}->{$sec}->{batch_size} // 1;
+	my $req = $mic->imap4rev1 ? 'BODY.PEEK[]' : 'RFC822.PEEK';
+	my $key = $req;
+	$key =~ s/\.PEEK//;
+	my ($uids, $batch);
+	my $err;
+	do {
+		# I wish "UID FETCH $START:*" could work, but:
+		# 1) servers do not need to return results in any order
+		# 2) Mail::IMAPClient doesn't offer a streaming API
+		$uids = $mic->search("UID $l_uid:*") or
+			return "E: $url UID SEARCH $l_uid:* error: $!";
+		return if scalar(@$uids) == 0;
+
+		# RFC 3501 doesn't seem to indicate order of UID SEARCH
+		# responses, so sort it ourselves.  Order matters so
+		# IMAPTracker can store the newest UID.
+		@$uids = sort { $a <=> $b } @$uids;
+
+		# Did we actually get new messages?
+		return if $uids->[0] < $l_uid;
+
+		$l_uid = $uids->[-1] + 1; # for next search
+		my $last_uid;
+		my $n = $self->{max_batch};
+		while (scalar @$uids) {
+			my @batch = splice(@$uids, 0, $bs);
+			$batch = join(',', @batch);
+			local $0 = "UID:$batch $mbx $sec";
+			my $r = $mic->fetch_hash($batch, $req, 'FLAGS');
+			unless ($r) { # network error?
+				$err = "E: $url UID FETCH $batch error: $!";
+				last;
+			}
+			for my $uid (@batch) {
+				# messages get deleted, so holes appear
+				my $per_uid = delete $r->{$uid} // next;
+				my $raw = delete($per_uid->{$key}) // next;
+				_imap_do_msg($self, $url, $uid, \$raw,
+						$per_uid->{FLAGS});
+				$last_uid = $uid;
+				last if $self->{quit};
+			}
+			last if $self->{quit};
+		}
+		$itrk->update_last($r_uidval, $last_uid) if $itrk;
+	} until ($err || $self->{quit});
+	$err;
+}
+
+sub imap_each {
+	my ($self, $url, $eml_cb, @args) = @_;
+	my $uri = PublicInbox::URIimap->new($url);
+	my $sec = uri_section($uri);
+	my $mic_arg = $self->{mic_arg}->{$sec} or
+			die "BUG: no Mail::IMAPClient->new arg for $sec";
+	local $0 = $uri->mailbox." $sec";
+	my $cb_name = $mic_arg->{Authcallback};
+	if (ref($cb_name) ne 'CODE') {
+		$mic_arg->{Authcallback} = $self->can($cb_name);
+	}
+	my $mic = PublicInbox::IMAPClient->new(%$mic_arg, Debug => 0);
+	my $err;
+	if ($mic && $mic->IsConnected) {
+		local $self->{eml_each} = [ $eml_cb, @args ];
+		$err = _imap_fetch_all($self, $mic, $url);
+	} else {
+		$err = "E: not connected: $!";
+	}
+	$mic;
+}
+
 sub new { bless {}, shift };
 
 1;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index c5070cfd..3eb08e9f 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -462,10 +462,15 @@ our $lei = sub {
 sub lei (@) { $lei->(@_) }
 
 sub lei_ok (@) {
-	my $msg = ref($_[-1]) ? pop(@_) : undef;
+	my $msg = ref($_[-1]) eq 'SCALAR' ? pop(@_) : undef;
+	my $tmpdir = quotemeta(File::Spec->tmpdir);
 	# filter out anything that looks like a path name for consistent logs
-	my @msg = grep(!m!\A/!, @_);
-	ok($lei->(@_), "lei @msg". ($msg ? " ($$msg)" : ''));
+	my @msg = ref($_[0]) eq 'ARRAY' ? @{$_[0]} : @_;
+	for (@msg) {
+		s!\A([a-z0-9]+://)[^/]+/!$1\$HOST_PORT/! ||
+			s!$tmpdir\b/(?:[^/]+/)?!\$TMPDIR/!;
+	}
+	ok(lei(@_), "lei @msg". ($msg ? " ($$msg)" : '')) or diag $lei_err;
 }
 
 sub json_utf8 () {
diff --git a/t/lei-convert.t b/t/lei-convert.t
new file mode 100644
index 00000000..a319c4ad
--- /dev/null
+++ b/t/lei-convert.t
@@ -0,0 +1,36 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use Digest::SHA;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($tmpdir, $for_destroy) = tmpdir;
+my $sock = tcp_server;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
+my $host_port = tcp_host_port($sock);
+undef $sock;
+test_lei({ tmpdir => $tmpdir }, sub {
+	my $dig = Digest::SHA->new(256);
+	lei_ok('convert', '-o', "mboxrd:$tmpdir/foo.mboxrd",
+		"imap://$host_port/t.v2.0");
+	ok(-f "$tmpdir/foo.mboxrd", 'mboxrd created');
+	$dig->addfile("$tmpdir/foo.mboxrd");
+	my $foo = $dig->digest;
+	lei_ok('convert', '-o', "$tmpdir/md", "mboxrd:$tmpdir/foo.mboxrd");
+	ok(-d "$tmpdir/md", 'Maildir created');
+	lei_ok('convert', '-o', "mboxrd:$tmpdir/bar.mboxrd", "$tmpdir/md");
+	$dig->addfile("$tmpdir/bar.mboxrd");
+	my $bar = $dig->digest;
+	is($foo, $bar, 'mboxrd round-tripped through Maildir');
+	open my $in, '<', "$tmpdir/bar.mboxrd" or BAIL_OUT;
+	my $rdr = { 0 => $in, 1 => \(my $out), 2 => \$lei_err };
+	lei_ok([qw(convert --stdin -F mboxrd -o mboxrd:/dev/stdout)],
+		undef, $rdr);
+	$dig->add($out);
+	is($foo, $dig->digest, 'mboxrd round-tripped --stdin => stdout');
+});
+done_testing;
diff --git a/t/net_reader-imap.t b/t/net_reader-imap.t
new file mode 100644
index 00000000..eea8b0fd
--- /dev/null
+++ b/t/net_reader-imap.t
@@ -0,0 +1,40 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($tmpdir, $for_destroy) = tmpdir;
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $sock = tcp_server;
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT "-imapd: $?";
+my ($host, $port) = tcp_host_port $sock;
+require_ok 'PublicInbox::NetReader';
+my $nrd = PublicInbox::NetReader->new;
+$nrd->add_url(my $url = "imap://$host:$port/t.v2.0");
+is($nrd->errors, undef, 'no errors');
+$nrd->{pi_cfg} = PublicInbox::Config->new($cfg_path);
+$nrd->imap_common_init;
+$nrd->{quiet} = 1;
+my (%eml, %urls, %args, $nr, @w);
+local $SIG{__WARN__} = sub { push(@w, @_) };
+$nrd->imap_each($url, sub {
+	my ($u, $uid, $kw, $eml, $arg) = @_;
+	++$urls{$u};
+	++$args{$arg};
+	like($uid, qr/\A[0-9]+\z/, 'got digit UID '.$uid);
+	++$eml{ref($eml)};
+	++$nr;
+}, 'blah');
+is(scalar(@w), 0, 'no warnings');
+ok($nr, 'got some emails');
+is($eml{'PublicInbox::Eml'}, $nr, 'got expected Eml objects');
+is(scalar keys %eml, 1, 'only got Eml objects');
+is($urls{$url}, $nr, 'one URL expected number of times');
+is(scalar keys %urls, 1, 'only got one URL');
+is($args{blah}, $nr, 'got arg expected number of times');
+is(scalar keys %args, 1, 'only got one arg');
+
+done_testing;

^ permalink raw reply related	[relevance 18%]

* Re: [PATCH 08/11] lei convert: mail format conversion sub-command
  2021-02-17 10:07 18% ` [PATCH 08/11] lei convert: mail format conversion sub-command Eric Wong
@ 2021-02-17 10:53 71%   ` Eric Wong
  2021-02-18 11:06 69%     ` [PATCHv2 0/4] lei IMAP support take #2 Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-17 10:53 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +++ b/t/lei-convert.t

> +test_lei({ tmpdir => $tmpdir }, sub {
> +	my $dig = Digest::SHA->new(256);
> +	lei_ok('convert', '-o', "mboxrd:$tmpdir/foo.mboxrd",
> +		"imap://$host_port/t.v2.0");
> +	ok(-f "$tmpdir/foo.mboxrd", 'mboxrd created');
> +	$dig->addfile("$tmpdir/foo.mboxrd");
> +	my $foo = $dig->digest;
> +	lei_ok('convert', '-o', "$tmpdir/md", "mboxrd:$tmpdir/foo.mboxrd");
> +	ok(-d "$tmpdir/md", 'Maildir created');
> +	lei_ok('convert', '-o', "mboxrd:$tmpdir/bar.mboxrd", "$tmpdir/md");
> +	$dig->addfile("$tmpdir/bar.mboxrd");
> +	my $bar = $dig->digest;
> +	is($foo, $bar, 'mboxrd round-tripped through Maildir');

Oh dear, I've truly lost my mind :<  readdir order is totally random
and by some dumb luck this worked when I tested it.

> +	open my $in, '<', "$tmpdir/bar.mboxrd" or BAIL_OUT;
> +	my $rdr = { 0 => $in, 1 => \(my $out), 2 => \$lei_err };
> +	lei_ok([qw(convert --stdin -F mboxrd -o mboxrd:/dev/stdout)],
> +		undef, $rdr);
> +	$dig->add($out);
> +	is($foo, $dig->digest, 'mboxrd round-tripped --stdin => stdout');

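The readdir problem above can be sidestepped by comparing messages
rather than whole files.  What follows is only a minimal sketch of
that idea, not necessarily what v2 of the test does:
sorted_msg_digests is a hypothetical helper, and the two sample
messages are made up for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper (not in TestCommon): split an mboxrd buffer on its
# "From " separator lines and return the per-message digests, sorted.
# mboxrd escapes body lines starting with "From ", so /^From /m only
# matches separators.
sub sorted_msg_digests {
	my ($buf) = @_;
	my @msgs = grep { length } split(/^From [^\n]*\n/m, $buf);
	return sort(map { sha256_hex($_) } @msgs);
}

# two made-up messages, written out in both possible orders
my $sep = "From mboxrd\@z Thu Jan  1 00:00:00 1970\n";
my $m1 = $sep . "Subject: one\n\nfirst\n\n";
my $m2 = $sep . "Subject: two\n\nsecond\n\n";
my ($fwd, $rev) = ($m1 . $m2, $m2 . $m1);

my $same_file = sha256_hex($fwd) eq sha256_hex($rev);
my $same_msgs = join(',', sorted_msg_digests($fwd)) eq
		join(',', sorted_msg_digests($rev));
print 'whole-file digests match: ', ($same_file ? 'yes' : 'no'), "\n";
print 'per-message digests match: ', ($same_msgs ? 'yes' : 'no'), "\n";
```

Sorting the per-message digests makes the comparison independent of
the order in which the Maildir entries were read back.
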
^ permalink raw reply	[relevance 71%]

* Re: does "lei q" --format/-f need to exist?
  2021-02-17  4:40 71% does "lei q" --format/-f need to exist? Eric Wong
@ 2021-02-18  5:28 71% ` Kyle Meyer
  2021-02-18 12:07 71%   ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-18  5:28 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> "maildir:/path/to/dir" has been supported by public-inbox-watch
> for years, now.
>
> The following all work today:
>
> 	lei q -o mboxrd:/tmp/foo.mboxrd ...
> 	lei q -o mboxcl2:/tmp/foo.mboxcl2 ...
> 	lei q -o maildir:/tmp/foo/ ...
>
> So -f/--format seems redundant.

I find "<format>:<destination>" pretty natural/intuitive, even if
perhaps the stdout case (e.g., "mboxrd:-" or "concatjson:-") looks a bit
odd.  Dropping --format makes sense to me.
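
To make the "<format>:<destination>" convention concrete, here is a
rough sketch modeled on the prefix handling in
PublicInbox::LeiOverview->new.  split_fmt_dst is a hypothetical name
and this is not the actual lei code; in particular, mapping "-" to
/dev/stdout after stripping the prefix is an assumption made for the
example.

```perl
use strict;
use warnings;
use v5.10.1;

# Hypothetical helper, not part of lei: split "<format>:<destination>".
# Returns (format, destination); format is undef when neither a prefix
# nor an explicit format is given.
sub split_fmt_dst {
	my ($dst, $fmt) = @_;
	$fmt = lc($fmt) if defined $fmt;
	if ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. maildir:/tmp/foo/
		my $ofmt = lc $1;
		$fmt //= $ofmt;
		die "--format=$fmt and --output=$ofmt conflict\n"
			if $fmt ne $ofmt;
	}
	$dst = '/dev/stdout' if $dst eq '-'; # assumption: "-" means stdout
	($fmt, $dst);
}

for my $o (qw(mboxrd:/tmp/foo.mboxrd maildir:/tmp/foo/ concatjson:-)) {
	my ($fmt, $dst) = split_fmt_dst($o);
	printf("%-24s fmt=%s dst=%s\n", $o, $fmt // '(unset)', $dst);
}
```

With the prefix carrying the format, a separate --format switch is
only needed to detect conflicts, which is the redundancy discussed
above.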

^ permalink raw reply	[relevance 71%]

* [PATCHv2 0/4] lei IMAP support take #2
  2021-02-17 10:53 71%   ` Eric Wong
@ 2021-02-18 11:06 69%     ` Eric Wong
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
                         ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-02-18 11:06 UTC (permalink / raw)
  To: meta

The original t/lei-convert.t was bonkers and is now fixed in 1/4.
Minor changes for everything except 3/4, which AFAIK has no
changes.

Eric Wong (4):
  lei convert: mail format conversion sub-command
  lei import: add IMAP and (maildir|mbox*):$PATHNAME support
  lei: consolidate the bulk of the IPC code
  lei: check for IMAP auth errors

 MANIFEST                         |   6 ++
 lib/PublicInbox/GitCredential.pm |  18 ++--
 lib/PublicInbox/LEI.pm           |  57 +++++++++--
 lib/PublicInbox/LeiAuth.pm       |  70 +++++++++++++
 lib/PublicInbox/LeiConvert.pm    | 148 +++++++++++++++++++++++++++
 lib/PublicInbox/LeiDedupe.pm     |   2 +-
 lib/PublicInbox/LeiImport.pm     | 148 +++++++++++++++++----------
 lib/PublicInbox/LeiMirror.pm     |  19 +---
 lib/PublicInbox/LeiOverview.pm   |   7 +-
 lib/PublicInbox/LeiToMail.pm     |   5 +-
 lib/PublicInbox/MdirReader.pm    |  26 +++++
 lib/PublicInbox/NetReader.pm     | 166 ++++++++++++++++++++++++++++---
 lib/PublicInbox/TestCommon.pm    |  11 +-
 t/lei-convert.t                  |  71 +++++++++++++
 t/lei-import-imap.t              |  28 ++++++
 t/lei-import-maildir.t           |   4 +-
 t/lei_to_mail.t                  |  10 ++
 t/net_reader-imap.t              |  40 ++++++++
 xt/lei-auth-fail.t               |  20 ++++
 19 files changed, 747 insertions(+), 109 deletions(-)
 create mode 100644 lib/PublicInbox/LeiAuth.pm
 create mode 100644 lib/PublicInbox/LeiConvert.pm
 create mode 100644 t/lei-convert.t
 create mode 100644 t/lei-import-imap.t
 create mode 100644 t/net_reader-imap.t
 create mode 100644 xt/lei-auth-fail.t


^ permalink raw reply	[relevance 69%]

* [PATCHv2 1/4] lei convert: mail format conversion sub-command
  2021-02-18 11:06 69%     ` [PATCHv2 0/4] lei IMAP support take #2 Eric Wong
@ 2021-02-18 11:06 18%       ` Eric Wong
  2021-02-18 20:22 68%         ` [PATCHv3 0/4] lei convert IMAP support Eric Wong
                           ` (4 more replies)
  2021-02-18 11:06 37%       ` [PATCHv2 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support Eric Wong
                         ` (2 subsequent siblings)
  3 siblings, 5 replies; 200+ results
From: Eric Wong @ 2021-02-18 11:06 UTC (permalink / raw)
  To: meta

This will make testing IMAP support for other commands easier, as
it doesn't write to lei/store at all.  Like the pager and MUA,
"git credential" is always spawned by script/lei (and not
lei-daemon) so it has a controlling terminal for password
prompts.

v2: fix missing requires, correct test ordering
---
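Not part of the patch: a minimal sketch of how both readers in this
series arrive at the same lowercase keywords.  The two tables are
copied from NetReader.pm and MdirReader.pm below; the helper names
imap_flags_to_kw and maildir_flags_to_kw are hypothetical, and the
sort on the IMAP side is added here only so the two outputs compare
equal.

```perl
use strict;
use warnings;
use v5.10.1;

my %IMAPflags2kw = (
	'\Seen' => 'seen',
	'\Answered' => 'answered',
	'\Flagged' => 'flagged',
	'\Draft' => 'draft',
);
my %c2kw = (D => 'draft', F => 'flagged', R => 'answered', S => 'seen');

# $flags is a space-separated IMAP FLAGS value; unknown flags are dropped
sub imap_flags_to_kw {
	my ($flags) = @_;
	[ sort(map { $IMAPflags2kw{$_} // () } split(/ /, $flags)) ];
}

# $fl is the Maildir info after ":2,"; unknown letters are dropped
sub maildir_flags_to_kw {
	my ($fl) = @_;
	[ sort(map { $c2kw{$_} // () } split(//, $fl)) ];
}

my $from_imap = imap_flags_to_kw('\Seen \Answered \Recent');
my $from_mdir = maildir_flags_to_kw('RST');
say "imap:    @$from_imap";
say "maildir: @$from_mdir";
```

Both lines list "answered seen" since \Recent and T have no keyword
in either table.
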
 MANIFEST                         |   4 +
 lib/PublicInbox/GitCredential.pm |  18 ++--
 lib/PublicInbox/LEI.pm           |  38 +++++--
 lib/PublicInbox/LeiAuth.pm       |  80 +++++++++++++++
 lib/PublicInbox/LeiConvert.pm    | 160 ++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiDedupe.pm     |   2 +-
 lib/PublicInbox/LeiOverview.pm   |   7 +-
 lib/PublicInbox/LeiToMail.pm     |   5 +-
 lib/PublicInbox/MdirReader.pm    |  26 +++++
 lib/PublicInbox/NetReader.pm     | 163 ++++++++++++++++++++++++++++---
 lib/PublicInbox/TestCommon.pm    |  11 ++-
 t/lei-convert.t                  |  71 ++++++++++++++
 t/net_reader-imap.t              |  40 ++++++++
 13 files changed, 588 insertions(+), 37 deletions(-)
 create mode 100644 lib/PublicInbox/LeiAuth.pm
 create mode 100644 lib/PublicInbox/LeiConvert.pm
 create mode 100644 t/lei-convert.t
 create mode 100644 t/net_reader-imap.t

diff --git a/MANIFEST b/MANIFEST
index 82068900..4f146771 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -178,6 +178,8 @@ lib/PublicInbox/InputPipe.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
+lib/PublicInbox/LeiAuth.pm
+lib/PublicInbox/LeiConvert.pm
 lib/PublicInbox/LeiCurl.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
@@ -360,6 +362,7 @@ t/init.t
 t/ipc.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei-convert.t
 t/lei-daemon.t
 t/lei-externals.t
 t/lei-import-maildir.t
@@ -388,6 +391,7 @@ t/msg_iter.t
 t/msgmap.t
 t/msgtime.t
 t/multi-mid.t
+t/net_reader-imap.t
 t/nntp.t
 t/nntpd-tls.t
 t/nntpd-v2.t
diff --git a/lib/PublicInbox/GitCredential.pm b/lib/PublicInbox/GitCredential.pm
index 9e193029..2d81817c 100644
--- a/lib/PublicInbox/GitCredential.pm
+++ b/lib/PublicInbox/GitCredential.pm
@@ -4,11 +4,17 @@ package PublicInbox::GitCredential;
 use strict;
 use PublicInbox::Spawn qw(popen_rd);
 
-sub run ($$) {
-	my ($self, $op) = @_;
-	my ($in_r, $in_w);
+sub run ($$;$) {
+	my ($self, $op, $lei) = @_;
+	my ($in_r, $in_w, $out_r);
+	my $cmd = [ qw(git credential), $op ];
 	pipe($in_r, $in_w) or die "pipe: $!";
-	my $out_r = popen_rd([qw(git credential), $op], undef, { 0 => $in_r });
+	if ($lei && !$lei->{oneshot}) { # we'll die if disconnected:
+		pipe($out_r, my $out_w) or die "pipe: $!";
+		$lei->send_exec_cmd([ $in_r, $out_w ], $cmd, {});
+	} else {
+		$out_r = popen_rd($cmd, undef, { 0 => $in_r });
+	}
 	close $in_r or die "close in_r: $!";
 
 	my $out = '';
@@ -41,8 +47,8 @@ sub check_netrc ($) {
 }
 
 sub fill {
-	my ($self) = @_;
-	my $out_r = run($self, 'fill');
+	my ($self, $lei) = @_;
+	my $out_r = run($self, 'fill', $lei);
 	while (<$out_r>) {
 		chomp;
 		return if $_ eq '';
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1fa9f751..1e4c36d0 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -173,7 +173,11 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
 	format|f=s kw|keywords|flags!),
 	],
-
+'convert' => [ 'LOCATION...|--stdin',
+	'one-time conversion from URL or filesystem to another format',
+	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s quiet|q
+	kw|keywords|flags!),
+	],
 'config' => [ '[...]', sub {
 		'git-config(1) wrapper for '._config_path($_[0]);
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
@@ -320,7 +324,7 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
-my @WQ_KEYS = qw(lxs l2m imp mrr); # internal workers
+my @WQ_KEYS = qw(lxs l2m imp mrr cnv auth); # internal workers
 
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
@@ -391,18 +395,19 @@ sub fail ($$;$) {
 	undef;
 }
 
-sub check_input_format ($;$) {
-	my ($self, $files) = @_;
-	my $fmt = $self->{opt}->{'format'};
+sub check_input_format ($;$$) {
+	my ($self, $files, $opt_key) = @_;
+	$opt_key //= 'format';
+	my $fmt = $self->{opt}->{$opt_key};
 	if (!$fmt) {
 		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
-		return fail($self, "--format unset for $err");
+		return fail($self, "--$opt_key unset for $err");
 	}
 	return 1 if $fmt eq 'eml';
 	# XXX: should this handle {gz,bz2,xz}? that's currently in LeiToMail
 	require PublicInbox::MboxReader;
 	PublicInbox::MboxReader->can($fmt) ||
-				fail($self, "--format=$fmt unrecognized");
+				fail($self, "--$opt_key=$fmt unrecognized");
 }
 
 sub out ($;@) {
@@ -445,6 +450,7 @@ sub lei_atfork_child {
 	} else {
 		delete $self->{0};
 	}
+	delete @$self{qw(cnv)};
 	for (delete @$self{qw(3 sock old_1 au_done)}) {
 		close($_) if defined($_);
 	}
@@ -626,6 +632,11 @@ sub lei_import {
 	PublicInbox::LeiImport->call(@_);
 }
 
+sub lei_convert {
+	require PublicInbox::LeiConvert;
+	PublicInbox::LeiConvert->call(@_);
+}
+
 sub lei_init {
 	my ($self, $dir) = @_;
 	my $cfg = _lei_cfg($self, 1);
@@ -770,6 +781,13 @@ sub start_mua {
 	delete $self->{opt}->{verbose};
 }
 
+sub send_exec_cmd { # tell script/lei to execute a command
+	my ($self, $io, $cmd, $env) = @_;
+	my $sock = $self->{sock} // die 'lei client gone';
+	my $fds = [ map { fileno($_) } @$io ];
+	$send_cmd->($sock, $fds, exec_buf($cmd, $env), MSG_EOR);
+}
+
 sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
 	my ($self) = @_;
 	my $alerts = $self->{opt}->{alert} // return;
@@ -813,10 +831,9 @@ sub start_pager {
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
 	my $pgr = [ undef, @$rdr{1, 2} ];
-	if (my $sock = $self->{sock}) { # lei(1) process runs it
+	if ($self->{sock}) { # lei(1) process runs it
 		delete @$new_env{keys %$env}; # only set iff unset
-		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
-		$send_cmd->($sock, $fds, exec_buf([$pager], $new_env), MSG_EOR);
+		send_exec_cmd($self, [ @$rdr{0..2} ], [$pager], $new_env);
 	} elsif ($self->{oneshot}) {
 		my $cmd = [$pager];
 		$self->{"pid.$self.$$"}->{spawn($cmd, $new_env, $rdr)} = $cmd;
@@ -920,6 +937,7 @@ sub event_step {
 
 sub event_step_init {
 	my ($self) = @_;
+	return if $self->{-event_init_done}++;
 	if (my $sock = $self->{sock}) { # using DS->EventLoop
 		$self->SUPER::new($sock, EPOLLIN|EPOLLET);
 	}
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
new file mode 100644
index 00000000..88310874
--- /dev/null
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -0,0 +1,80 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Authentication worker for anything that needs auth for read/write IMAP
+# (eventually for read-only NNTP access)
+package PublicInbox::LeiAuth;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::PktOp qw(pkt_do);
+use PublicInbox::NetReader;
+
+sub nrd_merge {
+	my ($lei, $nrd_new) = @_;
+	if ($lei->{pkt_op_p}) { # from lei_convert worker
+		pkt_do($lei->{pkt_op_p}, 'nrd_merge', $nrd_new);
+	} else { # single lei-daemon consumer
+		my $self = $lei->{auth} or return; # client disconnected
+		my $nrd = $self->{nrd};
+		%$nrd = (%$nrd, %$nrd_new);
+	}
+}
+
+sub do_auth { # called via wq_io_do
+	my ($self) = @_;
+	my ($lei, $nrd) = @$self{qw(lei nrd)};
+	$nrd->imap_common_init($lei);
+	nrd_merge($lei, $nrd); # tell lei-daemon updated auth info
+}
+
+sub do_finish_auth { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($self, $lei, $post_auth_cb, @args) = @$arg;
+	$? ? $lei->dclose : $post_auth_cb->(@args);
+}
+
+sub auth_eof {
+	my ($lei, $post_auth_cb, @args) = @_;
+	my $self = delete $lei->{auth} or return;
+	$self->wq_wait_old(\&do_finish_auth, $lei, $post_auth_cb, @args);
+}
+
+sub auth_start {
+	my ($self, $lei, $post_auth_cb, @args) = @_;
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'nrd_merge' => [ \&nrd_merge, $lei ],
+		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$self->wq_workers_start('lei_auth', 1, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_io_do('do_auth', []);
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	# prevent {sock} from being closed in lei_atfork_child:
+	my $s = delete $self->{lei}->{sock};
+	delete $self->{lei}->{auth}; # drop circular ref
+	$self->{lei}->lei_atfork_child;
+	$self->{lei}->{sock} = $s if $s;
+	$self->SUPER::ipc_atfork_child;
+}
+
+sub new {
+	my ($cls, $nrd) = @_;
+	bless { nrd => $nrd }, $cls;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
new file mode 100644
index 00000000..78fd5e17
--- /dev/null
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -0,0 +1,160 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# front-end for the "lei convert" sub-command
+package PublicInbox::LeiConvert;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::Eml;
+use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::PktOp;
+use PublicInbox::LeiStore;
+use PublicInbox::LeiOverview;
+
+sub mbox_cb {
+	my ($eml, $self) = @_;
+	my @kw = PublicInbox::LeiStore::mbox_keywords($eml);
+	$eml->header_set($_) for qw(Status X-Status);
+	$self->{wcb}->(undef, { kw => \@kw }, $eml);
+}
+
+sub imap_cb { # ->imap_each
+	my ($url, $uid, $kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
+sub mdir_cb {
+	my ($kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
+sub do_convert { # via wq_io_do
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	my $in_fmt = $lei->{opt}->{'in-format'};
+	if (my $stdin = delete $self->{0}) {
+		PublicInbox::MboxReader->$in_fmt($stdin, \&mbox_cb, $self);
+	}
+	for my $input (@{$self->{inputs}}) {
+		my $ifmt = lc($in_fmt // '');
+		if ($input =~ m!\A(?:imap|nntp)s?://!) { # TODO: nntp
+			$lei->{nrd}->imap_each($input, \&imap_cb, $self);
+			next;
+		} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
+			$ifmt = lc $1;
+		}
+		if (-f $input) {
+			open my $fh, '<', $input or
+					return $lei->fail("open $input: $!");
+			PublicInbox::MboxReader->$ifmt($fh, \&mbox_cb, $self);
+		} elsif (-d _) {
+			PublicInbox::MdirReader::maildir_each_eml($input,
+							\&mdir_cb, $self);
+		} else {
+			die "BUG: $input unhandled"; # should've failed earlier
+		}
+	}
+	delete $lei->{1};
+	delete $self->{wcb}; # commit
+}
+
+sub convert_start {
+	my ($lei) = @_;
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'' => [ $lei->can('dclose'), $lei ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	my $self = $lei->{cnv};
+	$self->wq_workers_start('lei_convert', 1, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_io_do('do_convert', []);
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub call { # the main "lei convert" method
+	my ($cls, $lei, @inputs) = @_;
+	my $opt = $lei->{opt};
+	$opt->{kw} //= 1;
+	my $self = $lei->{cnv} = bless {}, $cls;
+	my $in_fmt = $opt->{'in-format'};
+	my ($nrd, @f, @d);
+	$opt->{dedupe} //= 'none';
+	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
+	$lei->{l2m} or return
+		$lei->fail("output not specified or is not a mail destination");
+	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
+	if ($opt->{stdin}) {
+		@inputs and return $lei->fail("--stdin and @inputs do not mix");
+		$lei->check_input_format(undef, 'in-format') or return;
+		$self->{0} = $lei->{0};
+	}
+	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
+	for my $input (@inputs) {
+		my $input_path = $input;
+		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+			require PublicInbox::NetReader;
+			$nrd //= PublicInbox::NetReader->new;
+			$nrd->add_url($input);
+		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
+			my $ifmt = lc $1;
+			if (($in_fmt // $ifmt) ne $ifmt) {
+				return $lei->fail(<<"");
+--in-format=$in_fmt and `$ifmt:' conflict
+
+			}
+			if (-f $input_path) {
+				require PublicInbox::MboxReader;
+				PublicInbox::MboxReader->can($ifmt) or return
+					$lei->fail("$ifmt not supported");
+			} elsif (-d _) {
+				require PublicInbox::MdirReader;
+				$ifmt eq 'maildir' or return
+					$lei->fail("$ifmt not supported");
+			} else {
+				return $lei->fail("Unable to handle $input");
+			}
+		} elsif (-f $input) { push @f, $input }
+		elsif (-d _) { push @d, $input }
+		else { return $lei->fail("Unable to handle $input") }
+	}
+	if (@f) { $lei->check_input_format(\@f, 'in-format') or return }
+	if (@d) { # TODO: check for MH vs Maildir, here
+		require PublicInbox::MdirReader;
+	}
+	$self->{inputs} = \@inputs;
+	return convert_start($lei) if !$nrd;
+
+	if (my $err = $nrd->errors) {
+		return $lei->fail($err);
+	}
+	$nrd->{quiet} = $opt->{quiet};
+	$lei->{nrd} = $nrd;
+	require PublicInbox::LeiAuth;
+	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
+	$auth->auth_start($lei, \&convert_start, $lei);
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	$lei->lei_atfork_child;
+	my $l2m = delete $lei->{l2m};
+	$l2m->pre_augment($lei);
+	$l2m->do_augment($lei);
+	$l2m->post_augment($lei);
+	$self->{wcb} = $l2m->write_cb($lei);
+	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
+	$self->SUPER::ipc_atfork_child;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index 2114c0e8..5fec9384 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -127,7 +127,7 @@ sub prepare_dedupe {
 
 sub pause_dedupe {
 	my ($self) = @_;
-	my $skv = $self->[0];
+	my $skv = $self->[0] or return;
 	$skv->dbh_release;
 	delete($skv->{dbh}) if $skv;
 }
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index c820f0d7..3169bae6 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -51,18 +51,19 @@ sub detect_fmt ($$) {
 }
 
 sub new {
-	my ($class, $lei) = @_;
+	my ($class, $lei, $ofmt_key) = @_;
 	my $opt = $lei->{opt};
 	my $dst = $opt->{output} // '-';
 	$dst = '/dev/stdout' if $dst eq '-';
+	$ofmt_key //= 'format';
 
-	my $fmt = $opt->{'format'};
+	my $fmt = $opt->{$ofmt_key};
 	$fmt = lc($fmt) if defined $fmt;
 	if ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. Maildir:/home/user/Mail/
 		my $ofmt = lc $1;
 		$fmt //= $ofmt;
 		return $lei->fail(<<"") if $fmt ne $ofmt;
---format=$fmt and --output=$ofmt conflict
+--$ofmt_key=$fmt and --output=$ofmt conflict
 
 	}
 	$fmt //= 'json' if $dst eq '/dev/stdout';
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index e3e512be..f0adc44f 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -437,7 +437,7 @@ sub _do_augment_mbox {
 	$dedupe->pause_dedupe if $dedupe;
 }
 
-sub pre_augment { # fast (1 disk seek), runs in main daemon
+sub pre_augment { # fast (1 disk seek), runs in same process as post_augment
 	my ($self, $lei) = @_;
 	# _pre_augment_maildir, _pre_augment_mbox
 	my $m = "_pre_augment_$self->{base_type}";
@@ -451,7 +451,8 @@ sub do_augment { # slow, runs in wq worker
 	$self->$m($lei);
 }
 
-sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
+# fast (spawn compressor or mkdir), runs in same process as pre_augment
+sub post_augment {
 	my ($self, $lei, @args) = @_;
 	# _post_augment_maildir, _post_augment_mbox
 	my $m = "_post_augment_$self->{base_type}";
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index e0ff676d..5fa534f5 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -7,6 +7,7 @@
 package PublicInbox::MdirReader;
 use strict;
 use v5.10.1;
+use PublicInbox::InboxWritable qw(eml_from_path);
 
 # returns Maildir flags from a basename ('' for no flags, undef for invalid)
 sub maildir_basename_flags {
@@ -36,4 +37,29 @@ sub maildir_each_file ($$;@) {
 	}
 }
 
+my %c2kw = ('D' => 'draft', F => 'flagged', R => 'answered', S => 'seen');
+
+sub maildir_each_eml ($$;@) {
+	my ($dir, $cb, @arg) = @_;
+	$dir .= '/' unless substr($dir, -1) eq '/';
+	my $pfx = "$dir/new/";
+	if (opendir(my $dh, $pfx)) {
+		while (defined(my $bn = readdir($dh))) {
+			next if substr($bn, 0, 1) eq '.';
+			my @f = split(/:/, $bn, -1);
+			next if scalar(@f) != 1;
+			my $eml = eml_from_path($pfx.$bn) or next;
+			$cb->([], $eml, @arg);
+		}
+	}
+	$pfx = "$dir/cur/";
+	opendir my $dh, $pfx or return;
+	while (defined(my $bn = readdir($dh))) {
+		my $fl = maildir_basename_flags($bn) // next;
+		my $eml = eml_from_path($pfx.$bn) or next;
+		my @kw = sort(map { $c2kw{$_} // () } split(//, $fl));
+		$cb->(\@kw, $eml, @arg);
+	}
+}
+
 1;
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 1d053425..ad8c18d0 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -5,7 +5,8 @@
 package PublicInbox::NetReader;
 use strict;
 use v5.10.1;
-use parent qw(Exporter);
+use parent qw(Exporter PublicInbox::IPC);
+use PublicInbox::Eml;
 
 # TODO: trim this down, this is huge
 our @EXPORT = qw(uri_new uri_scheme uri_section
@@ -33,7 +34,7 @@ sub uri_section ($) {
 sub auth_anon_cb { '' }; # for Mail::IMAPClient::Authcallback
 
 sub mic_for { # mic = Mail::IMAPClient
-	my ($self, $url, $mic_args) = @_;
+	my ($self, $url, $mic_args, $lei) = @_;
 	require PublicInbox::URIimap;
 	my $uri = PublicInbox::URIimap->new($url);
 	require PublicInbox::GitCredential;
@@ -74,21 +75,26 @@ sub mic_for { # mic = Mail::IMAPClient
 	}
 	if ($cred) {
 		$cred->check_netrc unless defined $cred->{password};
-		$cred->fill; # may prompt user here
+		$cred->fill($lei); # may prompt user here
 		$mic->User($mic_arg->{User} = $cred->{username});
 		$mic->Password($mic_arg->{Password} = $cred->{password});
 	} else { # AUTH=ANONYMOUS
 		$mic->Authmechanism($mic_arg->{Authmechanism} = 'ANONYMOUS');
-		$mic->Authcallback($mic_arg->{Authcallback} = \&auth_anon_cb);
+		$mic_arg->{Authcallback} = 'auth_anon_cb';
+		$mic->Authcallback(\&auth_anon_cb);
 	}
+	my $err;
 	if ($mic->login && $mic->IsAuthenticated) {
 		# success! keep IMAPClient->new arg in case we get disconnected
 		$self->{mic_arg}->{uri_section($uri)} = $mic_arg;
 	} else {
-		warn "E: <$url> LOGIN: $@\n";
+		$err = "E: <$url> LOGIN: $@\n";
 		$mic = undef;
 	}
 	$cred->run($mic ? 'approve' : 'reject') if $cred;
+	if ($err) {
+		$lei ? $lei->fail($err) : warn($err);
+	}
 	$mic;
 }
 
@@ -139,8 +145,8 @@ E: <$url> STARTTLS requested and failed
 	$nn;
 }
 
-sub nn_for ($$$) { # nn = Net::NNTP
-	my ($self, $url, $nn_args) = @_;
+sub nn_for ($$$;$) { # nn = Net::NNTP
+	my ($self, $url, $nn_args, $lei) = @_;
 	my $uri = uri_new($url);
 	my $sec = uri_section($uri);
 	my $nntp_opt = $self->{nntp_opt}->{$sec} //= {};
@@ -170,7 +176,7 @@ sub nn_for ($$$) { # nn = Net::NNTP
 	my $nn = nn_new($nn_arg, $nntp_opt, $url);
 
 	if ($cred) {
-		$cred->fill; # may prompt user here
+		$cred->fill($lei); # may prompt user here
 		if ($nn->authinfo($u, $p)) {
 			push @{$nntp_opt->{-postconn}}, [ 'authinfo', $u, $p ];
 		} else {
@@ -240,14 +246,15 @@ sub cfg_bool ($$$) {
 }
 
 # flesh out common IMAP-specific data structures
-sub imap_common_init ($) {
-	my ($self) = @_;
+sub imap_common_init ($;$) {
+	my ($self, $lei) = @_;
+	$self->{quiet} = 1 if $lei && $lei->{opt}->{quiet};
 	eval { require PublicInbox::IMAPClient } or
 		die "Mail::IMAPClient is required for IMAP:\n$@\n";
 	eval { require PublicInbox::IMAPTracker } or
 		die "DBD::SQLite is required for IMAP\n:$@\n";
 	require PublicInbox::URIimap;
-	my $cfg = $self->{pi_cfg};
+	my $cfg = $self->{pi_cfg} // $lei->_lei_cfg;
 	my $mic_args = {}; # scheme://authority => Mail:IMAPClient arg
 	for my $url (@{$self->{imap_order}}) {
 		my $uri = PublicInbox::URIimap->new($url);
@@ -275,7 +282,8 @@ sub imap_common_init ($) {
 	my $mics = {}; # schema://authority => IMAPClient obj
 	for my $url (@{$self->{imap_order}}) {
 		my $uri = PublicInbox::URIimap->new($url);
-		$mics->{uri_section($uri)} //= mic_for($self, $url, $mic_args);
+		my $sec = uri_section($uri);
+		$mics->{$sec} //= mic_for($self, $url, $mic_args, $lei);
 	}
 	$mics;
 }
@@ -294,9 +302,140 @@ sub errors {
 	if (my $u = $self->{unsupported_url}) {
 		return "Unsupported URL(s): @$u";
 	}
+	if ($self->{imap_order}) {
+		eval { require PublicInbox::IMAPClient } or
+			die "Mail::IMAPClient is required for IMAP:\n$@\n";
+	}
 	undef;
 }
 
+my %IMAPflags2kw = (
+	'\Seen' => 'seen',
+	'\Answered' => 'answered',
+	'\Flagged' => 'flagged',
+	'\Draft' => 'draft',
+);
+
+sub _imap_do_msg ($$$$$) {
+	my ($self, $url, $uid, $raw, $flags) = @_;
+	# our target audience expects LF-only, save storage
+	$$raw =~ s/\r\n/\n/sg;
+	my $kw = [];
+	for my $f (split(/ /, $flags)) {
+		my $k = $IMAPflags2kw{$f} // next; # TODO: X-Label?
+		push @$kw, $k;
+	}
+	my ($eml_cb, @args) = @{$self->{eml_each}};
+	$eml_cb->($url, $uid, $kw, PublicInbox::Eml->new($raw), @args);
+}
+
+sub _imap_fetch_all ($$$) {
+	my ($self, $mic, $url) = @_;
+	my $uri = PublicInbox::URIimap->new($url);
+	my $sec = uri_section($uri);
+	my $mbx = $uri->mailbox;
+	$mic->Clear(1); # trim results history
+	$mic->examine($mbx) or return "E: EXAMINE $mbx ($sec) failed: $!";
+	my ($r_uidval, $r_uidnext);
+	for ($mic->Results) {
+		/^\* OK \[UIDVALIDITY ([0-9]+)\].*/ and $r_uidval = $1;
+		/^\* OK \[UIDNEXT ([0-9]+)\].*/ and $r_uidnext = $1;
+		last if $r_uidval && $r_uidnext;
+	}
+	$r_uidval //= $mic->uidvalidity($mbx) //
+		return "E: $url cannot get UIDVALIDITY";
+	$r_uidnext //= $mic->uidnext($mbx) //
+		return "E: $url cannot get UIDNEXT";
+	my $itrk = $self->{incremental} ?
+			PublicInbox::IMAPTracker->new($url) : 0;
+	my ($l_uidval, $l_uid) = $itrk ? $itrk->get_last : ();
+	$l_uidval //= $r_uidval; # first time
+	$l_uid //= 1;
+	if ($l_uidval != $r_uidval) {
+		return "E: $url UIDVALIDITY mismatch\n".
+			"E: local=$l_uidval != remote=$r_uidval";
+	}
+	my $r_uid = $r_uidnext - 1;
+	if ($l_uid != 1 && $l_uid > $r_uid) {
+		return "E: $url local UID exceeds remote ($l_uid > $r_uid)\n".
+			"E: $url strangely, UIDVALIDITY matches ($l_uidval)\n";
+	}
+	return if $l_uid >= $r_uid; # nothing to do
+
+	warn "# $url fetching UID $l_uid:$r_uid\n" unless $self->{quiet};
+	$mic->Uid(1); # the default, we hope
+	my $bs = $self->{imap_opt}->{$sec}->{batch_size} // 1;
+	my $req = $mic->imap4rev1 ? 'BODY.PEEK[]' : 'RFC822.PEEK';
+	my $key = $req;
+	$key =~ s/\.PEEK//;
+	my ($uids, $batch);
+	my $err;
+	do {
+		# I wish "UID FETCH $START:*" could work, but:
+		# 1) servers do not need to return results in any order
+		# 2) Mail::IMAPClient doesn't offer a streaming API
+		$uids = $mic->search("UID $l_uid:*") or
+			return "E: $url UID SEARCH $l_uid:* error: $!";
+		return if scalar(@$uids) == 0;
+
+		# RFC 3501 doesn't seem to indicate order of UID SEARCH
+		# responses, so sort it ourselves.  Order matters so
+		# IMAPTracker can store the newest UID.
+		@$uids = sort { $a <=> $b } @$uids;
+
+		# Did we actually get new messages?
+		return if $uids->[0] < $l_uid;
+
+		$l_uid = $uids->[-1] + 1; # for next search
+		my $last_uid;
+		my $n = $self->{max_batch};
+		while (scalar @$uids) {
+			my @batch = splice(@$uids, 0, $bs);
+			$batch = join(',', @batch);
+			local $0 = "UID:$batch $mbx $sec";
+			my $r = $mic->fetch_hash($batch, $req, 'FLAGS');
+			unless ($r) { # network error?
+				$err = "E: $url UID FETCH $batch error: $!";
+				last;
+			}
+			for my $uid (@batch) {
+				# messages get deleted, so holes appear
+				my $per_uid = delete $r->{$uid} // next;
+				my $raw = delete($per_uid->{$key}) // next;
+				_imap_do_msg($self, $url, $uid, \$raw,
+						$per_uid->{FLAGS});
+				$last_uid = $uid;
+				last if $self->{quit};
+			}
+			last if $self->{quit};
+		}
+		$itrk->update_last($r_uidval, $last_uid) if $itrk;
+	} until ($err || $self->{quit});
+	$err;
+}
+
+sub imap_each {
+	my ($self, $url, $eml_cb, @args) = @_;
+	my $uri = PublicInbox::URIimap->new($url);
+	my $sec = uri_section($uri);
+	my $mic_arg = $self->{mic_arg}->{$sec} or
+			die "BUG: no Mail::IMAPClient->new arg for $sec";
+	local $0 = $uri->mailbox." $sec";
+	my $cb_name = $mic_arg->{Authcallback};
+	if (ref($cb_name) ne 'CODE') {
+		$mic_arg->{Authcallback} = $self->can($cb_name);
+	}
+	my $mic = PublicInbox::IMAPClient->new(%$mic_arg, Debug => 0);
+	my $err;
+	if ($mic && $mic->IsConnected) {
+		local $self->{eml_each} = [ $eml_cb, @args ];
+		$err = _imap_fetch_all($self, $mic, $url);
+	} else {
+		$err = "E: not connected: $!";
+	}
+	$mic;
+}
+
 sub new { bless {}, shift };
 
 1;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index c5070cfd..3eb08e9f 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -462,10 +462,15 @@ our $lei = sub {
 sub lei (@) { $lei->(@_) }
 
 sub lei_ok (@) {
-	my $msg = ref($_[-1]) ? pop(@_) : undef;
+	my $msg = ref($_[-1]) eq 'SCALAR' ? pop(@_) : undef;
+	my $tmpdir = quotemeta(File::Spec->tmpdir);
 	# filter out anything that looks like a path name for consistent logs
-	my @msg = grep(!m!\A/!, @_);
-	ok($lei->(@_), "lei @msg". ($msg ? " ($$msg)" : ''));
+	my @msg = ref($_[0]) eq 'ARRAY' ? @{$_[0]} : @_;
+	for (@msg) {
+		s!\A([a-z0-9]+://)[^/]+/!$1\$HOST_PORT/! ||
+			s!$tmpdir\b/(?:[^/]+/)?!\$TMPDIR/!;
+	}
+	ok(lei(@_), "lei @msg". ($msg ? " ($$msg)" : '')) or diag $lei_err;
 }
 
 sub json_utf8 () {
diff --git a/t/lei-convert.t b/t/lei-convert.t
new file mode 100644
index 00000000..f58a0a80
--- /dev/null
+++ b/t/lei-convert.t
@@ -0,0 +1,71 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use PublicInbox::MboxReader;
+use PublicInbox::MdirReader;
+use PublicInbox::NetReader;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($tmpdir, $for_destroy) = tmpdir;
+my $sock = tcp_server;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
+my $host_port = tcp_host_port($sock);
+undef $sock;
+test_lei({ tmpdir => $tmpdir }, sub {
+	my $d = $ENV{HOME};
+	my $dig = Digest::SHA->new(256);
+	lei_ok('convert', '-o', "mboxrd:$d/foo.mboxrd",
+		"imap://$host_port/t.v2.0");
+	ok(-f "$d/foo.mboxrd", 'mboxrd created');
+	my (@mboxrd, @mboxcl2);
+	open my $fh, '<', "$d/foo.mboxrd" or BAIL_OUT $!;
+	PublicInbox::MboxReader->mboxrd($fh, sub { push @mboxrd, shift });
+	ok(scalar(@mboxrd) > 1, 'got multiple messages');
+
+	lei_ok('convert', '-o', "mboxcl2:$d/cl2", "mboxrd:$d/foo.mboxrd");
+	ok(-s "$d/cl2", 'mboxcl2 non-empty') or diag $lei_err;
+	open $fh, '<', "$d/cl2" or BAIL_OUT $!;
+	PublicInbox::MboxReader->mboxcl2($fh, sub {
+		my $eml = shift;
+		$eml->header_set($_) for (qw(Content-Length Lines));
+		push @mboxcl2, $eml;
+	});
+	is_deeply(\@mboxcl2, \@mboxrd, 'mboxrd and mboxcl2 have same mail');
+
+	lei_ok('convert', '-o', "$d/md", "mboxrd:$d/foo.mboxrd");
+	ok(-d "$d/md", 'Maildir created');
+	my @md;
+	PublicInbox::MdirReader::maildir_each_eml("$d/md", sub {
+		push @md, $_[1];
+	});
+	is(scalar(@md), scalar(@mboxrd), 'got expected emails in Maildir');
+	@md = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @md;
+	@mboxrd = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @mboxrd;
+	my @rd_nostatus = map {
+		my $eml = PublicInbox::Eml->new(\($_->as_string));
+		$eml->header_set('Status');
+		$eml;
+	} @mboxrd;
+	is_deeply(\@md, \@rd_nostatus, 'Maildir output matches mboxrd');
+
+	my @bar;
+	lei_ok('convert', '-o', "mboxrd:$d/bar.mboxrd", "$d/md");
+	open $fh, '<', "$d/bar.mboxrd" or BAIL_OUT $!;
+	PublicInbox::MboxReader->mboxrd($fh, sub { push @bar, shift });
+	@bar = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @bar;
+	is_deeply(\@mboxrd, \@bar,
+			'mboxrd round-tripped through Maildir w/ flags');
+
+	open my $in, '<', "$d/foo.mboxrd" or BAIL_OUT;
+	my $rdr = { 0 => $in, 1 => \(my $out), 2 => \$lei_err };
+	lei_ok([qw(convert --stdin -F mboxrd -o mboxrd:/dev/stdout)],
+		undef, $rdr);
+	open $fh, '<', "$d/foo.mboxrd" or BAIL_OUT;
+	my $exp = do { local $/; <$fh> };
+	is($out, $exp, 'stdin => stdout');
+});
+done_testing;
diff --git a/t/net_reader-imap.t b/t/net_reader-imap.t
new file mode 100644
index 00000000..eea8b0fd
--- /dev/null
+++ b/t/net_reader-imap.t
@@ -0,0 +1,40 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($tmpdir, $for_destroy) = tmpdir;
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $sock = tcp_server;
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT "-imapd: $?";
+my ($host, $port) = tcp_host_port $sock;
+require_ok 'PublicInbox::NetReader';
+my $nrd = PublicInbox::NetReader->new;
+$nrd->add_url(my $url = "imap://$host:$port/t.v2.0");
+is($nrd->errors, undef, 'no errors');
+$nrd->{pi_cfg} = PublicInbox::Config->new($cfg_path);
+$nrd->imap_common_init;
+$nrd->{quiet} = 1;
+my (%eml, %urls, %args, $nr, @w);
+local $SIG{__WARN__} = sub { push(@w, @_) };
+$nrd->imap_each($url, sub {
+	my ($u, $uid, $kw, $eml, $arg) = @_;
+	++$urls{$u};
+	++$args{$arg};
+	like($uid, qr/\A[0-9]+\z/, 'got digit UID '.$uid);
+	++$eml{ref($eml)};
+	++$nr;
+}, 'blah');
+is(scalar(@w), 0, 'no warnings');
+ok($nr, 'got some emails');
+is($eml{'PublicInbox::Eml'}, $nr, 'got expected Eml objects');
+is(scalar keys %eml, 1, 'only got Eml objects');
+is($urls{$url}, $nr, 'one URL expected number of times');
+is(scalar keys %urls, 1, 'only got one URL');
+is($args{blah}, $nr, 'got arg expected number of times');
+is(scalar keys %args, 1, 'only got one arg');
+
+done_testing;


* [PATCHv2 4/4] lei: check for IMAP auth errors
  2021-02-18 11:06 69%     ` [PATCHv2 0/4] lei IMAP support take #2 Eric Wong
                         ` (2 preceding siblings ...)
  2021-02-18 11:06 47%       ` [PATCH (resend) 3/4] lei: consolidate the bulk of the IPC code Eric Wong
@ 2021-02-18 11:06 61%       ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 11:06 UTC (permalink / raw)
  To: meta

We need to ensure authentication failures and error codes get
propagated to the parent process(es) properly.

v2: update MANIFEST
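
A minimal sketch of the masking step that the NetReader.pm hunk in
this patch performs, assuming a standalone helper (redact_password is
a hypothetical name, not part of public-inbox).  \Q..\E quotes regex
metacharacters so the password is matched literally:

```perl
use strict;
use warnings;

# Hypothetical helper: mask every literal occurrence of $password in $err.
sub redact_password {
	my ($err, $password) = @_;
	# nothing to mask if no password was supplied
	return $err unless defined($password) && length($password);
	# \Q..\E disables regex metacharacters inside the password
	$err =~ s/\Q$password\E/*******/g;
	$err;
}

print redact_password(
	"E: <imaps://u:p.w+d\@example.com/> LOGIN: p.w+d rejected\n",
	'p.w+d');
```

Without \Q..\E, a password such as "p.w+d" would be treated as a
pattern and could mask unrelated text or fail to match itself.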
---
 MANIFEST                     |  1 +
 lib/PublicInbox/LeiAuth.pm   |  1 +
 lib/PublicInbox/NetReader.pm |  3 +++
 xt/lei-auth-fail.t           | 20 ++++++++++++++++++++
 4 files changed, 25 insertions(+)
 create mode 100644 xt/lei-auth-fail.t

diff --git a/MANIFEST b/MANIFEST
index 19f73356..3d9ad616 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -466,6 +466,7 @@ xt/git_async_cmp.t
 xt/httpd-async-stream.t
 xt/imapd-mbsync-oimap.t
 xt/imapd-validate.t
+xt/lei-auth-fail.t
 xt/lei-sigpipe.t
 xt/mem-imapd-tls.t
 xt/mem-msgview.t
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 7210af99..7acb9900 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -42,6 +42,7 @@ sub auth_eof {
 
 sub auth_start {
 	my ($self, $lei, $post_auth_cb, @args) = @_;
+	$lei->_lei_cfg(1); # workers may need to read config
 	my $op = $lei->workers_start($self, 'auth', 1, {
 		'nrd_merge' => [ \&nrd_merge, $lei ],
 		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index ad8c18d0..61ea538b 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -89,6 +89,9 @@ sub mic_for { # mic = Mail::IMAPClient
 		$self->{mic_arg}->{uri_section($uri)} = $mic_arg;
 	} else {
 		$err = "E: <$url> LOGIN: $@\n";
+		if ($cred && defined($cred->{password})) {
+			$err =~ s/\Q$cred->{password}\E/*******/g;
+		}
 		$mic = undef;
 	}
 	$cred->run($mic ? 'approve' : 'reject') if $cred;
diff --git a/xt/lei-auth-fail.t b/xt/lei-auth-fail.t
new file mode 100644
index 00000000..5308d0f9
--- /dev/null
+++ b/xt/lei-auth-fail.t
@@ -0,0 +1,20 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+
+# TODO: mock IMAP server which fails at authentication so we don't
+# have to make external connections to test this:
+my $imap_fail = $ENV{TEST_LEI_IMAP_FAIL_URL} //
+	'imaps://AzureDiamond:Hunter2@public-inbox.org:994/INBOX';
+test_lei(sub {
+	ok(!lei(qw(convert -o mboxrd:/dev/stdout), $imap_fail),
+		'IMAP auth failure on convert');
+	like($lei_err, qr!\bE:.*?imaps://.*?!sm, 'error shown');
+	unlike($lei_err, qr!Hunter2!s, 'password not shown');
+	is($lei_out, '', 'nothing output');
+	ok(!lei(qw(import), $imap_fail), 'IMAP auth failure on import');
+	like($lei_err, qr!\bE:.*?imaps://.*?!sm, 'error shown');
+	unlike($lei_err, qr!Hunter2!s, 'password not shown');
+});
+done_testing;


* [PATCH (resend) 3/4] lei: consolidate the bulk of the IPC code
  2021-02-18 11:06 69%     ` [PATCHv2 0/4] lei IMAP support take #2 Eric Wong
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
  2021-02-18 11:06 37%       ` [PATCHv2 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support Eric Wong
@ 2021-02-18 11:06 47%       ` Eric Wong
  2021-02-18 11:06 61%       ` [PATCHv2 4/4] lei: check for IMAP auth errors Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 11:06 UTC (permalink / raw)
  To: meta

The backends for "lei add-external --mirror", "lei convert", and
"lei import" all share a similar pattern for spawning background
workers.  Hoist out the common parts to slim down our code base
a bit.

The LeiXSearch and LeiToMail workers for "lei q" remain the
odd ducks due to their deep pipelining and parallelization.
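
The shape of the hoisted helper can be sketched as a hash merge,
assuming stand-in handlers (merge_ops and the code refs here are
hypothetical; the real workers_start in the LEI.pm hunk also sets up
the PktOp pair and starts the workers).  Caller-supplied keys come
last so they override the shared defaults:

```perl
use strict;
use warnings;

# Hypothetical: combine default op handlers with per-command ones.
sub merge_ops {
	my ($defaults, $ops) = @_;
	# later keys win, so per-command handlers override the defaults
	return { %$defaults, %{$ops // {}} };
}

my $defaults = {
	'!' => [ sub { 'fail' } ],
	'|' => [ sub { 'sigpipe' } ],
};
# each command only needs to supply its own EOF ('') handler
my $ops = merge_ops($defaults, { '' => [ sub { 'eof' } ] });
print join(',', map { $_ eq '' ? '(eof)' : $_ } sort keys %$ops), "\n";
```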
---
 lib/PublicInbox/LEI.pm        | 19 +++++++++++++++++++
 lib/PublicInbox/LeiAuth.pm    | 17 +++--------------
 lib/PublicInbox/LeiConvert.pm | 22 +++++-----------------
 lib/PublicInbox/LeiImport.pm  | 19 ++++---------------
 lib/PublicInbox/LeiMirror.pm  | 19 ++++---------------
 5 files changed, 35 insertions(+), 61 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1e4c36d0..0b4bc20e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -468,6 +468,25 @@ sub lei_atfork_child {
 	$current_lei = $persist ? undef : $self; # for SIG{__WARN__}
 }
 
+sub workers_start {
+	my ($lei, $wq, $ident, $jobs, $ops) = @_;
+	$ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		%$ops
+	};
+	require PublicInbox::PktOp;
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$wq->wq_workers_start($ident, $jobs, $lei->oldset, { lei => $lei });
+	delete $lei->{pkt_op_p};
+	my $op = delete $lei->{pkt_op_c};
+	$lei->event_step_init;
+	# oneshot needs $op, daemon-mode uses DS->EventLoop to handle $op
+	$lei->{oneshot} ? $op : undef;
+}
+
 sub _help {
 	require PublicInbox::LeiHelp;
 	PublicInbox::LeiHelp::call($_[0], $_[1], \%CMD, \%OPTDESC);
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 88310874..7210af99 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -42,24 +42,13 @@ sub auth_eof {
 
 sub auth_start {
 	my ($self, $lei, $post_auth_cb, @args) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'|' => [ $lei->can('sigpipe_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
+	my $op = $lei->workers_start($self, 'auth', 1, {
 		'nrd_merge' => [ \&nrd_merge, $lei ],
 		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
-	$self->wq_workers_start('lei_auth', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	});
 	$self->wq_io_do('do_auth', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub ipc_atfork_child {
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 78fd5e17..ba375772 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::PktOp;
 use PublicInbox::LeiStore;
 use PublicInbox::LeiOverview;
 
@@ -59,26 +58,15 @@ sub do_convert { # via wq_do
 	delete $self->{wcb}; # commit
 }
 
-sub convert_start {
+sub convert_start { # LeiAuth->auth_start callback
 	my ($lei) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'|' => [ $lei->can('sigpipe_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ $lei->can('dclose'), $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{cnv};
-	$self->wq_workers_start('lei_convert', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_convert', 1, {
+		'' => [ $lei->can('dclose'), $lei ]
+	});
 	$self->wq_io_do('do_convert', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub call { # the main "lei convert" method
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 62a2a412..68cab12c 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::PktOp;
 
 sub _import_eml { # MboxReader callback
 	my ($eml, $sto, $set_kw) = @_;
@@ -31,13 +30,6 @@ sub import_done { # EOF callback for main daemon
 
 sub import_start {
 	my ($lei) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ \&import_done, $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{imp};
 	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $nrd = $lei->{nrd}) {
@@ -46,18 +38,15 @@ sub import_start {
 		my $nproc = $self->detect_nproc;
 		$j = $nproc if $j > $nproc;
 	}
-	$self->wq_workers_start('lei_import', $j, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_import', $j, {
+		'' => [ \&import_done, $lei ],
+	});
 	$self->wq_io_do('import_stdin', []) if $self->{0};
 	for my $input (@{$self->{inputs}}) {
 		$self->wq_io_do('import_path_url', [], $input);
 	}
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub call { # the main "lei import" method
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index c5153148..f8ca1ee5 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
 use PublicInbox::Spawn qw(popen_rd spawn);
-use PublicInbox::PktOp;
 
 sub do_finish_mirror { # dwaitpid callback
 	my ($arg, $pid) = @_;
@@ -279,22 +278,12 @@ sub start {
 	require PublicInbox::Inbox;
 	require PublicInbox::Admin;
 	require PublicInbox::InboxWritable;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ \&mirror_done, $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
-	$self->wq_workers_start('lei_mirror', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_mirror', 1, {
+		'' => [ \&mirror_done, $lei ]
+	});
 	$self->wq_io_do('do_mirror', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub ipc_atfork_child {


* [PATCHv2 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support
  2021-02-18 11:06 69%     ` [PATCHv2 0/4] lei IMAP support take #2 Eric Wong
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
@ 2021-02-18 11:06 37%       ` Eric Wong
  2021-02-18 11:06 47%       ` [PATCH (resend) 3/4] lei: consolidate the bulk of the IPC code Eric Wong
  2021-02-18 11:06 61%       ` [PATCHv2 4/4] lei: check for IMAP auth errors Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 11:06 UTC (permalink / raw)
  To: meta

This makes "lei import" more similar to "lei convert" and
allows importing from disparate sources simultaneously.

We'll also fix some ->child_error usage errors and make
the style of the code more similar to the "lei convert"
code.

v2: fix missing requires
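
A minimal sketch of how a "(maildir|mbox*):$PATHNAME" argument can be
split, assuming a standalone helper (split_input and its 'url' return
value are hypothetical; the regexps mirror those in LeiConvert.pm from
patch 1/4).  Network URLs are checked first so that "imap://..." is
not mistaken for a format prefix:

```perl
use strict;
use warnings;

# Hypothetical: returns (format, location); format is undef if no prefix.
sub split_input {
	my ($input) = @_;
	# URLs are handled by the network reader, not as FORMAT:PATH
	return ('url', $input) if $input =~ m!\A(?:imap|nntp)s?://!i;
	# strip a leading "FORMAT:" and normalize it to lowercase
	if ($input =~ s/\A([a-z0-9]+)://is) {
		return (lc($1), $input);
	}
	(undef, $input);
}

for my $in (qw(Maildir:/home/user/Mail/ mboxrd:/tmp/x
		imaps://example.com/INBOX /tmp/plain)) {
	my ($fmt, $loc) = split_input($in);
	printf "%s => %s %s\n", $in, $fmt // '(none)', $loc;
}
```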
---
 MANIFEST                     |   1 +
 lib/PublicInbox/LeiImport.pm | 129 ++++++++++++++++++++++++-----------
 t/lei-import-imap.t          |  28 ++++++++
 t/lei-import-maildir.t       |   4 +-
 t/lei_to_mail.t              |  10 +++
 5 files changed, 130 insertions(+), 42 deletions(-)
 create mode 100644 t/lei-import-imap.t

diff --git a/MANIFEST b/MANIFEST
index 4f146771..19f73356 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -365,6 +365,7 @@ t/kqnotify.t
 t/lei-convert.t
 t/lei-daemon.t
 t/lei-externals.t
+t/lei-import-imap.t
 t/lei-import-maildir.t
 t/lei-import.t
 t/lei-mirror.t
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 32f3a467..62a2a412 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -29,7 +29,7 @@ sub import_done { # EOF callback for main daemon
 	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
-sub do_import {
+sub import_start {
 	my ($lei) = @_;
 	my $ops = {
 		'!' => [ $lei->can('fail_handler'), $lei ],
@@ -39,7 +39,7 @@ sub do_import {
 	};
 	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{imp};
-	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{argv}}) || 1;
+	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $nrd = $lei->{nrd}) {
 		# $j = $nrd->net_concurrency($j); TODO
 	} else {
@@ -50,8 +50,8 @@ sub do_import {
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
 	$self->wq_io_do('import_stdin', []) if $self->{0};
-	for my $x (@{$self->{argv}}) {
-		$self->wq_io_do('import_path_url', [], $x);
+	for my $input (@{$self->{inputs}}) {
+		$self->wq_io_do('import_path_url', [], $input);
 	}
 	$self->wq_close(1);
 	$lei->event_step_init; # wait for shutdowns
@@ -61,60 +61,91 @@ sub do_import {
 }
 
 sub call { # the main "lei import" method
-	my ($cls, $lei, @argv) = @_;
+	my ($cls, $lei, @inputs) = @_;
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
+	my ($nrd, @f, @d);
 	$lei->{opt}->{kw} //= 1;
-	my $self = $lei->{imp} = bless { argv => \@argv }, $cls;
+	my $self = $lei->{imp} = bless { inputs => \@inputs }, $cls;
 	if ($lei->{opt}->{stdin}) {
-		@argv and return
-			$lei->fail("--stdin and locations (@argv) do not mix");
+		@inputs and return $lei->fail("--stdin and @inputs do not mix");
 		$lei->check_input_format or return;
 		$self->{0} = $lei->{0};
-	} else {
-		my @f;
-		for my $x (@argv) {
-			if (-f $x) { push @f, $x }
-			elsif (-d _) { require PublicInbox::MdirReader }
-			else {
-				require PublicInbox::NetReader;
-				$lei->{nrd} //= PublicInbox::NetReader->new;
-				$lei->{nrd}->add_url($x);
+	}
+
+	# TODO: do we need --format for non-stdin?
+	my $fmt = $lei->{opt}->{'format'};
+	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
+	for my $input (@inputs) {
+		my $input_path = $input;
+		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+			require PublicInbox::NetReader;
+			$nrd //= PublicInbox::NetReader->new;
+			$nrd->add_url($input);
+		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
+			my $ifmt = lc $1;
+			if (($fmt // $ifmt) ne $ifmt) {
+				return $lei->fail(<<"");
+--format=$fmt and `$ifmt:' conflict
+
 			}
-		}
-		if (@f) { $lei->check_input_format(\@f) or return }
-		if ($lei->{nrd} && (my @err = $lei->{nrd}->errors)) {
-			return $lei->fail(@err);
-		}
+			if (-f $input_path) {
+				require PublicInbox::MboxReader;
+				PublicInbox::MboxReader->can($ifmt) or return
+					$lei->fail("$ifmt not supported");
+			} elsif (-d _) {
+				require PublicInbox::MdirReader;
+				$ifmt eq 'maildir' or return
+					$lei->fail("$ifmt not supported");
+			} else {
+				return $lei->fail("Unable to handle $input");
+			}
+		} elsif (-f $input) { push @f, $input
+		} elsif (-d _) { push @d, $input
+		} else { return $lei->fail("Unable to handle $input") }
 	}
-	do_import($lei);
+	if (@f) { $lei->check_input_format(\@f) or return }
+	if (@d) { # TODO: check for MH vs Maildir, here
+		require PublicInbox::MdirReader;
+	}
+	$self->{inputs} = \@inputs;
+	return import_start($lei) if !$nrd;
+
+	if (my $err = $nrd->errors) {
+		return $lei->fail($err);
+	}
+	$nrd->{quiet} = $lei->{opt}->{quiet};
+	$lei->{nrd} = $nrd;
+	require PublicInbox::LeiAuth;
+	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
+	$auth->auth_start($lei, \&import_start, $lei);
 }
 
 sub ipc_atfork_child {
 	my ($self) = @_;
+	delete $self->{lei}->{imp}; # drop circular ref
 	$self->{lei}->lei_atfork_child;
 	$self->SUPER::ipc_atfork_child;
 }
 
 sub _import_fh {
-	my ($lei, $fh, $x) = @_;
+	my ($lei, $fh, $input, $ifmt) = @_;
 	my $set_kw = $lei->{opt}->{kw};
-	my $fmt = $lei->{opt}->{'format'};
 	eval {
-		if ($fmt eq 'eml') {
+		if ($ifmt eq 'eml') {
 			my $buf = do { local $/; <$fh> } //
-				return $lei->child_error(1 >> 8, <<"");
-error reading $x: $!
+				return $lei->child_error(1 << 8, <<"");
+error reading $input: $!
 
 			my $eml = PublicInbox::Eml->new(\$buf);
 			_import_eml($eml, $lei->{sto}, $set_kw);
 		} else { # some mbox (->can already checked in call);
-			my $cb = PublicInbox::MboxReader->can($fmt) //
-				die "BUG: bad fmt=$fmt";
+			my $cb = PublicInbox::MboxReader->can($ifmt) //
+				die "BUG: bad fmt=$ifmt";
 			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
 		}
 	};
-	$lei->child_error(1 >> 8, "<stdin>: $@") if $@;
+	$lei->child_error(1 << 8, "<stdin>: $@") if $@;
 }
 
 sub _import_maildir { # maildir_each_file cb
@@ -122,27 +153,45 @@ sub _import_maildir { # maildir_each_file cb
 	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
 }
 
+sub _import_imap { # imap_each cb
+	my ($url, $uid, $kw, $eml, $sto, $set_kw) = @_;
+	warn "$url $uid";
+	$sto->ipc_do('set_eml', $eml, $set_kw ? @$kw : ());
+}
+
 sub import_path_url {
-	my ($self, $x) = @_;
+	my ($self, $input) = @_;
 	my $lei = $self->{lei};
+	my $ifmt = lc($lei->{opt}->{'format'} // '');
 	# TODO auto-detect?
-	if (-f $x) {
-		open my $fh, '<', $x or return $lei->child_error(1 >> 8, <<"");
-unable to open $x: $!
+	if ($input =~ m!\A(imap|nntp)s?://!i) {
+		$lei->{nrd}->imap_each($input, \&_import_imap, $lei->{sto},
+					$lei->{opt}->{kw});
+		return;
+	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
+		$ifmt = lc $1;
+	}
+	if (-f $input) {
+		open my $fh, '<', $input or return $lei->child_error(1 << 8, <<"");
+unable to open $input: $!
 
-		_import_fh($lei, $fh, $x);
-	} elsif (-d _ && (-d "$x/cur" || -d "$x/new")) {
-		PublicInbox::MdirReader::maildir_each_file($x,
+		_import_fh($lei, $fh, $input, $ifmt);
+	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
+		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
+$input appears to be a maildir, not $ifmt
+EOM
+		PublicInbox::MdirReader::maildir_each_file($input,
 					\&_import_maildir,
 					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
-		$lei->fail("$x unsupported (TODO)");
+		$lei->fail("$input unsupported (TODO)");
 	}
 }
 
 sub import_stdin {
 	my ($self) = @_;
-	_import_fh($self->{lei}, $self->{0}, '<stdin>');
+	my $lei = $self->{lei};
+	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'format'});
 }
 
 1;
diff --git a/t/lei-import-imap.t b/t/lei-import-imap.t
new file mode 100644
index 00000000..ee308723
--- /dev/null
+++ b/t/lei-import-imap.t
@@ -0,0 +1,28 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my ($tmpdir, $for_destroy) = tmpdir;
+my $sock = tcp_server;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
+my $host_port = tcp_host_port($sock);
+undef $sock;
+test_lei({ tmpdir => $tmpdir }, sub {
+	lei_ok(qw(q bytes:1..));
+	my $out = json_utf8->decode($lei_out);
+	is_deeply($out, [ undef ], 'nothing imported, yet');
+	lei_ok('import', "imap://$host_port/t.v2.0");
+	lei_ok(qw(q bytes:1..));
+	$out = json_utf8->decode($lei_out);
+	ok(scalar(@$out) > 1, 'got imported messages');
+	is(pop @$out, undef, 'trailing JSON null element was null');
+	my %r;
+	for (@$out) { $r{ref($_)}++ }
+	is_deeply(\%r, { 'HASH' => scalar(@$out) }, 'all hashes');
+});
+done_testing;
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index 5842e19e..d2b059ad 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -23,8 +23,8 @@ test_lei(sub {
 	is_deeply($r2, $res, 'idempotent import');
 
 	rename("$md/cur/x:2,S", "$md/cur/x:2,SR") or BAIL_OUT "rename: $!";
-	ok($lei->(qw(import), $md), 'import Maildir after +answered');
-	ok($lei->(qw(q -d none s:boolean)), 'lei q after +answered');
+	lei_ok('import', "maildir:$md", \'import Maildir after +answered');
+	lei_ok(qw(q -d none s:boolean), \'lei q after +answered');
 	$res = json_utf8->decode($lei_out);
 	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
 	is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index 6a571660..72b90700 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -139,6 +139,16 @@ test_lei(sub {
 	is($res->[1], undef, 'only one result');
 });
 
+test_lei(sub {
+	lei_ok('import', "$mbox:$fn", \'imported mbox:/path') or diag $lei_err;
+	lei_ok(qw(q s:x), \'lei q works') or diag $lei_err;
+	my $res = json_utf8->decode($lei_out);
+	my $x = $res->[0];
+	is($x->{'s'}, 'x', 'subject imported') or diag $lei_out;
+	is_deeply($x->{'kw'}, ['seen'], 'kw imported') or diag $lei_out;
+	is($res->[1], undef, 'only one result');
+});
+
 for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 	my $zsfx2cmd = PublicInbox::LeiToMail->can('zsfx2cmd');
 	SKIP: {

* Re: does "lei q" --format/-f need to exist?
  2021-02-18  5:28 71% ` Kyle Meyer
@ 2021-02-18 12:07 71%   ` Eric Wong
  2021-02-19  3:10 71%     ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-18 12:07 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > "maildir:/path/to/dir" has been supported by public-inbox-watch
> > for years, now.
> >
> > The following all work today:
> >
> > 	lei q -o mboxrd:/tmp/foo.mboxrd ...
> > 	lei q -o mboxcl2:/tmp/foo.mboxcl2 ...
> > 	lei q -o maildir:/tmp/foo/ ...
> >
> > So -f/--format seems redundant.
> 
> I find "<format>:<destination>" pretty natural/intuitive, even if
> perhaps the stdout case (e.g., "mboxrd:-" or "concatjson:-") looks a bit
> odd.  Dropping --format makes sense to me.

How about we just drop --format from the documentation, for now?
(or at least stop recommending it when using with -o)

The stdout case might be a reason to keep it for "lei q",
especially since stdout is the default output:

# this defaults to stdout, looks reasonable:
lei q -f concatjson SEARCH_TERMS...

# this does the same thing, but is more difficult to type and
# looks strange:
lei q -o concatjson:- SEARCH_TERMS

# more readable, but more typing:
lei q -o concatjson:/dev/stdout SEARCH_TERMS
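
To make the equivalence of the three forms above concrete, here is a small
sketch of how -f and -o could collapse into one (format, destination)
pair.  It is loosely modeled on what LeiOverview->new does, but
resolve_output is a hypothetical name, and normalizing "-" a second time
after stripping the prefix is my assumption about where that happens:

```perl
use strict;
use warnings;

# Hypothetical helper: resolve --format/--output into (format, destination)
sub resolve_output {
	my (%opt) = @_;
	my $dst = $opt{output} // '-'; # stdout is the default output
	my $fmt = $opt{format};
	$fmt = lc($fmt) if defined $fmt;
	if ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. maildir:/tmp/foo/
		my $ofmt = lc $1;
		$fmt //= $ofmt;
		die "--format=$fmt and --output=$ofmt conflict\n"
			if $fmt ne $ofmt;
	}
	$dst = '/dev/stdout' if $dst eq '-'; # assumption: also after prefix
	$fmt //= 'json' if $dst eq '/dev/stdout';
	($fmt, $dst);
}

for my $opt ({ format => 'concatjson' },
		{ output => 'concatjson:-' },
		{ output => 'concatjson:/dev/stdout' }) {
	my ($fmt, $dst) = resolve_output(%$opt);
	print "$fmt => $dst\n"; # same pair for all three forms
}
```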

* [PATCH] lei: completion: bash: generalize nospace usage
@ 2021-02-18 12:27 71% Eric Wong
  2021-02-25 10:33 71% ` better "compopt -o nospace" ideas? [was: lei: completion: bash: generalize nospace usage] Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-18 12:27 UTC (permalink / raw)
  To: meta

We'll be completing more options with ':', '//' and '=' in the
future, so make it easier to disable trailing spaces on
completions.
---
 contrib/completion/lei-completion.bash | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/contrib/completion/lei-completion.bash b/contrib/completion/lei-completion.bash
index 619805fb..2c28d44a 100644
--- a/contrib/completion/lei-completion.bash
+++ b/contrib/completion/lei-completion.bash
@@ -4,14 +4,12 @@
 # preliminary bash completion support for lei (Local Email Interface)
 # Needs a lot of work, see `lei__complete' in lib/PublicInbox::LEI.pm
 _lei() {
-	case ${COMP_WORDS[@]} in
-	*' add-external h'* | *' --mirror h'*)
-		compopt -o nospace
-		;;
+	local wordlist="$(lei _complete ${COMP_WORDS[@]})"
+	case $wordlist in
+	*':'* | *'='* | '//'*) compopt -o nospace ;;
 	*) compopt +o nospace ;; # the default
 	esac
-	COMPREPLY=($(compgen -W "$(lei _complete ${COMP_WORDS[@]})" \
-			-- "${COMP_WORDS[COMP_CWORD]}"))
+	COMPREPLY=($(compgen -W "$wordlist" -- "${COMP_WORDS[COMP_CWORD]}"))
 	return 0
 }
 complete -o default -o bashdefault -F _lei lei

* [PATCHv3 0/4] lei convert IMAP support
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
@ 2021-02-18 20:22 68%         ` Eric Wong
  2021-02-18 20:22 18%         ` [PATCHv3 1/4] lei convert: mail format conversion sub-command Eric Wong
                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 20:22 UTC (permalink / raw)
  To: meta

Fixed to set up ->_lei_cfg at LeiAuth->auth_start in PATCH 1/4
instead of 4/4.  This fixes failures on my FreeBSD 11.x VM
where 1/4 alone was failing (I never caught this on Debian 10.x).

Eric Wong (4):
  lei convert: mail format conversion sub-command
  lei import: add IMAP and (maildir|mbox*):$PATHNAME support
  lei: consolidate the bulk of the IPC code
  lei: check for IMAP auth errors

 MANIFEST                         |   6 ++
 lib/PublicInbox/GitCredential.pm |  18 ++--
 lib/PublicInbox/LEI.pm           |  57 +++++++++--
 lib/PublicInbox/LeiAuth.pm       |  70 +++++++++++++
 lib/PublicInbox/LeiConvert.pm    | 148 +++++++++++++++++++++++++++
 lib/PublicInbox/LeiDedupe.pm     |   2 +-
 lib/PublicInbox/LeiImport.pm     | 148 +++++++++++++++++----------
 lib/PublicInbox/LeiMirror.pm     |  19 +---
 lib/PublicInbox/LeiOverview.pm   |   7 +-
 lib/PublicInbox/LeiToMail.pm     |   5 +-
 lib/PublicInbox/MdirReader.pm    |  26 +++++
 lib/PublicInbox/NetReader.pm     | 166 ++++++++++++++++++++++++++++---
 lib/PublicInbox/TestCommon.pm    |  11 +-
 t/lei-convert.t                  |  71 +++++++++++++
 t/lei-import-imap.t              |  28 ++++++
 t/lei-import-maildir.t           |   4 +-
 t/lei_to_mail.t                  |  10 ++
 t/net_reader-imap.t              |  40 ++++++++
 xt/lei-auth-fail.t               |  20 ++++
 19 files changed, 747 insertions(+), 109 deletions(-)
 create mode 100644 lib/PublicInbox/LeiAuth.pm
 create mode 100644 lib/PublicInbox/LeiConvert.pm
 create mode 100644 t/lei-convert.t
 create mode 100644 t/lei-import-imap.t
 create mode 100644 t/net_reader-imap.t
 create mode 100644 xt/lei-auth-fail.t


* [PATCHv3 1/4] lei convert: mail format conversion sub-command
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
  2021-02-18 20:22 68%         ` [PATCHv3 0/4] lei convert IMAP support Eric Wong
@ 2021-02-18 20:22 18%         ` Eric Wong
  2021-02-18 20:22 37%         ` [PATCHv3 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support Eric Wong
                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 20:22 UTC (permalink / raw)
  To: meta

This will make testing IMAP support for other commands easier, as
it doesn't write to lei/store at all.  Like the pager and MUA,
"git credential" is always spawned by script/lei (and not
lei-daemon) so it has a controlling terminal for password
prompts.
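
For readers unfamiliar with the "git credential" side of this: the helper
prints key=value lines, and GitCredential::fill reads them until a blank
line.  Below is a rough, standalone sketch of only that reading step.
parse_credential is a hypothetical name, and the input is canned so that
nothing is spawned and no terminal is needed:

```perl
use strict;
use warnings;

# Hypothetical parser for "git credential fill" output:
# key=value lines, ended by a blank line or EOF
sub parse_credential {
	my ($fh) = @_;
	my %cred;
	while (my $line = <$fh>) {
		chomp $line;
		last if $line eq '';
		# limit of 2 so that '=' inside a value survives
		my ($k, $v) = split(/=/, $line, 2);
		die "unexpected line: $line\n" unless defined $v;
		$cred{$k} = $v;
	}
	\%cred;
}

# canned output standing in for the pipe from the spawned command
my $canned = "protocol=imaps\nhost=example.com\n" .
		"username=u\npassword=p=ss\n\n";
open my $fh, '<', \$canned or die "open: $!";
my $cred = parse_credential($fh);
print "$_=$cred->{$_}\n" for sort keys %$cred;
```

In the real flow the password prompt is the reason the command has to run
under script/lei: lei-daemon has no controlling terminal to prompt on.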

v2: fix missing requires, correct test ordering
v3: ensure config exists for IMAP auth
---
 MANIFEST                         |   4 +
 lib/PublicInbox/GitCredential.pm |  18 ++--
 lib/PublicInbox/LEI.pm           |  38 +++++--
 lib/PublicInbox/LeiAuth.pm       |  81 +++++++++++++++
 lib/PublicInbox/LeiConvert.pm    | 160 ++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiDedupe.pm     |   2 +-
 lib/PublicInbox/LeiOverview.pm   |   7 +-
 lib/PublicInbox/LeiToMail.pm     |   5 +-
 lib/PublicInbox/MdirReader.pm    |  26 +++++
 lib/PublicInbox/NetReader.pm     | 163 ++++++++++++++++++++++++++++---
 lib/PublicInbox/TestCommon.pm    |  11 ++-
 t/lei-convert.t                  |  71 ++++++++++++++
 t/net_reader-imap.t              |  40 ++++++++
 13 files changed, 589 insertions(+), 37 deletions(-)
 create mode 100644 lib/PublicInbox/LeiAuth.pm
 create mode 100644 lib/PublicInbox/LeiConvert.pm
 create mode 100644 t/lei-convert.t
 create mode 100644 t/net_reader-imap.t

diff --git a/MANIFEST b/MANIFEST
index 82068900..4f146771 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -178,6 +178,8 @@ lib/PublicInbox/InputPipe.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
+lib/PublicInbox/LeiAuth.pm
+lib/PublicInbox/LeiConvert.pm
 lib/PublicInbox/LeiCurl.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
@@ -360,6 +362,7 @@ t/init.t
 t/ipc.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei-convert.t
 t/lei-daemon.t
 t/lei-externals.t
 t/lei-import-maildir.t
@@ -388,6 +391,7 @@ t/msg_iter.t
 t/msgmap.t
 t/msgtime.t
 t/multi-mid.t
+t/net_reader-imap.t
 t/nntp.t
 t/nntpd-tls.t
 t/nntpd-v2.t
diff --git a/lib/PublicInbox/GitCredential.pm b/lib/PublicInbox/GitCredential.pm
index 9e193029..2d81817c 100644
--- a/lib/PublicInbox/GitCredential.pm
+++ b/lib/PublicInbox/GitCredential.pm
@@ -4,11 +4,17 @@ package PublicInbox::GitCredential;
 use strict;
 use PublicInbox::Spawn qw(popen_rd);
 
-sub run ($$) {
-	my ($self, $op) = @_;
-	my ($in_r, $in_w);
+sub run ($$;$) {
+	my ($self, $op, $lei) = @_;
+	my ($in_r, $in_w, $out_r);
+	my $cmd = [ qw(git credential), $op ];
 	pipe($in_r, $in_w) or die "pipe: $!";
-	my $out_r = popen_rd([qw(git credential), $op], undef, { 0 => $in_r });
+	if ($lei && !$lei->{oneshot}) { # we'll die if disconnected:
+		pipe($out_r, my $out_w) or die "pipe: $!";
+		$lei->send_exec_cmd([ $in_r, $out_w ], $cmd, {});
+	} else {
+		$out_r = popen_rd($cmd, undef, { 0 => $in_r });
+	}
 	close $in_r or die "close in_r: $!";
 
 	my $out = '';
@@ -41,8 +47,8 @@ sub check_netrc ($) {
 }
 
 sub fill {
-	my ($self) = @_;
-	my $out_r = run($self, 'fill');
+	my ($self, $lei) = @_;
+	my $out_r = run($self, 'fill', $lei);
 	while (<$out_r>) {
 		chomp;
 		return if $_ eq '';
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1fa9f751..1e4c36d0 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -173,7 +173,11 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
 	format|f=s kw|keywords|flags!),
 	],
-
+'convert' => [ 'LOCATION...|--stdin',
+	'one-time conversion from URL or filesystem to another format',
+	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s quiet|q
+	kw|keywords|flags!),
+	],
 'config' => [ '[...]', sub {
 		'git-config(1) wrapper for '._config_path($_[0]);
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
@@ -320,7 +324,7 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
-my @WQ_KEYS = qw(lxs l2m imp mrr); # internal workers
+my @WQ_KEYS = qw(lxs l2m imp mrr cnv auth); # internal workers
 
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
@@ -391,18 +395,19 @@ sub fail ($$;$) {
 	undef;
 }
 
-sub check_input_format ($;$) {
-	my ($self, $files) = @_;
-	my $fmt = $self->{opt}->{'format'};
+sub check_input_format ($;$$) {
+	my ($self, $files, $opt_key) = @_;
+	$opt_key //= 'format';
+	my $fmt = $self->{opt}->{$opt_key};
 	if (!$fmt) {
 		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
-		return fail($self, "--format unset for $err");
+		return fail($self, "--$opt_key unset for $err");
 	}
 	return 1 if $fmt eq 'eml';
 	# XXX: should this handle {gz,bz2,xz}? that's currently in LeiToMail
 	require PublicInbox::MboxReader;
 	PublicInbox::MboxReader->can($fmt) ||
-				fail($self, "--format=$fmt unrecognized");
+				fail($self, "--$opt_key=$fmt unrecognized");
 }
 
 sub out ($;@) {
@@ -445,6 +450,7 @@ sub lei_atfork_child {
 	} else {
 		delete $self->{0};
 	}
+	delete @$self{qw(cnv)};
 	for (delete @$self{qw(3 sock old_1 au_done)}) {
 		close($_) if defined($_);
 	}
@@ -626,6 +632,11 @@ sub lei_import {
 	PublicInbox::LeiImport->call(@_);
 }
 
+sub lei_convert {
+	require PublicInbox::LeiConvert;
+	PublicInbox::LeiConvert->call(@_);
+}
+
 sub lei_init {
 	my ($self, $dir) = @_;
 	my $cfg = _lei_cfg($self, 1);
@@ -770,6 +781,13 @@ sub start_mua {
 	delete $self->{opt}->{verbose};
 }
 
+sub send_exec_cmd { # tell script/lei to execute a command
+	my ($self, $io, $cmd, $env) = @_;
+	my $sock = $self->{sock} // die 'lei client gone';
+	my $fds = [ map { fileno($_) } @$io ];
+	$send_cmd->($sock, $fds, exec_buf($cmd, $env), MSG_EOR);
+}
+
 sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
 	my ($self) = @_;
 	my $alerts = $self->{opt}->{alert} // return;
@@ -813,10 +831,9 @@ sub start_pager {
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
 	my $pgr = [ undef, @$rdr{1, 2} ];
-	if (my $sock = $self->{sock}) { # lei(1) process runs it
+	if ($self->{sock}) { # lei(1) process runs it
 		delete @$new_env{keys %$env}; # only set iff unset
-		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
-		$send_cmd->($sock, $fds, exec_buf([$pager], $new_env), MSG_EOR);
+		send_exec_cmd($self, [ @$rdr{0..2} ], [$pager], $new_env);
 	} elsif ($self->{oneshot}) {
 		my $cmd = [$pager];
 		$self->{"pid.$self.$$"}->{spawn($cmd, $new_env, $rdr)} = $cmd;
@@ -920,6 +937,7 @@ sub event_step {
 
 sub event_step_init {
 	my ($self) = @_;
+	return if $self->{-event_init_done}++;
 	if (my $sock = $self->{sock}) { # using DS->EventLoop
 		$self->SUPER::new($sock, EPOLLIN|EPOLLET);
 	}
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
new file mode 100644
index 00000000..6593ba51
--- /dev/null
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -0,0 +1,81 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Authentication worker for anything that needs auth for read/write IMAP
+# (eventually for read-only NNTP access)
+package PublicInbox::LeiAuth;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::PktOp qw(pkt_do);
+use PublicInbox::NetReader;
+
+sub nrd_merge {
+	my ($lei, $nrd_new) = @_;
+	if ($lei->{pkt_op_p}) { # from lei_convert worker
+		pkt_do($lei->{pkt_op_p}, 'nrd_merge', $nrd_new);
+	} else { # single lei-daemon consumer
+		my $self = $lei->{auth} or return; # client disconnected
+		my $nrd = $self->{nrd};
+		%$nrd = (%$nrd, %$nrd_new);
+	}
+}
+
+sub do_auth { # called via wq_io_do
+	my ($self) = @_;
+	my ($lei, $nrd) = @$self{qw(lei nrd)};
+	$nrd->imap_common_init($lei);
+	nrd_merge($lei, $nrd); # tell lei-daemon updated auth info
+}
+
+sub do_finish_auth { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($self, $lei, $post_auth_cb, @args) = @$arg;
+	$? ? $lei->dclose : $post_auth_cb->(@args);
+}
+
+sub auth_eof {
+	my ($lei, $post_auth_cb, @args) = @_;
+	my $self = delete $lei->{auth} or return;
+	$self->wq_wait_old(\&do_finish_auth, $lei, $post_auth_cb, @args);
+}
+
+sub auth_start {
+	my ($self, $lei, $post_auth_cb, @args) = @_;
+	$lei->_lei_cfg(1); # workers may need to read config
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'nrd_merge' => [ \&nrd_merge, $lei ],
+		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$self->wq_workers_start('lei_auth', 1, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_io_do('do_auth', []);
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	# prevent {sock} from being closed in lei_atfork_child:
+	my $s = delete $self->{lei}->{sock};
+	delete $self->{lei}->{auth}; # drop circular ref
+	$self->{lei}->lei_atfork_child;
+	$self->{lei}->{sock} = $s if $s;
+	$self->SUPER::ipc_atfork_child;
+}
+
+sub new {
+	my ($cls, $nrd) = @_;
+	bless { nrd => $nrd }, $cls;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
new file mode 100644
index 00000000..78fd5e17
--- /dev/null
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -0,0 +1,160 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# front-end for the "lei convert" sub-command
+package PublicInbox::LeiConvert;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::Eml;
+use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::PktOp;
+use PublicInbox::LeiStore;
+use PublicInbox::LeiOverview;
+
+sub mbox_cb {
+	my ($eml, $self) = @_;
+	my @kw = PublicInbox::LeiStore::mbox_keywords($eml);
+	$eml->header_set($_) for qw(Status X-Status);
+	$self->{wcb}->(undef, { kw => \@kw }, $eml);
+}
+
+sub imap_cb { # ->imap_each
+	my ($url, $uid, $kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
+sub mdir_cb {
+	my ($kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
+sub do_convert { # via wq_io_do
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	my $in_fmt = $lei->{opt}->{'in-format'};
+	if (my $stdin = delete $self->{0}) {
+		PublicInbox::MboxReader->$in_fmt($stdin, \&mbox_cb, $self);
+	}
+	for my $input (@{$self->{inputs}}) {
+		my $ifmt = lc($in_fmt // '');
+		if ($input =~ m!\A(?:imap|nntp)s?://!i) { # TODO: nntp
+			$lei->{nrd}->imap_each($input, \&imap_cb, $self);
+			next;
+		} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
+			$ifmt = lc $1;
+		}
+		if (-f $input) {
+			open my $fh, '<', $input or
+					return $lei->fail("open $input: $!");
+			PublicInbox::MboxReader->$ifmt($fh, \&mbox_cb, $self);
+		} elsif (-d _) {
+			PublicInbox::MdirReader::maildir_each_eml($input,
+							\&mdir_cb, $self);
+		} else {
+			die "BUG: $input unhandled"; # should've failed earlier
+		}
+	}
+	delete $lei->{1};
+	delete $self->{wcb}; # commit
+}
+
+sub convert_start {
+	my ($lei) = @_;
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'' => [ $lei->can('dclose'), $lei ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	my $self = $lei->{cnv};
+	$self->wq_workers_start('lei_convert', 1, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_io_do('do_convert', []);
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub call { # the main "lei convert" method
+	my ($cls, $lei, @inputs) = @_;
+	my $opt = $lei->{opt};
+	$opt->{kw} //= 1;
+	my $self = $lei->{cnv} = bless {}, $cls;
+	my $in_fmt = $opt->{'in-format'};
+	my ($nrd, @f, @d);
+	$opt->{dedupe} //= 'none';
+	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
+	$lei->{l2m} or return
+		$lei->fail("output not specified or is not a mail destination");
+	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
+	if ($opt->{stdin}) {
+		@inputs and return $lei->fail("--stdin and @inputs do not mix");
+		$lei->check_input_format(undef, 'in-format') or return;
+		$self->{0} = $lei->{0};
+	}
+	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
+	for my $input (@inputs) {
+		my $input_path = $input;
+		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+			require PublicInbox::NetReader;
+			$nrd //= PublicInbox::NetReader->new;
+			$nrd->add_url($input);
+		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
+			my $ifmt = lc $1;
+			if (($in_fmt // $ifmt) ne $ifmt) {
+				return $lei->fail(<<"");
+--in-format=$in_fmt and `$ifmt:' conflict
+
+			}
+			if (-f $input_path) {
+				require PublicInbox::MboxReader;
+				PublicInbox::MboxReader->can($ifmt) or return
+					$lei->fail("$ifmt not supported");
+			} elsif (-d _) {
+				require PublicInbox::MdirReader;
+				$ifmt eq 'maildir' or return
+					$lei->fail("$ifmt not supported");
+			} else {
+				return $lei->fail("Unable to handle $input");
+			}
+		} elsif (-f $input) { push @f, $input }
+		elsif (-d _) { push @d, $input }
+		else { return $lei->fail("Unable to handle $input") }
+	}
+	if (@f) { $lei->check_input_format(\@f, 'in-format') or return }
+	if (@d) { # TODO: check for MH vs Maildir, here
+		require PublicInbox::MdirReader;
+	}
+	$self->{inputs} = \@inputs;
+	return convert_start($lei) if !$nrd;
+
+	if (my $err = $nrd->errors) {
+		return $lei->fail($err);
+	}
+	$nrd->{quiet} = $opt->{quiet};
+	$lei->{nrd} = $nrd;
+	require PublicInbox::LeiAuth;
+	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
+	$auth->auth_start($lei, \&convert_start, $lei);
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	$lei->lei_atfork_child;
+	my $l2m = delete $lei->{l2m};
+	$l2m->pre_augment($lei);
+	$l2m->do_augment($lei);
+	$l2m->post_augment($lei);
+	$self->{wcb} = $l2m->write_cb($lei);
+	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
+	$self->SUPER::ipc_atfork_child;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index 2114c0e8..5fec9384 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -127,7 +127,7 @@ sub prepare_dedupe {
 
 sub pause_dedupe {
 	my ($self) = @_;
-	my $skv = $self->[0];
+	my $skv = $self->[0] or return;
 	$skv->dbh_release;
 	delete($skv->{dbh}) if $skv;
 }
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index c820f0d7..3169bae6 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -51,18 +51,19 @@ sub detect_fmt ($$) {
 }
 
 sub new {
-	my ($class, $lei) = @_;
+	my ($class, $lei, $ofmt_key) = @_;
 	my $opt = $lei->{opt};
 	my $dst = $opt->{output} // '-';
 	$dst = '/dev/stdout' if $dst eq '-';
+	$ofmt_key //= 'format';
 
-	my $fmt = $opt->{'format'};
+	my $fmt = $opt->{$ofmt_key};
 	$fmt = lc($fmt) if defined $fmt;
 	if ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. Maildir:/home/user/Mail/
 		my $ofmt = lc $1;
 		$fmt //= $ofmt;
 		return $lei->fail(<<"") if $fmt ne $ofmt;
---format=$fmt and --output=$ofmt conflict
+--$ofmt_key=$fmt and --output=$ofmt conflict
 
 	}
 	$fmt //= 'json' if $dst eq '/dev/stdout';
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index e3e512be..f0adc44f 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -437,7 +437,7 @@ sub _do_augment_mbox {
 	$dedupe->pause_dedupe if $dedupe;
 }
 
-sub pre_augment { # fast (1 disk seek), runs in main daemon
+sub pre_augment { # fast (1 disk seek), runs in same process as post_augment
 	my ($self, $lei) = @_;
 	# _pre_augment_maildir, _pre_augment_mbox
 	my $m = "_pre_augment_$self->{base_type}";
@@ -451,7 +451,8 @@ sub do_augment { # slow, runs in wq worker
 	$self->$m($lei);
 }
 
-sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
+# fast (spawn compressor or mkdir), runs in same process as pre_augment
+sub post_augment {
 	my ($self, $lei, @args) = @_;
 	# _post_augment_maildir, _post_augment_mbox
 	my $m = "_post_augment_$self->{base_type}";
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index e0ff676d..5fa534f5 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -7,6 +7,7 @@
 package PublicInbox::MdirReader;
 use strict;
 use v5.10.1;
+use PublicInbox::InboxWritable qw(eml_from_path);
 
 # returns Maildir flags from a basename ('' for no flags, undef for invalid)
 sub maildir_basename_flags {
@@ -36,4 +37,29 @@ sub maildir_each_file ($$;@) {
 	}
 }
 
+my %c2kw = ('D' => 'draft', F => 'flagged', R => 'answered', S => 'seen');
+
+sub maildir_each_eml ($$;@) {
+	my ($dir, $cb, @arg) = @_;
+	$dir .= '/' unless substr($dir, -1) eq '/';
+	my $pfx = "$dir/new/";
+	if (opendir(my $dh, $pfx)) {
+		while (defined(my $bn = readdir($dh))) {
+			next if substr($bn, 0, 1) eq '.';
+			my @f = split(/:/, $bn, -1);
+			next if scalar(@f) != 1;
+			my $eml = eml_from_path($pfx.$bn) or next;
+			$cb->([], $eml, @arg);
+		}
+	}
+	$pfx = "$dir/cur/";
+	opendir my $dh, $pfx or return;
+	while (defined(my $bn = readdir($dh))) {
+		my $fl = maildir_basename_flags($bn) // next;
+		my $eml = eml_from_path($pfx.$bn) or next;
+		my @kw = sort(map { $c2kw{$_} // () } split(//, $fl));
+		$cb->(\@kw, $eml, @arg);
+	}
+}
+
 1;
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 1d053425..ad8c18d0 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -5,7 +5,8 @@
 package PublicInbox::NetReader;
 use strict;
 use v5.10.1;
-use parent qw(Exporter);
+use parent qw(Exporter PublicInbox::IPC);
+use PublicInbox::Eml;
 
 # TODO: trim this down, this is huge
 our @EXPORT = qw(uri_new uri_scheme uri_section
@@ -33,7 +34,7 @@ sub uri_section ($) {
 sub auth_anon_cb { '' }; # for Mail::IMAPClient::Authcallback
 
 sub mic_for { # mic = Mail::IMAPClient
-	my ($self, $url, $mic_args) = @_;
+	my ($self, $url, $mic_args, $lei) = @_;
 	require PublicInbox::URIimap;
 	my $uri = PublicInbox::URIimap->new($url);
 	require PublicInbox::GitCredential;
@@ -74,21 +75,26 @@ sub mic_for { # mic = Mail::IMAPClient
 	}
 	if ($cred) {
 		$cred->check_netrc unless defined $cred->{password};
-		$cred->fill; # may prompt user here
+		$cred->fill($lei); # may prompt user here
 		$mic->User($mic_arg->{User} = $cred->{username});
 		$mic->Password($mic_arg->{Password} = $cred->{password});
 	} else { # AUTH=ANONYMOUS
 		$mic->Authmechanism($mic_arg->{Authmechanism} = 'ANONYMOUS');
-		$mic->Authcallback($mic_arg->{Authcallback} = \&auth_anon_cb);
+		$mic_arg->{Authcallback} = 'auth_anon_cb';
+		$mic->Authcallback(\&auth_anon_cb);
 	}
+	my $err;
 	if ($mic->login && $mic->IsAuthenticated) {
 		# success! keep IMAPClient->new arg in case we get disconnected
 		$self->{mic_arg}->{uri_section($uri)} = $mic_arg;
 	} else {
-		warn "E: <$url> LOGIN: $@\n";
+		$err = "E: <$url> LOGIN: $@\n";
 		$mic = undef;
 	}
 	$cred->run($mic ? 'approve' : 'reject') if $cred;
+	if ($err) {
+		$lei ? $lei->fail($err) : warn($err);
+	}
 	$mic;
 }
 
@@ -139,8 +145,8 @@ E: <$url> STARTTLS requested and failed
 	$nn;
 }
 
-sub nn_for ($$$) { # nn = Net::NNTP
-	my ($self, $url, $nn_args) = @_;
+sub nn_for ($$$;$) { # nn = Net::NNTP
+	my ($self, $url, $nn_args, $lei) = @_;
 	my $uri = uri_new($url);
 	my $sec = uri_section($uri);
 	my $nntp_opt = $self->{nntp_opt}->{$sec} //= {};
@@ -170,7 +176,7 @@ sub nn_for ($$$) { # nn = Net::NNTP
 	my $nn = nn_new($nn_arg, $nntp_opt, $url);
 
 	if ($cred) {
-		$cred->fill; # may prompt user here
+		$cred->fill($lei); # may prompt user here
 		if ($nn->authinfo($u, $p)) {
 			push @{$nntp_opt->{-postconn}}, [ 'authinfo', $u, $p ];
 		} else {
@@ -240,14 +246,15 @@ sub cfg_bool ($$$) {
 }
 
 # flesh out common IMAP-specific data structures
-sub imap_common_init ($) {
-	my ($self) = @_;
+sub imap_common_init ($;$) {
+	my ($self, $lei) = @_;
+	$self->{quiet} = 1 if $lei && $lei->{opt}->{quiet};
 	eval { require PublicInbox::IMAPClient } or
 		die "Mail::IMAPClient is required for IMAP:\n$@\n";
 	eval { require PublicInbox::IMAPTracker } or
 		die "DBD::SQLite is required for IMAP\n:$@\n";
 	require PublicInbox::URIimap;
-	my $cfg = $self->{pi_cfg};
+	my $cfg = $self->{pi_cfg} // $lei->_lei_cfg;
 	my $mic_args = {}; # scheme://authority => Mail:IMAPClient arg
 	for my $url (@{$self->{imap_order}}) {
 		my $uri = PublicInbox::URIimap->new($url);
@@ -275,7 +282,8 @@ sub imap_common_init ($) {
 	my $mics = {}; # schema://authority => IMAPClient obj
 	for my $url (@{$self->{imap_order}}) {
 		my $uri = PublicInbox::URIimap->new($url);
-		$mics->{uri_section($uri)} //= mic_for($self, $url, $mic_args);
+		my $sec = uri_section($uri);
+		$mics->{$sec} //= mic_for($self, $url, $mic_args, $lei);
 	}
 	$mics;
 }
@@ -294,9 +302,140 @@ sub errors {
 	if (my $u = $self->{unsupported_url}) {
 		return "Unsupported URL(s): @$u";
 	}
+	if ($self->{imap_order}) {
+		eval { require PublicInbox::IMAPClient } or
+			die "Mail::IMAPClient is required for IMAP:\n$@\n";
+	}
 	undef;
 }
 
+my %IMAPflags2kw = (
+	'\Seen' => 'seen',
+	'\Answered' => 'answered',
+	'\Flagged' => 'flagged',
+	'\Draft' => 'draft',
+);
+
+sub _imap_do_msg ($$$$$) {
+	my ($self, $url, $uid, $raw, $flags) = @_;
+	# our target audience expects LF-only, save storage
+	$$raw =~ s/\r\n/\n/sg;
+	my $kw = [];
+	for my $f (split(/ /, $flags)) {
+		my $k = $IMAPflags2kw{$f} // next; # TODO: X-Label?
+		push @$kw, $k;
+	}
+	my ($eml_cb, @args) = @{$self->{eml_each}};
+	$eml_cb->($url, $uid, $kw, PublicInbox::Eml->new($raw), @args);
+}
+
+sub _imap_fetch_all ($$$) {
+	my ($self, $mic, $url) = @_;
+	my $uri = PublicInbox::URIimap->new($url);
+	my $sec = uri_section($uri);
+	my $mbx = $uri->mailbox;
+	$mic->Clear(1); # trim results history
+	$mic->examine($mbx) or return "E: EXAMINE $mbx ($sec) failed: $!";
+	my ($r_uidval, $r_uidnext);
+	for ($mic->Results) {
+		/^\* OK \[UIDVALIDITY ([0-9]+)\].*/ and $r_uidval = $1;
+		/^\* OK \[UIDNEXT ([0-9]+)\].*/ and $r_uidnext = $1;
+		last if $r_uidval && $r_uidnext;
+	}
+	$r_uidval //= $mic->uidvalidity($mbx) //
+		return "E: $url cannot get UIDVALIDITY";
+	$r_uidnext //= $mic->uidnext($mbx) //
+		return "E: $url cannot get UIDNEXT";
+	my $itrk = $self->{incremental} ?
+			PublicInbox::IMAPTracker->new($url) : 0;
+	my ($l_uidval, $l_uid) = $itrk ? $itrk->get_last : ();
+	$l_uidval //= $r_uidval; # first time
+	$l_uid //= 1;
+	if ($l_uidval != $r_uidval) {
+		return "E: $url UIDVALIDITY mismatch\n".
+			"E: local=$l_uidval != remote=$r_uidval";
+	}
+	my $r_uid = $r_uidnext - 1;
+	if ($l_uid != 1 && $l_uid > $r_uid) {
+		return "E: $url local UID exceeds remote ($l_uid > $r_uid)\n".
+			"E: $url strangely, UIDVALIDITY matches ($l_uidval)\n";
+	}
+	return if $l_uid >= $r_uid; # nothing to do
+
+	warn "# $url fetching UID $l_uid:$r_uid\n" unless $self->{quiet};
+	$mic->Uid(1); # the default, we hope
+	my $bs = $self->{imap_opt}->{$sec}->{batch_size} // 1;
+	my $req = $mic->imap4rev1 ? 'BODY.PEEK[]' : 'RFC822.PEEK';
+	my $key = $req;
+	$key =~ s/\.PEEK//;
+	my ($uids, $batch);
+	my $err;
+	do {
+		# I wish "UID FETCH $START:*" could work, but:
+		# 1) servers do not need to return results in any order
+		# 2) Mail::IMAPClient doesn't offer a streaming API
+		$uids = $mic->search("UID $l_uid:*") or
+			return "E: $url UID SEARCH $l_uid:* error: $!";
+		return if scalar(@$uids) == 0;
+
+		# RFC 3501 doesn't seem to indicate order of UID SEARCH
+		# responses, so sort it ourselves.  Order matters so
+		# IMAPTracker can store the newest UID.
+		@$uids = sort { $a <=> $b } @$uids;
+
+		# Did we actually get new messages?
+		return if $uids->[0] < $l_uid;
+
+		$l_uid = $uids->[-1] + 1; # for next search
+		my $last_uid;
+		my $n = $self->{max_batch};
+		while (scalar @$uids) {
+			my @batch = splice(@$uids, 0, $bs);
+			$batch = join(',', @batch);
+			local $0 = "UID:$batch $mbx $sec";
+			my $r = $mic->fetch_hash($batch, $req, 'FLAGS');
+			unless ($r) { # network error?
+				$err = "E: $url UID FETCH $batch error: $!";
+				last;
+			}
+			for my $uid (@batch) {
+				# messages get deleted, so holes appear
+				my $per_uid = delete $r->{$uid} // next;
+				my $raw = delete($per_uid->{$key}) // next;
+				_imap_do_msg($self, $url, $uid, \$raw,
+						$per_uid->{FLAGS});
+				$last_uid = $uid;
+				last if $self->{quit};
+			}
+			last if $self->{quit};
+		}
+		$itrk->update_last($r_uidval, $last_uid) if $itrk;
+	} until ($err || $self->{quit});
+	$err;
+}
+
+sub imap_each {
+	my ($self, $url, $eml_cb, @args) = @_;
+	my $uri = PublicInbox::URIimap->new($url);
+	my $sec = uri_section($uri);
+	my $mic_arg = $self->{mic_arg}->{$sec} or
+			die "BUG: no Mail::IMAPClient->new arg for $sec";
+	local $0 = $uri->mailbox." $sec";
+	my $cb_name = $mic_arg->{Authcallback};
+	if (ref($cb_name) ne 'CODE') {
+		$mic_arg->{Authcallback} = $self->can($cb_name);
+	}
+	my $mic = PublicInbox::IMAPClient->new(%$mic_arg, Debug => 0);
+	my $err;
+	if ($mic && $mic->IsConnected) {
+		local $self->{eml_each} = [ $eml_cb, @args ];
+		$err = _imap_fetch_all($self, $mic, $url);
+	} else {
+		$err = "E: not connected: $!";
+	}
+	$mic;
+}
+
 sub new { bless {}, shift };
 
 1;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index c5070cfd..3eb08e9f 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -462,10 +462,15 @@ our $lei = sub {
 sub lei (@) { $lei->(@_) }
 
 sub lei_ok (@) {
-	my $msg = ref($_[-1]) ? pop(@_) : undef;
+	my $msg = ref($_[-1]) eq 'SCALAR' ? pop(@_) : undef;
+	my $tmpdir = quotemeta(File::Spec->tmpdir);
 	# filter out anything that looks like a path name for consistent logs
-	my @msg = grep(!m!\A/!, @_);
-	ok($lei->(@_), "lei @msg". ($msg ? " ($$msg)" : ''));
+	my @msg = ref($_[0]) eq 'ARRAY' ? @{$_[0]} : @_;
+	for (@msg) {
+		s!\A([a-z0-9]+://)[^/]+/!$1\$HOST_PORT/! ||
+			s!$tmpdir\b/(?:[^/]+/)?!\$TMPDIR/!;
+	}
+	ok(lei(@_), "lei @msg". ($msg ? " ($$msg)" : '')) or diag $lei_err;
 }
 
 sub json_utf8 () {
diff --git a/t/lei-convert.t b/t/lei-convert.t
new file mode 100644
index 00000000..f58a0a80
--- /dev/null
+++ b/t/lei-convert.t
@@ -0,0 +1,71 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use PublicInbox::MboxReader;
+use PublicInbox::MdirReader;
+use PublicInbox::NetReader;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($tmpdir, $for_destroy) = tmpdir;
+my $sock = tcp_server;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
+my $host_port = tcp_host_port($sock);
+undef $sock;
+test_lei({ tmpdir => $tmpdir }, sub {
+	my $d = $ENV{HOME};
+	my $dig = Digest::SHA->new(256);
+	lei_ok('convert', '-o', "mboxrd:$d/foo.mboxrd",
+		"imap://$host_port/t.v2.0");
+	ok(-f "$d/foo.mboxrd", 'mboxrd created');
+	my (@mboxrd, @mboxcl2);
+	open my $fh, '<', "$d/foo.mboxrd" or BAIL_OUT $!;
+	PublicInbox::MboxReader->mboxrd($fh, sub { push @mboxrd, shift });
+	ok(scalar(@mboxrd) > 1, 'got multiple messages');
+
+	lei_ok('convert', '-o', "mboxcl2:$d/cl2", "mboxrd:$d/foo.mboxrd");
+	ok(-s "$d/cl2", 'mboxcl2 non-empty') or diag $lei_err;
+	open $fh, '<', "$d/cl2" or BAIL_OUT $!;
+	PublicInbox::MboxReader->mboxcl2($fh, sub {
+		my $eml = shift;
+		$eml->header_set($_) for (qw(Content-Length Lines));
+		push @mboxcl2, $eml;
+	});
+	is_deeply(\@mboxcl2, \@mboxrd, 'mboxrd and mboxcl2 have same mail');
+
+	lei_ok('convert', '-o', "$d/md", "mboxrd:$d/foo.mboxrd");
+	ok(-d "$d/md", 'Maildir created');
+	my @md;
+	PublicInbox::MdirReader::maildir_each_eml("$d/md", sub {
+		push @md, $_[1];
+	});
+	is(scalar(@md), scalar(@mboxrd), 'got expected emails in Maildir');
+	@md = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @md;
+	@mboxrd = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @mboxrd;
+	my @rd_nostatus = map {
+		my $eml = PublicInbox::Eml->new(\($_->as_string));
+		$eml->header_set('Status');
+		$eml;
+	} @mboxrd;
+	is_deeply(\@md, \@rd_nostatus, 'Maildir output matches mboxrd');
+
+	my @bar;
+	lei_ok('convert', '-o', "mboxrd:$d/bar.mboxrd", "$d/md");
+	open $fh, '<', "$d/bar.mboxrd" or BAIL_OUT $!;
+	PublicInbox::MboxReader->mboxrd($fh, sub { push @bar, shift });
+	@bar = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @bar;
+	is_deeply(\@mboxrd, \@bar,
+			'mboxrd round-tripped through Maildir w/ flags');
+
+	open my $in, '<', "$d/foo.mboxrd" or BAIL_OUT;
+	my $rdr = { 0 => $in, 1 => \(my $out), 2 => \$lei_err };
+	lei_ok([qw(convert --stdin -F mboxrd -o mboxrd:/dev/stdout)],
+		undef, $rdr);
+	open $fh, '<', "$d/foo.mboxrd" or BAIL_OUT;
+	my $exp = do { local $/; <$fh> };
+	is($out, $exp, 'stdin => stdout');
+});
+done_testing;
diff --git a/t/net_reader-imap.t b/t/net_reader-imap.t
new file mode 100644
index 00000000..eea8b0fd
--- /dev/null
+++ b/t/net_reader-imap.t
@@ -0,0 +1,40 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($tmpdir, $for_destroy) = tmpdir;
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $sock = tcp_server;
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT "-imapd: $?";
+my ($host, $port) = tcp_host_port $sock;
+require_ok 'PublicInbox::NetReader';
+my $nrd = PublicInbox::NetReader->new;
+$nrd->add_url(my $url = "imap://$host:$port/t.v2.0");
+is($nrd->errors, undef, 'no errors');
+$nrd->{pi_cfg} = PublicInbox::Config->new($cfg_path);
+$nrd->imap_common_init;
+$nrd->{quiet} = 1;
+my (%eml, %urls, %args, $nr, @w);
+local $SIG{__WARN__} = sub { push(@w, @_) };
+$nrd->imap_each($url, sub {
+	my ($u, $uid, $kw, $eml, $arg) = @_;
+	++$urls{$u};
+	++$args{$arg};
+	like($uid, qr/\A[0-9]+\z/, 'got digit UID '.$uid);
+	++$eml{ref($eml)};
+	++$nr;
+}, 'blah');
+is(scalar(@w), 0, 'no warnings');
+ok($nr, 'got some emails');
+is($eml{'PublicInbox::Eml'}, $nr, 'got expected Eml objects');
+is(scalar keys %eml, 1, 'only got Eml objects');
+is($urls{$url}, $nr, 'one URL expected number of times');
+is(scalar keys %urls, 1, 'only got one URL');
+is($args{blah}, $nr, 'got arg expected number of times');
+is(scalar keys %args, 1, 'only got one arg');
+
+done_testing;

^ permalink raw reply related	[relevance 18%]

* [PATCHv3 4/4] lei: check for IMAP auth errors
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
                           ` (3 preceding siblings ...)
  2021-02-18 20:22 47%         ` [PATCHv3 3/4] lei: consolidate the bulk of the IPC code Eric Wong
@ 2021-02-18 20:22 63%         ` Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 20:22 UTC (permalink / raw)
  To: meta

We need to ensure authentication failures and error codes get
propagated to the parent process(es) properly.
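
A minimal sketch of the password-masking step visible in the
NetReader.pm hunk below. scrub_secret is a hypothetical helper name,
not part of public-inbox, and the length check is an extra guard that
the patch itself does not have:

```perl
use strict;
use warnings;

# Hypothetical helper (not in public-inbox): mask every occurrence of
# a secret in an error string before the string is reported.
sub scrub_secret {
	my ($err, $secret) = @_;
	# nothing to mask for an undefined or empty secret (assumed guard)
	return $err unless defined($secret) && length($secret);
	# \Q...\E makes regex metacharacters in the secret match literally
	$err =~ s/\Q$secret\E/*******/g;
	$err;
}

my $err = "E: <imaps://user:p.ss+word\@example.com/INBOX> LOGIN: failed\n";
print scrub_secret($err, 'p.ss+word');
```

Without \Q...\E, a password containing "." or "+" would be treated as
a pattern and could mask unrelated text or miss the real password.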

v2: update MANIFEST
v3: LeiAuth.pm ->_lei_cfg bit moved to a previous commit
---
 MANIFEST                     |  1 +
 lib/PublicInbox/NetReader.pm |  3 +++
 xt/lei-auth-fail.t           | 20 ++++++++++++++++++++
 3 files changed, 24 insertions(+)
 create mode 100644 xt/lei-auth-fail.t

diff --git a/MANIFEST b/MANIFEST
index 19f73356..3d9ad616 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -466,6 +466,7 @@ xt/git_async_cmp.t
 xt/httpd-async-stream.t
 xt/imapd-mbsync-oimap.t
 xt/imapd-validate.t
+xt/lei-auth-fail.t
 xt/lei-sigpipe.t
 xt/mem-imapd-tls.t
 xt/mem-msgview.t
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index ad8c18d0..61ea538b 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -89,6 +89,9 @@ sub mic_for { # mic = Mail::IMAPClient
 		$self->{mic_arg}->{uri_section($uri)} = $mic_arg;
 	} else {
 		$err = "E: <$url> LOGIN: $@\n";
+		if ($cred && defined($cred->{password})) {
+			$err =~ s/\Q$cred->{password}\E/*******/g;
+		}
 		$mic = undef;
 	}
 	$cred->run($mic ? 'approve' : 'reject') if $cred;
diff --git a/xt/lei-auth-fail.t b/xt/lei-auth-fail.t
new file mode 100644
index 00000000..5308d0f9
--- /dev/null
+++ b/xt/lei-auth-fail.t
@@ -0,0 +1,20 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+
+# TODO: mock IMAP server which fails at authentication so we don't
+# have to make external connections to test this:
+my $imap_fail = $ENV{TEST_LEI_IMAP_FAIL_URL} //
+	'imaps://AzureDiamond:Hunter2@public-inbox.org:994/INBOX';
+test_lei(sub {
+	ok(!lei(qw(convert -o mboxrd:/dev/stdout), $imap_fail),
+		'IMAP auth failure on convert');
+	like($lei_err, qr!\bE:.*?imaps://.*?!sm, 'error shown');
+	unlike($lei_err, qr!Hunter2!s, 'password not shown');
+	is($lei_out, '', 'nothing output');
+	ok(!lei(qw(import), $imap_fail), 'IMAP auth failure on import');
+	like($lei_err, qr!\bE:.*?imaps://.*?!sm, 'error shown');
+	unlike($lei_err, qr!Hunter2!s, 'password not shown');
+});
+done_testing;

^ permalink raw reply related	[relevance 63%]

* [PATCHv3 3/4] lei: consolidate the bulk of the IPC code
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
                           ` (2 preceding siblings ...)
  2021-02-18 20:22 37%         ` [PATCHv3 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support Eric Wong
@ 2021-02-18 20:22 47%         ` Eric Wong
  2021-02-18 20:22 63%         ` [PATCHv3 4/4] lei: check for IMAP auth errors Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 20:22 UTC (permalink / raw)
  To: meta

The backends for "lei add-external --mirror", "lei convert", and
"lei import" all share a similar pattern for spawning background
workers.  Hoist out the common parts to slim down our code base
a bit.

The LeiXSearch and LeiToMail workers for "lei q" remain the
odd duck due to the deep pipelining and parallelization.
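
A rough sketch of how the hoisted helper combines a shared op table
with per-command entries. merge_ops is a hypothetical stand-in for
the table construction in workers_start, and the handlers are plain
strings here rather than real callbacks:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the op table built in workers_start.
sub merge_ops {
	my ($lei, $ops) = @_;
	$ops //= {};
	my %merged = (
		'!' => [ 'fail_handler', $lei ],
		'|' => [ 'sigpipe_handler', $lei ],
		'x_it' => [ 'x_it', $lei ],
		'child_error' => [ 'child_error', $lei ],
		%$ops, # caller entries come last, so they override defaults
	);
	\%merged;
}

# each command only supplies what differs, e.g. its EOF ('') handler
my $ops = merge_ops('LEI', { '' => [ 'dclose', 'LEI' ] });
print scalar(keys %$ops), " ops registered\n";
```

Because the caller's hash is flattened after the defaults, a command
can add its own keys or replace a default without repeating the rest.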
---
 lib/PublicInbox/LEI.pm        | 19 +++++++++++++++++++
 lib/PublicInbox/LeiAuth.pm    | 17 +++--------------
 lib/PublicInbox/LeiConvert.pm | 22 +++++-----------------
 lib/PublicInbox/LeiImport.pm  | 19 ++++---------------
 lib/PublicInbox/LeiMirror.pm  | 19 ++++---------------
 5 files changed, 35 insertions(+), 61 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1e4c36d0..0b4bc20e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -468,6 +468,25 @@ sub lei_atfork_child {
 	$current_lei = $persist ? undef : $self; # for SIG{__WARN__}
 }
 
+sub workers_start {
+	my ($lei, $wq, $ident, $jobs, $ops) = @_;
+	$ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		%$ops
+	};
+	require PublicInbox::PktOp;
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$wq->wq_workers_start($ident, $jobs, $lei->oldset, { lei => $lei });
+	delete $lei->{pkt_op_p};
+	my $op = delete $lei->{pkt_op_c};
+	$lei->event_step_init;
+	# oneshot needs $op, daemon-mode uses DS->EventLoop to handle $op
+	$lei->{oneshot} ? $op : undef;
+}
+
 sub _help {
 	require PublicInbox::LeiHelp;
 	PublicInbox::LeiHelp::call($_[0], $_[1], \%CMD, \%OPTDESC);
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 6593ba51..7acb9900 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -43,24 +43,13 @@ sub auth_eof {
 sub auth_start {
 	my ($self, $lei, $post_auth_cb, @args) = @_;
 	$lei->_lei_cfg(1); # workers may need to read config
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'|' => [ $lei->can('sigpipe_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
+	my $op = $lei->workers_start($self, 'auth', 1, {
 		'nrd_merge' => [ \&nrd_merge, $lei ],
 		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
-	$self->wq_workers_start('lei_auth', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	});
 	$self->wq_io_do('do_auth', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub ipc_atfork_child {
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 78fd5e17..ba375772 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::PktOp;
 use PublicInbox::LeiStore;
 use PublicInbox::LeiOverview;
 
@@ -59,26 +58,15 @@ sub do_convert { # via wq_do
 	delete $self->{wcb}; # commit
 }
 
-sub convert_start {
+sub convert_start { # LeiAuth->auth_start callback
 	my ($lei) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'|' => [ $lei->can('sigpipe_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ $lei->can('dclose'), $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{cnv};
-	$self->wq_workers_start('lei_convert', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_convert', 1, {
+		'' => [ $lei->can('dclose'), $lei ]
+	});
 	$self->wq_io_do('do_convert', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub call { # the main "lei convert" method
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 62a2a412..68cab12c 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::PktOp;
 
 sub _import_eml { # MboxReader callback
 	my ($eml, $sto, $set_kw) = @_;
@@ -31,13 +30,6 @@ sub import_done { # EOF callback for main daemon
 
 sub import_start {
 	my ($lei) = @_;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ \&import_done, $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{imp};
 	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $nrd = $lei->{nrd}) {
@@ -46,18 +38,15 @@ sub import_start {
 		my $nproc = $self->detect_nproc;
 		$j = $nproc if $j > $nproc;
 	}
-	$self->wq_workers_start('lei_import', $j, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_import', $j, {
+		'' => [ \&import_done, $lei ],
+	});
 	$self->wq_io_do('import_stdin', []) if $self->{0};
 	for my $input (@{$self->{inputs}}) {
 		$self->wq_io_do('import_path_url', [], $input);
 	}
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub call { # the main "lei import" method
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index c5153148..f8ca1ee5 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -8,7 +8,6 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
 use PublicInbox::Spawn qw(popen_rd spawn);
-use PublicInbox::PktOp;
 
 sub do_finish_mirror { # dwaitpid callback
 	my ($arg, $pid) = @_;
@@ -279,22 +278,12 @@ sub start {
 	require PublicInbox::Inbox;
 	require PublicInbox::Admin;
 	require PublicInbox::InboxWritable;
-	my $ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		'' => [ \&mirror_done, $lei ],
-	};
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
-	$self->wq_workers_start('lei_mirror', 1, $lei->oldset, {lei => $lei});
-	my $op = delete $lei->{pkt_op_c};
-	delete $lei->{pkt_op_p};
+	my $op = $lei->workers_start($self, 'lei_mirror', 1, {
+		'' => [ \&mirror_done, $lei ]
+	});
 	$self->wq_io_do('do_mirror', []);
 	$self->wq_close(1);
-	$lei->event_step_init; # wait for shutdowns
-	if ($lei->{oneshot}) {
-		while ($op->{sock}) { $op->event_step }
-	}
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub ipc_atfork_child {

^ permalink raw reply related	[relevance 47%]

* [PATCHv3 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support
  2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
  2021-02-18 20:22 68%         ` [PATCHv3 0/4] lei convert IMAP support Eric Wong
  2021-02-18 20:22 18%         ` [PATCHv3 1/4] lei convert: mail format conversion sub-command Eric Wong
@ 2021-02-18 20:22 37%         ` Eric Wong
  2021-02-18 20:22 47%         ` [PATCHv3 3/4] lei: consolidate the bulk of the IPC code Eric Wong
  2021-02-18 20:22 63%         ` [PATCHv3 4/4] lei: check for IMAP auth errors Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 20:22 UTC (permalink / raw)
  To: meta

This makes "lei import" more similar to "lei convert" and
allows importing from disparate sources simultaneously.
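
A simplified sketch of how an input argument could be sorted into a
network URL, a "format:path" location, or a bare path, loosely
following the regexps in the LeiImport.pm hunk below. classify_input
is a hypothetical name, and the file and directory checks done by the
real code are left out:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (kind, format, location) for one input.
sub classify_input {
	my ($input) = @_;
	if ($input =~ m!\A(?:imap|nntp)s?://!i) {
		# network source, no on-disk format prefix
		return ('net', undef, $input);
	} elsif ($input =~ m!\A([a-z0-9]+):(.*)\z!is) {
		# e.g. Maildir:/home/user/Mail/ or mboxrd:/tmp/foo.mboxrd
		return ('path', lc($1), $2);
	}
	# bare path: the format must come from --format or detection
	('path', undef, $input);
}

for my $in (qw(imaps://example.com/INBOX Maildir:/home/user/Mail/
		mboxrd:/tmp/foo.mboxrd /tmp/plain)) {
	my ($kind, $fmt, $loc) = classify_input($in);
	printf "%-4s %-8s %s\n", $kind, $fmt // '-', $loc;
}
```

The URL test has to come first, otherwise "imaps:" would be taken as
a format prefix.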

We'll also fix some ->child_error usage errors and make
the style of the code more similar to the "lei convert"
code.
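
The diff below changes "1 >> 8" to "1 << 8" in the ->child_error
calls. Assuming the argument follows the wait-status convention of
Perl's $? (exit code in the high byte), the arithmetic works out as
in this small sketch:

```perl
use strict;
use warnings;

my $wrong = 1 >> 8; # shifts the only set bit out, leaving 0
my $right = 1 << 8; # 256, a wait status whose exit code is 1

printf "1 >> 8 = %d\n", $wrong;
printf "1 << 8 = %d (exit code %d)\n", $right, $right >> 8;
```

Under that assumption, the old expression reported a status of 0,
which is indistinguishable from success.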

v2: fix missing requires
---
 MANIFEST                     |   1 +
 lib/PublicInbox/LeiImport.pm | 129 ++++++++++++++++++++++++-----------
 t/lei-import-imap.t          |  28 ++++++++
 t/lei-import-maildir.t       |   4 +-
 t/lei_to_mail.t              |  10 +++
 5 files changed, 130 insertions(+), 42 deletions(-)
 create mode 100644 t/lei-import-imap.t

diff --git a/MANIFEST b/MANIFEST
index 4f146771..19f73356 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -365,6 +365,7 @@ t/kqnotify.t
 t/lei-convert.t
 t/lei-daemon.t
 t/lei-externals.t
+t/lei-import-imap.t
 t/lei-import-maildir.t
 t/lei-import.t
 t/lei-mirror.t
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 32f3a467..62a2a412 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -29,7 +29,7 @@ sub import_done { # EOF callback for main daemon
 	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
-sub do_import {
+sub import_start {
 	my ($lei) = @_;
 	my $ops = {
 		'!' => [ $lei->can('fail_handler'), $lei ],
@@ -39,7 +39,7 @@ sub do_import {
 	};
 	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	my $self = $lei->{imp};
-	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{argv}}) || 1;
+	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $nrd = $lei->{nrd}) {
 		# $j = $nrd->net_concurrency($j); TODO
 	} else {
@@ -50,8 +50,8 @@ sub do_import {
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
 	$self->wq_io_do('import_stdin', []) if $self->{0};
-	for my $x (@{$self->{argv}}) {
-		$self->wq_io_do('import_path_url', [], $x);
+	for my $input (@{$self->{inputs}}) {
+		$self->wq_io_do('import_path_url', [], $input);
 	}
 	$self->wq_close(1);
 	$lei->event_step_init; # wait for shutdowns
@@ -61,60 +61,91 @@ sub do_import {
 }
 
 sub call { # the main "lei import" method
-	my ($cls, $lei, @argv) = @_;
+	my ($cls, $lei, @inputs) = @_;
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
+	my ($nrd, @f, @d);
 	$lei->{opt}->{kw} //= 1;
-	my $self = $lei->{imp} = bless { argv => \@argv }, $cls;
+	my $self = $lei->{imp} = bless { inputs => \@inputs }, $cls;
 	if ($lei->{opt}->{stdin}) {
-		@argv and return
-			$lei->fail("--stdin and locations (@argv) do not mix");
+		@inputs and return $lei->fail("--stdin and @inputs do not mix");
 		$lei->check_input_format or return;
 		$self->{0} = $lei->{0};
-	} else {
-		my @f;
-		for my $x (@argv) {
-			if (-f $x) { push @f, $x }
-			elsif (-d _) { require PublicInbox::MdirReader }
-			else {
-				require PublicInbox::NetReader;
-				$lei->{nrd} //= PublicInbox::NetReader->new;
-				$lei->{nrd}->add_url($x);
+	}
+
+	# TODO: do we need --format for non-stdin?
+	my $fmt = $lei->{opt}->{'format'};
+	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
+	for my $input (@inputs) {
+		my $input_path = $input;
+		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+			require PublicInbox::NetReader;
+			$nrd //= PublicInbox::NetReader->new;
+			$nrd->add_url($input);
+		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
+			my $ifmt = lc $1;
+			if (($fmt // $ifmt) ne $ifmt) {
+				return $lei->fail(<<"");
+--format=$fmt and `$ifmt:' conflict
+
 			}
-		}
-		if (@f) { $lei->check_input_format(\@f) or return }
-		if ($lei->{nrd} && (my @err = $lei->{nrd}->errors)) {
-			return $lei->fail(@err);
-		}
+			if (-f $input_path) {
+				require PublicInbox::MboxReader;
+				PublicInbox::MboxReader->can($ifmt) or return
+					$lei->fail("$ifmt not supported");
+			} elsif (-d _) {
+				require PublicInbox::MdirReader;
+				$ifmt eq 'maildir' or return
+					$lei->fail("$ifmt not supported");
+			} else {
+				return $lei->fail("Unable to handle $input");
+			}
+		} elsif (-f $input) { push @f, $input
+		} elsif (-d _) { push @d, $input
+		} else { return $lei->fail("Unable to handle $input") }
 	}
-	do_import($lei);
+	if (@f) { $lei->check_input_format(\@f) or return }
+	if (@d) { # TODO: check for MH vs Maildir, here
+		require PublicInbox::MdirReader;
+	}
+	$self->{inputs} = \@inputs;
+	return import_start($lei) if !$nrd;
+
+	if (my $err = $nrd->errors) {
+		return $lei->fail($err);
+	}
+	$nrd->{quiet} = $lei->{opt}->{quiet};
+	$lei->{nrd} = $nrd;
+	require PublicInbox::LeiAuth;
+	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
+	$auth->auth_start($lei, \&import_start, $lei);
 }
 
 sub ipc_atfork_child {
 	my ($self) = @_;
+	delete $self->{lei}->{imp}; # drop circular ref
 	$self->{lei}->lei_atfork_child;
 	$self->SUPER::ipc_atfork_child;
 }
 
 sub _import_fh {
-	my ($lei, $fh, $x) = @_;
+	my ($lei, $fh, $input, $ifmt) = @_;
 	my $set_kw = $lei->{opt}->{kw};
-	my $fmt = $lei->{opt}->{'format'};
 	eval {
-		if ($fmt eq 'eml') {
+		if ($ifmt eq 'eml') {
 			my $buf = do { local $/; <$fh> } //
-				return $lei->child_error(1 >> 8, <<"");
-error reading $x: $!
+				return $lei->child_error(1 << 8, <<"");
+error reading $input: $!
 
 			my $eml = PublicInbox::Eml->new(\$buf);
 			_import_eml($eml, $lei->{sto}, $set_kw);
 		} else { # some mbox (->can already checked in call);
-			my $cb = PublicInbox::MboxReader->can($fmt) //
-				die "BUG: bad fmt=$fmt";
+			my $cb = PublicInbox::MboxReader->can($ifmt) //
+				die "BUG: bad fmt=$ifmt";
 			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
 		}
 	};
-	$lei->child_error(1 >> 8, "<stdin>: $@") if $@;
+	$lei->child_error(1 << 8, "<stdin>: $@") if $@;
 }
 
 sub _import_maildir { # maildir_each_file cb
@@ -122,27 +153,45 @@ sub _import_maildir { # maildir_each_file cb
 	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
 }
 
+sub _import_imap { # imap_each cb
+	my ($url, $uid, $kw, $eml, $sto, $set_kw) = @_;
+	warn "$url $uid";
+	$sto->ipc_do('set_eml', $eml, $set_kw ? @$kw : ());
+}
+
 sub import_path_url {
-	my ($self, $x) = @_;
+	my ($self, $input) = @_;
 	my $lei = $self->{lei};
+	my $ifmt = lc($lei->{opt}->{'format'} // '');
 	# TODO auto-detect?
-	if (-f $x) {
-		open my $fh, '<', $x or return $lei->child_error(1 >> 8, <<"");
-unable to open $x: $!
+	if ($input =~ m!\A(imap|nntp)s?://!i) {
+		$lei->{nrd}->imap_each($input, \&_import_imap, $lei->{sto},
+					$lei->{opt}->{kw});
+		return;
+	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
+		$ifmt = lc $1;
+	}
+	if (-f $input) {
+		open my $fh, '<', $input or return $lei->child_error(1 << 8, <<"");
+unable to open $input: $!
 
-		_import_fh($lei, $fh, $x);
-	} elsif (-d _ && (-d "$x/cur" || -d "$x/new")) {
-		PublicInbox::MdirReader::maildir_each_file($x,
+		_import_fh($lei, $fh, $input, $ifmt);
+	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
+		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
+$input appears to be a maildir, not $ifmt
+EOM
+		PublicInbox::MdirReader::maildir_each_file($input,
 					\&_import_maildir,
 					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
-		$lei->fail("$x unsupported (TODO)");
+		$lei->fail("$input unsupported (TODO)");
 	}
 }
 
 sub import_stdin {
 	my ($self) = @_;
-	_import_fh($self->{lei}, $self->{0}, '<stdin>');
+	my $lei = $self->{lei};
+	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'format'});
 }
 
 1;
diff --git a/t/lei-import-imap.t b/t/lei-import-imap.t
new file mode 100644
index 00000000..ee308723
--- /dev/null
+++ b/t/lei-import-imap.t
@@ -0,0 +1,28 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my ($tmpdir, $for_destroy) = tmpdir;
+my $sock = tcp_server;
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
+my $host_port = tcp_host_port($sock);
+undef $sock;
+test_lei({ tmpdir => $tmpdir }, sub {
+	lei_ok(qw(q bytes:1..));
+	my $out = json_utf8->decode($lei_out);
+	is_deeply($out, [ undef ], 'nothing imported, yet');
+	lei_ok('import', "imap://$host_port/t.v2.0");
+	lei_ok(qw(q bytes:1..));
+	$out = json_utf8->decode($lei_out);
+	ok(scalar(@$out) > 1, 'got imported messages');
+	is(pop @$out, undef, 'trailing JSON null element was null');
+	my %r;
+	for (@$out) { $r{ref($_)}++ }
+	is_deeply(\%r, { 'HASH' => scalar(@$out) }, 'all hashes');
+});
+done_testing;
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index 5842e19e..d2b059ad 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -23,8 +23,8 @@ test_lei(sub {
 	is_deeply($r2, $res, 'idempotent import');
 
 	rename("$md/cur/x:2,S", "$md/cur/x:2,SR") or BAIL_OUT "rename: $!";
-	ok($lei->(qw(import), $md), 'import Maildir after +answered');
-	ok($lei->(qw(q -d none s:boolean)), 'lei q after +answered');
+	lei_ok('import', "maildir:$md", \'import Maildir after +answered');
+	lei_ok(qw(q -d none s:boolean), \'lei q after +answered');
 	$res = json_utf8->decode($lei_out);
 	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
 	is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index 6a571660..72b90700 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -139,6 +139,16 @@ test_lei(sub {
 	is($res->[1], undef, 'only one result');
 });
 
+test_lei(sub {
+	lei_ok('import', "$mbox:$fn", \'imported mbox:/path') or diag $lei_err;
+	lei_ok(qw(q s:x), \'lei q works') or diag $lei_err;
+	my $res = json_utf8->decode($lei_out);
+	my $x = $res->[0];
+	is($x->{'s'}, 'x', 'subject imported') or diag $lei_out;
+	is_deeply($x->{'kw'}, ['seen'], 'kw imported') or diag $lei_out;
+	is($res->[1], undef, 'only one result');
+});
+
 for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 	my $zsfx2cmd = PublicInbox::LeiToMail->can('zsfx2cmd');
 	SKIP: {

^ permalink raw reply related	[relevance 37%]

* lei stuff that should be in a lei(1) or lei-overview(7)
@ 2021-02-18 20:28 99% Eric Wong
  2021-02-22  3:42 99% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-18 20:28 UTC (permalink / raw)
  To: meta

More random scattered thoughts

* Note bash completions in contrib/, encourage contributions for
  other shells

* Note UI/UX is a deeply personal choice, lei can't satisfy everyone

* Note Inline::C + "mkdir -p ~/.cache/public-inbox/inline-c"
  significantly improves performance.  Socket::MsgHdr
  (libsocket-msghdr-perl in Debian) further improves startup
  performance (most noticeable for bash completion)

* Don't impose or aggressively promote lei in your projects or
  communities.  That defeats the point of open standards.  Until
  AGPL acceptance grows, it's likely many users can't use lei.

* Primary author is scatter-brained, and getting worse :<

More to come...

^ permalink raw reply	[relevance 99%]

* lei q --save-as=... requires too much thinking
  @ 2021-02-18 20:42 70%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-18 20:42 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +'query' => [ 'SEARCH-TERMS...', 'search for messages matching terms', qw(

s/query/q/

> +	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a

Naming things is hard, so I don't think "lei q --save-as="
needs to exist.

Just the "-o DESTINATION" can be tracked and used to infer the
metadata used to make the saved search.  Maildir outputs can
store that information in a ".lei" subdir of the Maildir itself;
mboxes and IMAP folders will have it stored in SQLite somewhere.

As with externals, there'll be basename shortcuts, so saved
searches go into ~/lei-saved/foo/; one can just specify "foo"
the next time around as long as the basename is unique.

This follows what was done with externals: lei doesn't
allow the user to name the external like git remotes (e.g.
[remote "origin"] or [publicinbox "foo"]).  The normalized URL
or pathname is the name of the external, which saves cognitive
overhead for my feeble brain.

Of course, basename-only expansion and bash completions go a
long way towards making URLs/paths usable.
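
A minimal sketch of the basename expansion described above, not
the actual lei code.  The helper name expand_basename and the
example paths are assumptions made up for illustration; the only
rule taken from this message is that a bare basename is accepted
as long as it is unique among the known locations.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: expand a user-supplied name to one of the
# known locations (pathnames or URLs), but only if it is unambiguous.
sub expand_basename {
	my ($name, @known) = @_;
	# a full pathname or URL given verbatim always wins
	return $name if grep { $_ eq $name } @known;
	my @match = grep { basename($_) eq $name } @known;
	return $match[0] if @match == 1;
	die "$name matches multiple locations: @match\n" if @match;
	die "$name not found\n";
}

# example locations, invented for this sketch
my @known = qw(/home/user/lei-saved/foo /home/user/lei-saved/bar
		/srv/mail/bar);
print expand_basename('foo', @known), "\n";
eval { expand_basename('bar', @known) };
print "error: $@" if $@;
```

Refusing to guess when two locations share a basename keeps the
shortcut predictable; the user can always fall back to the full
pathname or URL.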

> +'ls-query' => [ '[FILTER]', 'list saved search queries',
> +		qw(name-only format|f=s z) ],
> +'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
> +'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search' ],

These would operate on pathnames and URLs.

^ permalink raw reply	[relevance 70%]

* Re: does "lei q" --format/-f need to exist?
  2021-02-18 12:07 71%   ` Eric Wong
@ 2021-02-19  3:10 71%     ` Kyle Meyer
  2021-02-19 11:13 71%       ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-19  3:10 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> How about we just drop --format from the documentation, for now?
> (or at least stop recommending it when using with -o)
>
> The stdout case might be a reason to keep it for "lei q",
> especially since stdout is the default output:
[...]

I don't feel strongly one way or the other about keeping --format, but
if it is kept around for stdout, I think it'd be good to document it
(i.e. your "stop recommending it when using with -o option").

^ permalink raw reply	[relevance 71%]

* Re: does "lei q" --format/-f need to exist?
  2021-02-19  3:10 71%     ` Kyle Meyer
@ 2021-02-19 11:13 71%       ` Eric Wong
  2021-02-19 13:47 71%         ` Kyle Meyer
  2021-02-19 19:06 71%         ` Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-02-19 11:13 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > How about we just drop --format from the documentation, for now?
> > (or at least stop recommending it when using with -o)
> >
> > The stdout case might be a reason to keep it for "lei q",
> > especially since stdout is the default output:
> [...]
> 
> I don't feel strongly one way or the other about keeping --format, but
> if it is kept around for stdout, I think it'd be good to document it
> (i.e. your "stop recommending it when using with -o option").

Alright, I think keeping it and only recommending it for stdout
(or --stdin with import) is the way to go.

"-o format:/path/name" should be encouraged for regular file and
directory args.

Do you have time to update the current manpages?  Thanks in
advance if so, otherwise I'll try to do it at some point....


On a side note, it also occurs to me some users may expect paths
like /dev/fd/[0-2] or /proc/self/fd/[0-2] to work like the
/dev/stdout handling in lei-daemon.  We'll have to account for
that in daemon mode...
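
A rough sketch of how such pathnames could be recognized, assuming
the daemon would then substitute the descriptor it received from
the client.  The helper name std_fd_for_path is hypothetical and
not an existing lei function:

```perl
use strict;
use warnings;

# Hypothetical helper: return the standard FD number (0..2) that a
# pathname refers to, or undef if it is an ordinary pathname.
my %DEV_NAME = (stdin => 0, stdout => 1, stderr => 2);
sub std_fd_for_path {
	my ($path) = @_;
	return $DEV_NAME{$1} if $path =~ m!\A/dev/(stdin|stdout|stderr)\z!;
	return $1 + 0 if $path =~ m!\A/(?:dev|proc/self)/fd/([0-2])\z!;
	return; # not one of the standard streams
}

for my $p (qw(/dev/stdout /dev/fd/2 /proc/self/fd/0 /tmp/out)) {
	my $fd = std_fd_for_path($p);
	print "$p => ", $fd // 'regular path', "\n";
}
```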

^ permalink raw reply	[relevance 71%]

* [PATCH 0/6] lei: start working on IMAP writes
@ 2021-02-19 12:09 69% Eric Wong
  2021-02-19 12:09 63% ` [PATCH 1/6] t/lei-externals: favor "-o format:$PATHNAME" over "-f" Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-19 12:09 UTC (permalink / raw)
  To: meta

Testing this will be tricky and require writable access to
an existing IMAP server via TEST_IMAP_WRITE_URL env.

I don't know if I want to do a writable IMAP server for testing
(or maybe accessible via lei somehow).

4/6 is a fix for a long-standing bug carried over from Watch.pm
(well, as long as -watch has had IMAP support, which was within
the past year, but it's been a long year :<)

Eric Wong (6):
  t/lei-externals: favor "-o format:$PATHNAME" over "-f"
  lei_to_mail: get rid of empty _post_augment_maildir
  tests: require Mail::IMAPClient for IMAP tests
  net_reader: handle single-message IMAP mailboxes
  net_writer: start implementing IMAP write support
  URIimap: overload "" to ->as_string

 MANIFEST                     |  2 ++
 lib/PublicInbox/LeiToMail.pm | 14 +++++------
 lib/PublicInbox/NetReader.pm | 47 +++++++++++++++++++++--------------
 lib/PublicInbox/NetWriter.pm | 26 +++++++++++++++++++
 lib/PublicInbox/URIimap.pm   |  1 +
 t/lei-convert.t              |  2 +-
 t/lei-externals.t            |  8 +++---
 t/lei-import-imap.t          |  2 +-
 t/net_reader-imap.t          |  2 +-
 t/uri_imap.t                 |  1 +
 xt/lei-auth-fail.t           |  1 +
 xt/net_writer-imap.t         | 48 ++++++++++++++++++++++++++++++++++++
 12 files changed, 121 insertions(+), 33 deletions(-)
 create mode 100644 lib/PublicInbox/NetWriter.pm
 create mode 100644 xt/net_writer-imap.t

^ permalink raw reply	[relevance 69%]

* [PATCH 1/6] t/lei-externals: favor "-o format:$PATHNAME" over "-f"
  2021-02-19 12:09 69% [PATCH 0/6] lei: start working on IMAP writes Eric Wong
@ 2021-02-19 12:09 63% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-19 12:09 UTC (permalink / raw)
  To: meta

It'll be less ambiguous for inputs with "lei convert" and "lei import".

cf. https://public-inbox.org/meta/20210217044032.GA17934@dcvr/
---
 lib/PublicInbox/LeiToMail.pm | 2 +-
 t/lei-externals.t            | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 8a2d9471..b90756ae 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -358,7 +358,7 @@ sub new {
 		require PublicInbox::MboxReader if $lei->{opt}->{augment};
 		(-d $dst || (-e _ && !-w _)) and die
 			"$dst exists and is not a writable file\n";
-		$self->can("eml2$fmt") or die "bad mbox --format=$fmt\n";
+		$self->can("eml2$fmt") or die "bad mbox format: $fmt\n";
 		$self->{base_type} = 'mbox';
 	} else {
 		die "bad mail --format=$fmt\n";
diff --git a/t/lei-externals.t b/t/lei-externals.t
index f61b7e52..edfbb2bf 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -117,18 +117,18 @@ test_lei(sub {
 	unlike($lei_out, qr!https://example\.com/ibx/!s,
 		'removed canonical URL');
 SKIP: {
-	ok(!$lei->(qw(q s:prefix -o /dev/null -f maildir)), 'bad maildir');
+	ok(!lei(qw(q s:prefix -o maildir:/dev/null)), 'bad maildir');
 	like($lei_err, qr!/dev/null exists and is not a directory!,
 		'error shown');
 	is($? >> 8, 1, 'errored out with exit 1');
 
-	ok(!$lei->(qw(q s:prefix -f mboxcl2 -o), $home), 'bad mbox');
+	ok(!lei(qw(q s:prefix -o), "mboxcl2:$home"), 'bad mbox');
 	like($lei_err, qr!\Q$home\E exists and is not a writable file!,
 		'error shown');
 	is($? >> 8, 1, 'errored out with exit 1');
 
-	ok(!$lei->(qw(q s:prefix -o /dev/stdout -f Mbox2)), 'bad format');
-	like($lei_err, qr/bad mbox --format=mbox2/, 'error shown');
+	ok(!lei(qw(q s:prefix -o Mbox2:/dev/stdout)), 'bad format');
+	like($lei_err, qr/bad mbox format: mbox2/, 'error shown');
 	is($? >> 8, 1, 'errored out with exit 1');
 
 	# note, on a Bourne shell users should be able to use either:

^ permalink raw reply related	[relevance 63%]

* Re: does "lei q" --format/-f need to exist?
  2021-02-19 11:13 71%       ` Eric Wong
@ 2021-02-19 13:47 71%         ` Kyle Meyer
  2021-02-19 19:06 71%         ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-19 13:47 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Do you have time to update the current manpages?  Thanks in
> advance if so, otherwise I'll try to do it at some point....

Yes, as long as getting to it in the next few days is speedy enough :)

^ permalink raw reply	[relevance 71%]

* Re: does "lei q" --format/-f need to exist?
  2021-02-19 11:13 71%       ` Eric Wong
  2021-02-19 13:47 71%         ` Kyle Meyer
@ 2021-02-19 19:06 71%         ` Eric Wong
  2021-02-20  7:12 71%           ` Kyle Meyer
  1 sibling, 1 reply; 200+ results
From: Eric Wong @ 2021-02-19 19:06 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Kyle Meyer <kyle@kyleam.com> wrote:
> > Eric Wong writes:
> > 
> > > How about we just drop --format from the documentation, for now?
> > > (or at least stop recommending it when using with -o)
> > >
> > > The stdout case might be a reason to keep it for "lei q",
> > > especially since stdout is the default output:
> > [...]
> > 
> > I don't feel strongly one way or the other about keeping --format, but
> > if it is kept around for stdout, I think it'd be good to document it
> > (i.e. your "stop recommending it when using with -o option").
> 
> Alright, I think keeping it and only recommending it for stdout
> (or --stdin with import) is the way to go.
> 
> "-o format:/path/name" should be encouraged for regular file and
> directory args.

Actually, maybe "-o $format" could be implicitly stdout?
(and "-i $format" could be implicitly stdin for convert|import)

^ permalink raw reply	[relevance 71%]

* Re: does "lei q" --format/-f need to exist?
  2021-02-19 19:06 71%         ` Eric Wong
@ 2021-02-20  7:12 71%           ` Kyle Meyer
  2021-02-20  8:07 71%             ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-20  7:12 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Eric Wong <e@80x24.org> wrote:
[...]
>> Alright, I think keeping it and only recommending it for stdout
>> (or --stdin with import) is the way to go.
>> 
>> "-o format:/path/name" should be encouraged for regular file and
>> directory args.
>
> Actually, maybe "-o $format" could be implicitly stdout?
> (and "-i $format" could be implicitly stdin for convert|import)

Hmm, true.  If we went that route, I guess the format auto-detection in
LeiOverview::detect_fmt() (currently just for maildir) would be dropped
because an -o argument without a colon would be taken as the format
rather than a destination that the format can be detected from?

^ permalink raw reply	[relevance 71%]

* Re: does "lei q" --format/-f need to exist?
  2021-02-20  7:12 71%           ` Kyle Meyer
@ 2021-02-20  8:07 71%             ` Eric Wong
  2021-02-23  3:45 51%               ` [PATCH] doc: lei: favor "-o format:$PATHNAME" over "-f" Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-20  8:07 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > Eric Wong <e@80x24.org> wrote:
> [...]
> >> Alright, I think keeping it and only recommending it for stdout
> >> (or --stdin with import) is the way to go.
> >> 
> >> "-o format:/path/name" should be encouraged for regular file and
> >> directory args.
> >
> > Actually, maybe "-o $format" could be implicitly stdout?
> > (and "-i $format" could be implicitly stdin for convert|import)
> 
> Hmm, true.  If we went that route, I guess the format auto-detection in
> LeiOverview::detect_fmt() (currently just for maildir) would be dropped
> because an -o argument without a colon would be taken as the format
> rather than a destination that the format can be detected from?

Maybe not dropped, but probably tweaked for DWIM-ness.

Maybe:

  If somebody wants a Maildir to dump JSON search results in they
  could use "-o ./json" or "-o json/" or "-o /path/to/json".

  "-o json" (no slashes or colons) would mean JSON output to stdout.


But then, "json" could be the name of an existing directory,
so if it exists...

Part of me thinks it's too magical...

On the other hand, maybe only requiring the colon: "-o json:"
is enough to disambiguate and isn't too much typing.

We'll just assume nobody would want to end a directory
with ":".  They can still use "-o maildir:/i/like/colons:"
if they really want to end a dirname with ":" for whatever
reason...
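
The colon rule above can be sketched roughly as follows.  This is
an illustration only, not what LeiOverview does today: the helper
name parse_out_arg is hypothetical, and mapping an empty
destination to /dev/stdout is an assumption about how "-o json:"
would behave.

```perl
use strict;
use warnings;

# Hypothetical helper: split an "-o" argument into (format, destination).
sub parse_out_arg {
	my ($arg) = @_;
	# URLs (e.g. imaps://host/folder): the scheme doubles as the format
	if ($arg =~ m!\A([a-z0-9\+]+)://!i) {
		return (lc $1, $arg);
	}
	# "format:" or "format:/path"; an empty path means stdout
	if ($arg =~ /\A([a-z0-9]+):(.*)\z/is) {
		my ($fmt, $dst) = (lc $1, $2);
		return ($fmt, $dst eq '' ? '/dev/stdout' : $dst);
	}
	# no colon: a plain pathname, format left to auto-detection
	return (undef, $arg);
}

for my $arg ('json:', 'maildir:/i/like/colons:', './json', 'json') {
	my ($fmt, $dst) = parse_out_arg($arg);
	printf "%-26s fmt=%s dst=%s\n", $arg, $fmt // '(detect)', $dst;
}
```

With this rule a bare "json" stays a pathname, so an existing
directory of that name is never shadowed by the format keyword.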

^ permalink raw reply	[relevance 71%]

* [PATCH 0/7] "lei q -o imaps://..." support
@ 2021-02-21  7:41 69% Eric Wong
  2021-02-21  7:41 34% ` [PATCH 2/7] lei q: support IMAP/IMAPS --output destinations Eric Wong
  2021-02-21  7:41 58% ` [PATCH 4/7] lei q: move augment into lei2mail workers Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-02-21  7:41 UTC (permalink / raw)
  To: meta

-a/--augment dedupe is now parallel for both Maildirs and IMAP
stores (probably not worth the serialization cost for mbox*).

LeiAuth remains inefficient, unfortunately; but wq_broadcast
has been added to address it in the future.

The parallelization work for IMAP for "lei q" can also be done
for "lei convert" and "lei import", but it'll probably be opt-in
in case people care about preserving UID order.

Eric Wong (7):
  inbox_writable: require PublicInbox::MdirReader
  lei q: support IMAP/IMAPS --output destinations
  ipc: add wq_broadcast
  lei q: move augment into lei2mail workers
  ipc: support setting a locked number of WQ workers
  net_reader: use and accept URIimap objects in more places
  lei2mail: parallel augment for lock-free stores

 lib/PublicInbox/IPC.pm           |  35 +++++++--
 lib/PublicInbox/InboxWritable.pm |   1 +
 lib/PublicInbox/LeiAuth.pm       |   2 +-
 lib/PublicInbox/LeiOverview.pm   |   7 +-
 lib/PublicInbox/LeiQuery.pm      |  24 +++++--
 lib/PublicInbox/LeiToMail.pm     |  93 ++++++++++++++++++++++--
 lib/PublicInbox/LeiXSearch.pm    |  48 ++++++-------
 lib/PublicInbox/NetReader.pm     |  75 +++++++++++---------
 lib/PublicInbox/NetWriter.pm     |  12 ++++
 lib/PublicInbox/WQWorker.pm      |   8 +--
 lib/PublicInbox/Watch.pm         |  11 +--
 t/ipc.t                          |  39 +++++-----
 t/lei-externals.t                |   3 +-
 xt/net_writer-imap.t             | 118 ++++++++++++++++++++++++++++---
 14 files changed, 362 insertions(+), 114 deletions(-)


^ permalink raw reply	[relevance 69%]

* [PATCH 4/7] lei q: move augment into lei2mail workers
  2021-02-21  7:41 69% [PATCH 0/7] "lei q -o imaps://..." support Eric Wong
  2021-02-21  7:41 34% ` [PATCH 2/7] lei q: support IMAP/IMAPS --output destinations Eric Wong
@ 2021-02-21  7:41 58% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-02-21  7:41 UTC (permalink / raw)
  To: meta

This is a step which will allow us to parallelize augment
on Maildir and IMAP.
---
 lib/PublicInbox/LeiToMail.pm  | 10 +++++++++-
 lib/PublicInbox/LeiXSearch.pm | 18 ++++--------------
 t/lei-externals.t             |  3 ++-
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 0e0b0a43..e5398912 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -14,6 +14,7 @@ use PublicInbox::LeiDedupe;
 use PublicInbox::OnDestroy;
 use PublicInbox::Git;
 use PublicInbox::GitAsyncCat;
+use PublicInbox::PktOp qw(pkt_do);
 use Symbol qw(gensym);
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
@@ -499,7 +500,7 @@ sub pre_augment { # fast (1 disk seek), runs in same process as post_augment
 
 sub do_augment { # slow, runs in wq worker
 	my ($self, $lei) = @_;
-	# _do_augment_maildir, _do_augment_mbox
+	# _do_augment_maildir, _do_augment_mbox, or _do_augment_imap
 	my $m = "_do_augment_$self->{base_type}";
 	$self->$m($lei);
 }
@@ -516,6 +517,13 @@ sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = delete $self->{lei};
 	$lei->lei_atfork_child;
+	if ($self->{-wq_worker_nr} == 0) {
+		local $0 = 'do_augment';
+		eval { do_augment($self, $lei) };
+		$lei->fail($@) if $@;
+		pkt_do($lei->{pkt_op_p}, '.') == 1 or
+					die "do_post_augment trigger: $!";
+	}
 	if (my $zpipe = delete $lei->{zpipe}) {
 		$lei->{1} = $zpipe->[1];
 		close $zpipe->[0];
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 10485220..a319b75f 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -99,21 +99,21 @@ sub _mset_more ($$) {
 	$size >= $mo->{limit} && (($mo->{offset} += $size) < $mo->{limit});
 }
 
-# $startq will EOF when query_prepare is done augmenting and allow
+# $startq will EOF when do_augment is done augmenting and allow
 # query_mset and query_thread_mset to proceed.
 sub wait_startq ($) {
 	my ($lei) = @_;
 	my $startq = delete $lei->{startq} or return;
 	while (1) {
-		my $n = sysread($startq, my $query_prepare_done, 1);
+		my $n = sysread($startq, my $do_augment_done, 1);
 		if (defined $n) {
 			return if $n == 0; # no MUA
-			if ($query_prepare_done eq 'q') {
+			if ($do_augment_done eq 'q') {
 				$lei->{opt}->{quiet} = 1;
 				delete $lei->{opt}->{verbose};
 				delete $lei->{-progress};
 			} else {
-				$lei->fail("$$ WTF `$query_prepare_done'");
+				$lei->fail("$$ WTF `$do_augment_done'");
 			}
 			return;
 		}
@@ -386,15 +386,6 @@ sub ipc_atfork_child {
 	$self->SUPER::ipc_atfork_child;
 }
 
-sub query_prepare { # called by wq_io_do
-	my ($self) = @_;
-	local $0 = "$0 query_prepare";
-	my $lei = $self->{lei};
-	eval { $lei->{l2m}->do_augment($lei) };
-	$lei->fail($@) if $@;
-	pkt_do($lei->{pkt_op_p}, '.') == 1 or die "do_post_augment trigger: $!"
-}
-
 sub do_query {
 	my ($self, $lei) = @_;
 	my $ops = {
@@ -433,7 +424,6 @@ sub do_query {
 	delete $lei->{pkt_op_p};
 	$l2m->wq_close(1) if $l2m;
 	$lei->event_step_init; # wait for shutdowns
-	$self->wq_io_do('query_prepare', []) if $l2m; # for augment/dedupe
 	start_query($self, $lei);
 	$self->wq_close(1); # lei_xsearch workers stop when done
 	if ($lei->{oneshot}) {
diff --git a/t/lei-externals.t b/t/lei-externals.t
index edfbb2bf..02b15232 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -186,7 +186,8 @@ SKIP: {
 		my @s = grep(/^Subject:/, $cat->());
 		is(scalar(@s), 1, "1 result in mbox$sfx");
 		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:see attachment');
-		is(grep(!/^#/, $lei_err), 0, 'no errors from augment');
+		is(grep(!/^#/, $lei_err), 0, 'no errors from augment') or
+			diag $lei_err;
 		@s = grep(/^Subject:/, my @wtf = $cat->());
 		is(scalar(@s), 2, "2 results in mbox$sfx");
 

^ permalink raw reply related	[relevance 58%]

* [PATCH 2/7] lei q: support IMAP/IMAPS --output destinations
  2021-02-21  7:41 69% [PATCH 0/7] "lei q -o imaps://..." support Eric Wong
@ 2021-02-21  7:41 34% ` Eric Wong
  2021-02-21  7:41 58% ` [PATCH 4/7] lei q: move augment into lei2mail workers Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-02-21  7:41 UTC (permalink / raw)
  To: meta

Augment (and dedupe) aren't parallel, yet, so it's more sensitive to
high-latency networks.
---
 lib/PublicInbox/LeiAuth.pm     |   2 +-
 lib/PublicInbox/LeiOverview.pm |   7 +-
 lib/PublicInbox/LeiQuery.pm    |  18 ++++-
 lib/PublicInbox/LeiToMail.pm   |  56 +++++++++++++++-
 lib/PublicInbox/NetReader.pm   |   7 +-
 lib/PublicInbox/NetWriter.pm   |  12 ++++
 xt/net_writer-imap.t           | 118 ++++++++++++++++++++++++++++++---
 7 files changed, 202 insertions(+), 18 deletions(-)

diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 7acb9900..bf0110ed 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -63,7 +63,7 @@ sub ipc_atfork_child {
 }
 
 sub new {
-	my ($cls, $nrd) = @_;
+	my ($cls, $nrd) = @_; # nrd may be NetReader or descendant (NetWriter)
 	bless { nrd => $nrd }, $cls;
 }
 
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 3169bae6..4db1d8c8 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -59,7 +59,12 @@ sub new {
 
 	my $fmt = $opt->{$ofmt_key};
 	$fmt = lc($fmt) if defined $fmt;
-	if ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. Maildir:/home/user/Mail/
+	if ($dst =~ m!\A([a-z0-9\+]+)://!is) {
+		defined($fmt) and return $lei->fail(<<"");
+--$ofmt_key=$fmt invalid with URL $dst
+
+		$fmt = lc $1;
+	} elsif ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. Maildir:/home/user/Mail/
 		my $ofmt = lc $1;
 		$fmt //= $ofmt;
 		return $lei->fail(<<"") if $fmt ne $ofmt;
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index f71beae6..eaf91f2e 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -11,14 +11,26 @@ sub prep_ext { # externals_each callback
 	$lxs->prepare_external($loc) unless $exclude->{$loc};
 }
 
-sub qstr_add { # for --stdin
+sub _start_query {
+	my ($self) = @_;
+	if (my $nwr = $self->{nwr}) {
+		require PublicInbox::LeiAuth;
+		my $auth = $self->{auth} = PublicInbox::LeiAuth->new($nwr);
+		my $lxs = $self->{lxs};
+		$auth->auth_start($self, $lxs->can('do_query'), $lxs, $self);
+	} else {
+		$self->{lxs}->do_query($self);
+	}
+}
+
+sub qstr_add { # PublicInbox::InputPipe::consume callback for --stdin
 	my ($self) = @_; # $_[1] = $rbuf
 	if (defined($_[1])) {
 		$_[1] eq '' and return eval {
 			my $lse = delete $self->{lse};
 			$lse->query_approxidate($lse->git,
 						$self->{mset_opt}->{qstr});
-			$self->{lxs}->do_query($self);
+			_start_query($self);
 		};
 		$self->{mset_opt}->{qstr} .= $_[1];
 	} else {
@@ -115,7 +127,7 @@ no query allowed on command-line with --stdin
 		return;
 	}
 	$mset_opt{qstr} = $lse->query_argv_to_string($lse->git, \@argv);
-	$lxs->do_query($self);
+	_start_query($self);
 }
 
 # shell completion helper called by lei__complete
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index e89cca71..0e0b0a43 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -331,9 +331,31 @@ sub _maildir_write_cb ($$) {
 	}
 }
 
+sub _imap_write_cb ($$) {
+	my ($self, $lei) = @_;
+	my $dedupe = $lei->{dedupe};
+	$dedupe->prepare_dedupe if $dedupe;
+	my $imap_append = $lei->{nwr}->can('imap_append');
+	my $mic = $lei->{nwr}->mic_get($lei->{ovv}->{dst});
+	my $folder = $self->{uri}->mailbox;
+	sub { # for git_to_mail
+		my ($bref, $smsg, $eml) = @_;
+		$mic // return $lei->fail; # dst may be undef-ed in last run
+		if ($dedupe) {
+			$eml //= PublicInbox::Eml->new($$bref); # copy bref
+			return if $dedupe->is_dup($eml, $smsg->{blob});
+		}
+		eval { $imap_append->($mic, $folder, $bref, $smsg, $eml) };
+		if (my $err = $@) {
+			undef $mic;
+			die $err;
+		}
+	}
+}
+
 sub write_cb { # returns a callback for git_to_mail
 	my ($self, $lei) = @_;
-	# _mbox_write_cb or _maildir_write_cb
+	# _mbox_write_cb, _maildir_write_cb or _imap_write_cb
 	my $m = "_$self->{base_type}_write_cb";
 	$self->$m($lei);
 }
@@ -360,6 +382,18 @@ sub new {
 			"$dst exists and is not a writable file\n";
 		$self->can("eml2$fmt") or die "bad mbox format: $fmt\n";
 		$self->{base_type} = 'mbox';
+	} elsif ($fmt =~ /\Aimaps?\z/) { # TODO .onion support
+		require PublicInbox::NetWriter;
+		my $nwr = PublicInbox::NetWriter->new;
+		$nwr->add_url($dst);
+		$nwr->{quiet} = $lei->{opt}->{quiet};
+		my $err = $nwr->errors($dst);
+		return $lei->fail($err) if $err;
+		require PublicInbox::URIimap; # TODO: URI cast early
+		$self->{uri} = PublicInbox::URIimap->new($dst);
+		$self->{uri}->mailbox or die "No mailbox: $dst";
+		$lei->{nwr} = $nwr;
+		$self->{base_type} = 'imap';
 	} else {
 		die "bad mail --format=$fmt\n";
 	}
@@ -394,6 +428,26 @@ sub _do_augment_maildir {
 	}
 }
 
+sub _augment_imap { # PublicInbox::NetReader::imap_each cb
+	my ($url, $uid, $kw, $eml, $lei) = @_;
+	_augment($eml, $lei);
+}
+
+sub _do_augment_imap {
+	my ($self, $lei) = @_;
+	my $dst = $lei->{ovv}->{dst};
+	my $nwr = $lei->{nwr};
+	if ($lei->{opt}->{augment}) {
+		my $dedupe = $lei->{dedupe};
+		if ($dedupe && $dedupe->prepare_dedupe) {
+			$nwr->imap_each($dst, \&_augment_imap, $lei);
+			$dedupe->pause_dedupe;
+		}
+	} else { # clobber existing IMAP folder
+		$nwr->imap_delete_all($dst);
+	}
+}
+
 sub _pre_augment_mbox {
 	my ($self, $lei) = @_;
 	my $dst = $lei->{ovv}->{dst};
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 92d004bc..541094a0 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -422,8 +422,13 @@ sub _imap_fetch_all ($$$) {
 # uses cached auth info prepared by mic_for
 sub mic_get {
 	my ($self, $sec) = @_;
-	my $mic_arg = $self->{mic_arg}->{$sec} or
+	my $mic_arg = $self->{mic_arg}->{$sec};
+	unless ($mic_arg) {
+		my $uri = PublicInbox::URIimap->new($sec);
+		$sec = uri_section($uri);
+		$mic_arg = $self->{mic_arg}->{$sec} or
 			die "BUG: no Mail::IMAPClient->new arg for $sec";
+	}
 	if (defined(my $cb_name = $mic_arg->{Authcallback})) {
 		if (ref($cb_name) ne 'CODE') {
 			$mic_arg->{Authcallback} = $self->can($cb_name);
diff --git a/lib/PublicInbox/NetWriter.pm b/lib/PublicInbox/NetWriter.pm
index 6f0a0b94..89f8662e 100644
--- a/lib/PublicInbox/NetWriter.pm
+++ b/lib/PublicInbox/NetWriter.pm
@@ -23,4 +23,16 @@ sub imap_append {
 		die "APPEND $folder: $@";
 }
 
+sub imap_delete_all {
+	my ($self, $url) = @_;
+	my $uri = PublicInbox::URIimap->new($url);
+	my $sec = $self->can('uri_section')->($uri);
+	local $0 = $uri->mailbox." $sec";
+	my $mic = $self->mic_get($sec) or die "E: not connected: $@";
+	$mic->select($uri->mailbox) or return; # non-existent
+	if ($mic->delete_message('1:*')) {
+		$mic->expunge;
+	}
+}
+
 1;
diff --git a/xt/net_writer-imap.t b/xt/net_writer-imap.t
index dfd765be..4832245a 100644
--- a/xt/net_writer-imap.t
+++ b/xt/net_writer-imap.t
@@ -7,6 +7,7 @@ use POSIX qw(strftime);
 use PublicInbox::OnDestroy;
 use PublicInbox::URIimap;
 use PublicInbox::Config;
+use Fcntl qw(O_EXCL O_WRONLY O_CREAT);
 my $imap_url = $ENV{TEST_IMAP_WRITE_URL} or
 	plan skip_all => 'TEST_IMAP_WRITE_URL unset';
 my $uri = PublicInbox::URIimap->new($imap_url);
@@ -19,30 +20,125 @@ my ($base) = ($0 =~ m!\b([^/]+)\.[^\.]+\z!);
 my $folder = "INBOX.$base-$host-".strftime('%Y%m%d%H%M%S', gmtime(time)).
 		"-$$-".sprintf('%x', int(rand(0xffffffff)));
 my $nwr = PublicInbox::NetWriter->new;
-$imap_url .= '/' unless substr($imap_url, -1) eq '/';
+chop($imap_url) if substr($imap_url, -1) eq '/';
 my $folder_uri = PublicInbox::URIimap->new("$imap_url/$folder");
 is($folder_uri->mailbox, $folder, 'folder correct') or
 		BAIL_OUT "BUG: bad $$uri";
 $nwr->add_url($$folder_uri);
 is($nwr->errors, undef, 'no errors');
 $nwr->{pi_cfg} = bless {}, 'PublicInbox::Config';
-my $mics = $nwr->imap_common_init;
+
+my $set_cred_helper = sub {
+	my ($f, $cred_set) = @_;
+	sysopen(my $fh, $f, O_CREAT|O_EXCL|O_WRONLY) or BAIL_OUT "open $f: $!";
+	print $fh <<EOF or BAIL_OUT "print $f: $!";
+[credential]
+	helper = $cred_set
+EOF
+	close $fh or BAIL_OUT "close $f: $!";
+};
+
+# allow testers with git-credential-store configured to reuse
+# stored credentials inside test_lei(sub {...}) when $ENV{HOME}
+# is overridden and localized.
+my ($cred_set, @cred_link, $tmpdir, $for_destroy);
+chomp(my $cred_helper = `git config credential.helper 2>/dev/null`);
+if ($cred_helper eq 'store') {
+	my $config = $ENV{XDG_CONFIG_HOME} // "$ENV{HOME}/.config";
+	for my $f ("$ENV{HOME}/.git-credentials", "$config/git/credentials") {
+		next unless -f $f;
+		@cred_link = ($f, '/.git-credentials');
+		last;
+	}
+	$cred_set = qq("$cred_helper");
+} elsif ($cred_helper =~ /\Acache(?:[ \t]|\z)/) {
+	my $cache = $ENV{XDG_CACHE_HOME} // "$ENV{HOME}/.cache";
+	for my $d ("$ENV{HOME}/.git-credential-cache",
+			"$cache/git/credential") {
+		next unless -d $d;
+		@cred_link = ($d, '/.git-credential-cache');
+		$cred_set = qq("$cred_helper");
+		last;
+	}
+} elsif (!$cred_helper) { # make the test less painful if no creds configured
+	($tmpdir, $for_destroy) = tmpdir;
+	my $d = "$tmpdir/.git-credential-cache";
+	mkdir($d, 0700) or BAIL_OUT $!;
+	$cred_set = "cache --timeout=60";
+	@cred_link = ($d, '/.git-credential-cache');
+} else {
+	diag "credential.helper=$cred_helper will not be used for this test";
+}
+
+my $mics = do {
+	local $ENV{HOME} = $tmpdir // $ENV{HOME};
+	if ($tmpdir && $cred_set) {
+		$set_cred_helper->("$ENV{HOME}/.gitconfig", $cred_set)
+	}
+	$nwr->imap_common_init;
+};
 my $mic = (values %$mics)[0];
-my $cleanup = PublicInbox::OnDestroy->new(sub {
+my $cleanup = PublicInbox::OnDestroy->new($$, sub {
+	my $mic = $nwr->mic_get($imap_url);
 	$mic->delete($folder) or fail "delete $folder <$folder_uri>: $@";
+	if ($tmpdir && -f "$tmpdir/.gitconfig") {
+		local $ENV{HOME} = $tmpdir;
+		system(qw(git credential-cache exit));
+	}
 });
 my $imap_append = $nwr->can('imap_append');
 my $smsg = bless { kw => [ 'seen' ] }, 'PublicInbox::Smsg';
 $imap_append->($mic, $folder, undef, $smsg, eml_load('t/plack-qp.eml'));
-my @res;
 $nwr->{quiet} = 1;
-$nwr->imap_each($$folder_uri, sub {
-	my ($u, $uid, $kw, $eml, $arg) = @_;
-	push @res, [ $kw, $eml ];
-});
-is(scalar(@res), 1, 'got appended message');
-is_deeply(\@res, [ [ [ 'seen' ], eml_load('t/plack-qp.eml') ] ],
+my $imap_slurp_all = sub {
+	my ($u, $uid, $kw, $eml, $res) = @_;
+	push @$res, [ $kw, $eml ];
+};
+$nwr->imap_each($$folder_uri, $imap_slurp_all, my $res = []);
+is(scalar(@$res), 1, 'got appended message');
+my $plack_qp_eml = eml_load('t/plack-qp.eml');
+is_deeply($res, [ [ [ 'seen' ], $plack_qp_eml ] ],
 	'uploaded message read back');
+$res = $mic = $mics = undef;
+
+test_lei(sub {
+	my ($ro_home, $cfg_path) = setup_public_inboxes;
+	my $cfg = PublicInbox::Config->new($cfg_path);
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		lei_ok qw(add-external -q), $ibx->{inboxdir} or BAIL_OUT;
+	});
+
+	# cred_link[0] may be on a different (hopefully encrypted) FS,
+	# we only symlink to it here, so we don't copy any sensitive data
+	# into the temporary directory
+	if (@cred_link && !symlink($cred_link[0], $ENV{HOME}.$cred_link[1])) {
+		diag "symlink @cred_link: $! (non-fatal)";
+		$cred_set = undef;
+	}
+	$set_cred_helper->("$ENV{HOME}/.gitconfig", $cred_set) if $cred_set;
+
+	lei_ok qw(q f:qp@example.com -o), $$folder_uri;
+	$nwr->imap_each($$folder_uri, $imap_slurp_all, my $res = []);
+	is(scalar(@$res), 1, 'got one deduped result') or diag explain($res);
+	is_deeply($res->[0]->[1], $plack_qp_eml,
+			'lei q wrote expected result');
+
+	lei_ok qw(q f:matz -a -o), $$folder_uri;
+	$nwr->imap_each($$folder_uri, $imap_slurp_all, my $aug = []);
+	is(scalar(@$aug), 2, '2 results after augment') or diag explain($aug);
+	my $exp = $res->[0]->[1]->as_string;
+	is(scalar(grep { $_->[1]->as_string eq $exp } @$aug), 1,
+			'original remains after augment');
+	$exp = eml_load('t/iso-2202-jp.eml')->as_string;
+	is(scalar(grep { $_->[1]->as_string eq $exp } @$aug), 1,
+			'new result shown after augment');
+
+	lei_ok qw(q s:thisbetternotgiveanyresult -o), $folder_uri->as_string;
+	$nwr->imap_each($$folder_uri, $imap_slurp_all, my $empty = []);
+	is(scalar(@$empty), 0, 'no results w/o augment');
+
+});
 
-undef $cleanup;
+undef $cleanup; # remove temporary folder
 done_testing;

^ permalink raw reply related	[relevance 34%]
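
The $set_cred_helper sub in the test above relies on
O_CREAT|O_EXCL so that an existing file is never clobbered.  A
minimal sketch of that behavior follows; create_excl and the
temporary path are hypothetical names for illustration only and
are not part of public-inbox:

```perl
use strict;
use warnings;
use Fcntl qw(O_EXCL O_WRONLY O_CREAT);
use File::Temp qw(tempdir);

# scratch directory, removed when the script exits
my $dir = tempdir(CLEANUP => 1);
my $f = "$dir/.gitconfig";

# Hypothetical helper: returns 1 if this call created $path,
# 0 if sysopen failed (e.g. because the file already exists).
sub create_excl {
	my ($path, $content) = @_;
	sysopen(my $fh, $path, O_CREAT|O_EXCL|O_WRONLY) or return 0;
	print $fh $content or die "print $path: $!";
	close $fh or die "close $path: $!";
	1;
}

my $first = create_excl($f, "[credential]\n\thelper = cache --timeout=60\n");
my $second = create_excl($f, "clobbered\n"); # refused: file exists
print "first=$first second=$second\n";
```

The second call fails instead of overwriting the file, which is
why the test can BAIL_OUT when a stale .gitconfig is present.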

* [PATCH] lei-daemon: prefer graceful shutdowns
@ 2021-02-21 18:28 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-21 18:28 UTC (permalink / raw)
  To: meta

We'll keep the daemon alive as long as a script/lei client
remains connected.  This ought to improve user experience
and is in line with what -imapd/-httpd/-nntpd users have
expected over the years.
---
 lib/PublicInbox/LEI.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0b4bc20e..8d49b212 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1101,6 +1101,8 @@ sub lazy_start {
 	exit($exit_code // 0);
 }
 
+sub busy { 1 } # prevent daemon-shutdown if client is connected
+
 # for users w/o Socket::Msghdr installed or Inline::C enabled
 sub oneshot {
 	my ($main_pkg) = @_;

^ permalink raw reply related	[relevance 71%]
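
The patch above only adds a busy() method returning true.  How
the daemon consults it is not shown here, so the following is a
sketch under the assumption that a graceful shutdown waits while
any tracked object reports itself busy.  The Sketch::* packages
and can_exit_now are hypothetical and not public-inbox code:

```perl
use strict;
use warnings;

# Hypothetical stand-ins: a connected client is always busy,
# an idle object never is.
package Sketch::Client { sub new { bless {}, shift } sub busy { 1 } }
package Sketch::Idle   { sub new { bless {}, shift } sub busy { 0 } }

package main;

# Assumed policy: exiting is allowed only if nothing is busy.
sub can_exit_now {
	my (@objs) = @_;
	for my $o (@objs) {
		return 0 if $o->can('busy') && $o->busy;
	}
	1;
}

my @idle_only = (Sketch::Idle->new);
my @with_client = (Sketch::Idle->new, Sketch::Client->new);
print can_exit_now(@idle_only) ? "exit\n" : "wait\n";
print can_exit_now(@with_client) ? "exit\n" : "wait\n";
```

With this policy, a single connected client is enough to defer
the shutdown until it disconnects.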

* [PATCH] t/lei*: drop $lei->(...) sub
@ 2021-02-21 19:59 37% Eric Wong
  2021-02-21 20:42 71% ` [SQUASH 2/1] t/lei-externals: squash fix Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-21 19:59 UTC (permalink / raw)
  To: meta

lei() and lei_ok() are superior since they offer prototype
checks and lei_ok() adds another check + description DRY-ness.

The $lei sub was only bound to a variable since it was in
t/lei.t and named subs don't work well with the key2sub()
wrapper.
---
 lib/PublicInbox/TestCommon.pm | 13 +++++----
 t/lei-daemon.t                | 14 +++++-----
 t/lei-externals.t             | 51 +++++++++++++++++------------------
 t/lei-import-maildir.t        |  8 +++---
 t/lei-import.t                | 26 +++++++++---------
 t/lei-mirror.t                | 16 +++++------
 t/lei.t                       | 50 +++++++++++++++++-----------------
 7 files changed, 88 insertions(+), 90 deletions(-)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 3eb08e9f..ca05fa21 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -16,7 +16,7 @@ BEGIN {
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
 		have_xapian_compact json_utf8 setup_public_inboxes
 		tcp_host_port test_lei lei lei_ok
-		$lei $lei_out $lei_err $lei_opt);
+		$lei_out $lei_err $lei_opt);
 	require Test::More;
 	my @methods = grep(!/\W/, @Test::More::EXPORT);
 	eval(join('', map { "*$_=\\&Test::More::$_;" } @methods));
@@ -446,7 +446,8 @@ sub have_xapian_compact () {
 }
 
 our ($err_skip, $lei_opt, $lei_out, $lei_err);
-our $lei = sub {
+# favor lei() or lei_ok() over $lei for new code
+sub lei (@) {
 	my ($cmd, $env, $xopt) = @_;
 	$lei_out = $lei_err = '';
 	if (!ref($cmd)) {
@@ -459,8 +460,6 @@ our $lei = sub {
 	$res;
 };
 
-sub lei (@) { $lei->(@_) }
-
 sub lei_ok (@) {
 	my $msg = ref($_[-1]) eq 'SCALAR' ? pop(@_) : undef;
 	my $tmpdir = quotemeta(File::Spec->tmpdir);
@@ -510,11 +509,11 @@ EOM
 		mkdir($xrd, 0700) or BAIL_OUT "mkdir: $!";
 		local $ENV{XDG_RUNTIME_DIR} = $xrd;
 		$cb->();
-		ok($lei->(qw(daemon-pid)), "daemon-pid after $t");
+		lei_ok(qw(daemon-pid), \"daemon-pid after $t");
 		chomp($daemon_pid = $lei_out);
 		if ($daemon_pid) {
 			ok(kill(0, $daemon_pid), "daemon running after $t");
-			ok($lei->(qw(daemon-kill)), "daemon-kill after $t");
+			lei_ok(qw(daemon-kill), \"daemon-kill after $t");
 		} else {
 			fail("daemon not running after $t");
 		}
@@ -528,7 +527,7 @@ EOM
 		local $ENV{HOME} = $home;
 		# force sun_path[108] overflow:
 		my $xrd = "$home/1shot-test".('.sun_path' x 108);
-		local $err_skip = qr!\Q$xrd!; # for $lei->() filtering
+		local $err_skip = qr!\Q$xrd!; # for lei() filtering
 		local $ENV{XDG_RUNTIME_DIR} = $xrd;
 		$cb->();
 	}
diff --git a/t/lei-daemon.t b/t/lei-daemon.t
index c55ba86c..c30e5ac1 100644
--- a/t/lei-daemon.t
+++ b/t/lei-daemon.t
@@ -6,7 +6,7 @@ use strict; use v5.10.1; use PublicInbox::TestCommon;
 test_lei({ daemon_only => 1 }, sub {
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/5.seq.sock";
 	my $err_log = "$ENV{XDG_RUNTIME_DIR}/lei/errors.log";
-	ok($lei->('daemon-pid'), 'daemon-pid');
+	lei_ok('daemon-pid');
 	is($lei_err, '', 'no error from daemon-pid');
 	like($lei_out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
 	chomp(my $pid = $lei_out);
@@ -17,12 +17,12 @@ test_lei({ daemon_only => 1 }, sub {
 	print $efh "phail\n" or BAIL_OUT $!;
 	close $efh or BAIL_OUT $!;
 
-	ok($lei->('daemon-pid'), 'daemon-pid');
+	lei_ok('daemon-pid');
 	chomp(my $pid_again = $lei_out);
 	is($pid, $pid_again, 'daemon-pid idempotent');
 	like($lei_err, qr/phail/, 'got mock "phail" error previous run');
 
-	ok($lei->(qw(daemon-kill)), 'daemon-kill');
+	lei_ok(qw(daemon-kill));
 	is($lei_out, '', 'no output from daemon-kill');
 	is($lei_err, '', 'no error from daemon-kill');
 	for (0..100) {
@@ -32,22 +32,22 @@ test_lei({ daemon_only => 1 }, sub {
 	ok(-S $sock, 'sock still exists');
 	ok(!kill(0, $pid), 'pid gone after stop');
 
-	ok($lei->(qw(daemon-pid)), 'daemon-pid');
+	lei_ok(qw(daemon-pid));
 	chomp(my $new_pid = $lei_out);
 	ok(kill(0, $new_pid), 'new pid is running');
 	ok(-S $sock, 'sock still exists');
 
 	for my $sig (qw(-0 -CHLD)) {
-		ok($lei->('daemon-kill', $sig), "handles $sig");
+		lei_ok('daemon-kill', $sig, \"handles $sig");
 	}
 	is($lei_out.$lei_err, '', 'no output on innocuous signals');
-	ok($lei->('daemon-pid'), 'daemon-pid');
+	lei_ok('daemon-pid');
 	chomp $lei_out;
 	is($lei_out, $new_pid, 'PID unchanged after -0/-CHLD');
 
 	if ('socket inaccessible') {
 		chmod 0000, $sock or BAIL_OUT "chmod 0000: $!";
-		ok($lei->('help'), 'connect fail, one-shot fallback works');
+		lei_ok('help', \'connect fail, one-shot fallback works');
 		like($lei_err, qr/\bconnect\(/, 'connect error noted');
 		like($lei_out, qr/^usage: /, 'help output works');
 		chmod 0700, $sock or BAIL_OUT "chmod 0700: $!";
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 02b15232..edaaa5f8 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -20,11 +20,11 @@ SKIP: {
 	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
 	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
 	my @cmd = ('q', '--only', $url, '-q', "m:$mid");
-	ok($lei->(@cmd), "query $url");
+	lei_ok(@cmd, \"query $url");
 	is($lei_err, '', "no errors on $url");
 	my $res = json_utf8->decode($lei_out);
 	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
-	ok($lei->(@cmd, 'd:..20101002'), 'no results, no error');
+	lei(@cmd, 'd:..20101002', \'no results, no error');
 	is($lei_err, '', 'no output on 404, matching local FS behavior');
 	is($lei_out, "[null]\n", 'got null results');
 } # /SKIP
@@ -93,27 +93,26 @@ test_lei(sub {
 			https://example https://example. https://example.co
 			https://example.com https://example.com/
 			https://example.com/i https://example.com/ibx)) {
-		ok($lei->(qw(_complete lei forget-external), $u),
-			"partial completion for URL $u");
+		lei_ok(qw(_complete lei forget-external), $u,
+			\"partial completion for URL $u");
 		is($lei_out, "https://example.com/ibx/\n",
 			"completed partial URL $u");
 		for my $qo (qw(-I --include --exclude --only)) {
-			ok($lei->(qw(_complete lei q), $qo, $u),
-				"partial completion for URL q $qo $u");
+			lei_ok(qw(_complete lei q), $qo, $u,
+				\"partial completion for URL q $qo $u");
 			is($lei_out, "https://example.com/ibx/\n",
 				"completed partial URL $u on q $qo");
 		}
 	}
-	ok($lei->(qw(_complete lei add-external), 'https://'),
-		'add-external hostname completion');
+	lei_ok(qw(_complete lei add-external), 'https://',
+		\'add-external hostname completion');
 	is($lei_out, "https://example.com/\n", 'completed up to hostname');
 
-	$lei->('ls-external');
+	lei_ok('ls-external');
 	like($lei_out, qr!https://example\.com/ibx/!s, 'added canonical URL');
 	is($lei_err, '', 'no warnings on ls-external');
-	ok($lei->(qw(forget-external -q https://EXAMPLE.com/ibx)),
-		'forget');
-	$lei->('ls-external');
+	lei_ok(qw(forget-external -q https://EXAMPLE.com/ibx));
+	lei_ok('ls-external');
 	unlike($lei_out, qr!https://example\.com/ibx/!s,
 		'removed canonical URL');
 SKIP: {
@@ -137,14 +136,14 @@ SKIP: {
 	# or use single quotes, it should not matter.  Users only need
 	# to know shell quoting rules, not Xapian quoting rules.
 	# No double-quoting should be imposed on users on the CLI
-	$lei->('q', 's:use boolean prefix');
+	lei_ok('q', 's:use boolean prefix');
 	like($lei_out, qr/search: use boolean prefix/,
 		'phrase search got result');
 	my $res = json_utf8->decode($lei_out);
 	is(scalar(@$res), 2, 'only 2 element array (1 result)');
 	is($res->[1], undef, 'final element is undef'); # XXX should this be?
 	is(ref($res->[0]), 'HASH', 'first element is hashref');
-	$lei->('q', '--pretty', 's:use boolean prefix');
+	lei_ok('q', '--pretty', 's:use boolean prefix');
 	my $pretty = json_utf8->decode($lei_out);
 	is_deeply($res, $pretty, '--pretty is identical after decode');
 
@@ -153,29 +152,29 @@ SKIP: {
 		$fh->autoflush(1);
 		print $fh 's:use d:..5.days.from.now' or BAIL_OUT $!;
 		seek($fh, 0, SEEK_SET) or BAIL_OUT $!;
-		ok($lei->([qw(q -q --stdin)], undef, { %$lei_opt, 0 => $fh }),
-				'--stdin on regular file works');
+		lei_ok([qw(q -q --stdin)], undef, { %$lei_opt, 0 => $fh },
+				\'--stdin on regular file works');
 		like($lei_out, qr/use boolean/, '--stdin on regular file');
 	}
 	{
 		pipe(my ($r, $w)) or BAIL_OUT $!;
 		print $w 's:use' or BAIL_OUT $!;
 		close $w or BAIL_OUT $!;
-		ok($lei->([qw(q -q --stdin)], undef, { %$lei_opt, 0 => $r }),
-				'--stdin on pipe file works');
+		lei_ok([qw(q -q --stdin)], undef, { %$lei_opt, 0 => $r },
+				\'--stdin on pipe file works');
 		like($lei_out, qr/use boolean prefix/, '--stdin on pipe');
 	}
-	ok(!$lei->(qw(q -q --stdin s:use)), "--stdin and argv don't mix");
+	ok(!lei(qw(q -q --stdin s:use)), "--stdin and argv don't mix");
 
 	for my $fmt (qw(ldjson ndjson jsonl)) {
-		$lei->('q', '-f', $fmt, 's:use boolean prefix');
+		lei_ok('q', '-f', $fmt, 's:use boolean prefix');
 		is($lei_out, json_utf8->encode($pretty->[0])."\n", "-f $fmt");
 	}
 
 	require IO::Uncompress::Gunzip;
 	for my $sfx ('', '.gz') {
 		my $f = "$home/mbox$sfx";
-		$lei->('q', '-o', "mboxcl2:$f", 's:use boolean prefix');
+		lei_ok('q', '-o', "mboxcl2:$f", 's:use boolean prefix');
 		my $cat = $sfx eq '' ? sub {
 			open my $mb, '<', $f or fail "no mbox: $!";
 			<$mb>
@@ -185,26 +184,26 @@ SKIP: {
 		};
 		my @s = grep(/^Subject:/, $cat->());
 		is(scalar(@s), 1, "1 result in mbox$sfx");
-		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:see attachment');
+		lei_ok('q', '-a', '-o', "mboxcl2:$f", 's:see attachment');
 		is(grep(!/^#/, $lei_err), 0, 'no errors from augment') or
 			diag $lei_err;
 		@s = grep(/^Subject:/, my @wtf = $cat->());
 		is(scalar(@s), 2, "2 results in mbox$sfx");
 
-		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
+		lei_ok('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
 		is(grep(!/^#/, $lei_err), 0, "no errors on no results ($sfx)");
 
 		my @s2 = grep(/^Subject:/, $cat->());
 		is_deeply(\@s2, \@s,
 			"same 2 old results w/ --augment and bad search $sfx");
 
-		$lei->('q', '-o', "mboxcl2:$f", 's:nonexistent');
+		lei_ok('q', '-o', "mboxcl2:$f", 's:nonexistent');
 		my @res = $cat->();
 		is_deeply(\@res, [], "clobber w/o --augment $sfx");
 	}
-	ok(!$lei->('q', '-o', "$home/mbox", 's:nope'),
+	ok(!lei('q', '-o', "$home/mbox", 's:nope'),
 			'fails if mbox format unspecified');
-	ok(!$lei->(qw(q --no-local s:see)), '--no-local');
+	ok(!lei(qw(q --no-local s:see)), '--no-local');
 	is($? >> 8, 1, 'proper exit code');
 	like($lei_err, qr/no local or remote.+? to search/, 'no inbox');
 	my %e = (
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index d2b059ad..a3796491 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -10,15 +10,15 @@ test_lei(sub {
 	}
 	symlink(abs_path('t/data/0001.patch'), "$md/cur/x:2,S") or
 		BAIL_OUT "symlink $md $!";
-	ok($lei->(qw(import), $md), 'import Maildir');
-	ok($lei->(qw(q s:boolean)), 'lei q');
+	lei_ok(qw(import), $md, \'import Maildir');
+	lei_ok(qw(q s:boolean));
 	my $res = json_utf8->decode($lei_out);
 	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
 	is_deeply($res->[0]->{kw}, ['seen'], 'keyword set');
 	is($res->[1], undef, 'only got one result');
 
-	ok($lei->(qw(import), $md), 'import Maildir again');
-	ok($lei->(qw(q -d none s:boolean)), 'lei q w/o dedupe');
+	lei_ok(qw(import), $md, \'import Maildir again');
+	lei_ok(qw(q -d none s:boolean), \'lei q w/o dedupe');
 	my $r2 = json_utf8->decode($lei_out);
 	is_deeply($r2, $res, 'idempotent import');
 
diff --git a/t/lei-import.t b/t/lei-import.t
index b691798a..46747a91 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -3,18 +3,18 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 test_lei(sub {
-ok(!$lei->(qw(import -f bogus), 't/plack-qp.eml'), 'fails with bogus format');
+ok(!lei(qw(import -f bogus), 't/plack-qp.eml'), 'fails with bogus format');
 like($lei_err, qr/\bbogus unrecognized/, 'gave error message');
 
-ok($lei->(qw(q s:boolean)), 'search miss before import');
+lei_ok(qw(q s:boolean), \'search miss before import');
 unlike($lei_out, qr/boolean/i, 'no results, yet');
 open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
-ok($lei->([qw(import -f eml -)], undef, { %$lei_opt, 0 => $fh }),
-	'import single file from stdin') or diag $lei_err;
+lei_ok([qw(import -f eml -)], undef, { %$lei_opt, 0 => $fh },
+	\'import single file from stdin') or diag $lei_err;
 close $fh;
-ok($lei->(qw(q s:boolean)), 'search hit after import');
-ok($lei->(qw(import -f eml), 't/data/message_embed.eml'),
-	'import single file by path');
+lei_ok(qw(q s:boolean), \'search hit after import');
+lei_ok(qw(import -f eml), 't/data/message_embed.eml',
+	\'import single file by path');
 
 my $str = <<'';
 From: a@b
@@ -22,17 +22,17 @@ Message-ID: <x@y>
 Status: RO
 
 my $opt = { %$lei_opt, 0 => \$str };
-ok($lei->([qw(import -f eml -)], undef, $opt),
-	'import single file with keywords from stdin');
-$lei->(qw(q m:x@y));
+lei_ok([qw(import -f eml -)], undef, $opt,
+	\'import single file with keywords from stdin');
+lei_ok(qw(q m:x@y));
 my $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
 is_deeply($res->[0]->{kw}, ['seen'], "message `seen' keyword set");
 
 $str =~ tr/x/v/; # v@y
-ok($lei->([qw(import --no-kw -f eml -)], undef, $opt),
-	'import single file with --no-kw from stdin');
-$lei->(qw(q m:v@y));
+lei_ok([qw(import --no-kw -f eml -)], undef, $opt,
+	\'import single file with --no-kw from stdin');
+lei(qw(q m:v@y));
 $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
 is_deeply($res->[0]->{kw}, [], 'no keywords set');
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index cbe300da..1d113e3e 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -13,28 +13,28 @@ my $td = start_script($cmd, { PI_CONFIG => $cfg_path }, { 3 => $sock });
 test_lei({ tmpdir => $tmpdir }, sub {
 	my $home = $ENV{HOME};
 	my $t1 = "$home/t1-mirror";
-	ok($lei->('add-external', $t1, '--mirror', "$http/t1/"), '--mirror v1');
+	lei_ok('add-external', $t1, '--mirror', "$http/t1/", \'--mirror v1');
 	ok(-f "$t1/public-inbox/msgmap.sqlite3", 't1-mirror indexed');
 
-	ok($lei->('ls-external'), 'ls-external');
+	lei_ok('ls-external');
 	like($lei_out, qr!\Q$t1\E!, 't1 added to ls-externals');
 
 	my $t2 = "$home/t2-mirror";
-	ok($lei->('add-external', $t2, '--mirror', "$http/t2/"), '--mirror v2');
+	lei_ok('add-external', $t2, '--mirror', "$http/t2/", \'--mirror v2');
 	ok(-f "$t2/msgmap.sqlite3", 't2-mirror indexed');
 
-	ok($lei->('ls-external'), 'ls-external');
+	lei_ok('ls-external');
 	like($lei_out, qr!\Q$t2\E!, 't2 added to ls-externals');
 
-	ok(!$lei->('add-external', $t2, '--mirror', "$http/t2/"),
+	ok(!lei('add-external', $t2, '--mirror', "$http/t2/"),
 		'--mirror fails if reused') or diag "$lei_err.$lei_out = $?";
 
-	ok($lei->('ls-external'), 'ls-external');
+	lei_ok('ls-external');
 	like($lei_out, qr!\Q$t2\E!, 'still in ls-externals');
 
-	ok(!$lei->('add-external', "$t2-fail", '-Lmedium'), '--mirror v2');
+	ok(!lei('add-external', "$t2-fail", '-Lmedium'), '--mirror v2');
 	ok(!-d "$t2-fail", 'destination not created on failure');
-	ok($lei->('ls-external'), 'ls-external');
+	lei_ok('ls-external');
 	unlike($lei_out, qr!\Q$t2-fail\E!, 'not added to ls-external');
 });
 
diff --git a/t/lei.t b/t/lei.t
index 4785acca..2e0b8a1f 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -13,36 +13,36 @@ my $home_trash = [];
 my $cleanup = sub { rmtree([@$home_trash, @_]) };
 
 my $test_help = sub {
-	ok(!$lei->(), 'no args fails');
+	ok(!lei([]), 'no args fails');
 	is($? >> 8, 1, '$? is 1');
 	is($lei_out, '', 'nothing in stdout');
 	like($lei_err, qr/^usage:/sm, 'usage in stderr');
 
 	for my $arg (['-h'], ['--help'], ['help'], [qw(daemon-pid --help)]) {
-		ok($lei->($arg), "lei @$arg");
+		lei_ok($arg);
 		like($lei_out, qr/^usage:/sm, "usage in stdout (@$arg)");
 		is($lei_err, '', "nothing in stderr (@$arg)");
 	}
 
 	for my $arg ([''], ['--halp'], ['halp'], [qw(daemon-pid --halp)]) {
-		ok(!$lei->($arg), "lei @$arg");
+		ok(!lei($arg), "lei @$arg");
 		is($? >> 8, 1, '$? set correctly');
 		isnt($lei_err, '', 'something in stderr');
 		is($lei_out, '', 'nothing in stdout');
 	}
-	ok($lei->(qw(init -h)), 'init -h');
+	lei_ok(qw(init -h));
 	like($lei_out, qr! \Q$home\E/\.local/share/lei/store\b!,
 		'actual path shown in init -h');
-	ok($lei->(qw(init -h), { XDG_DATA_HOME => '/XDH' }),
-		'init with XDG_DATA_HOME');
+	lei_ok(qw(init -h), { XDG_DATA_HOME => '/XDH' },
+		\'init with XDG_DATA_HOME');
 	like($lei_out, qr! /XDH/lei/store\b!, 'XDG_DATA_HOME in init -h');
 	is($lei_err, '', 'no errors from init -h');
 
-	ok($lei->(qw(config -h)), 'config-h');
+	lei_ok(qw(config -h));
 	like($lei_out, qr! \Q$home\E/\.config/lei/config\b!,
 		'actual path shown in config -h');
-	ok($lei->(qw(config -h), { XDG_CONFIG_HOME => '/XDC' }),
-		'config with XDG_CONFIG_HOME');
+	lei_ok(qw(config -h), { XDG_CONFIG_HOME => '/XDC' },
+		\'config with XDG_CONFIG_HOME');
 	like($lei_out, qr! /XDC/lei/config\b!, 'XDG_CONFIG_HOME in config -h');
 	is($lei_err, '', 'no errors from config -h');
 };
@@ -55,24 +55,24 @@ my $ok_err_info = sub {
 
 my $test_init = sub {
 	$cleanup->();
-	ok($lei->('init'), 'init w/o args');
+	lei_ok('init', \'init w/o args');
 	$ok_err_info->('after init w/o args');
-	ok($lei->('init'), 'idempotent init w/o args');
+	lei_ok('init', \'idempotent init w/o args');
 	$ok_err_info->('after idempotent init w/o args');
 
-	ok(!$lei->('init', "$home/x"), 'init conflict');
+	ok(!lei('init', "$home/x"), 'init conflict');
 	is(grep(/^E:/, split(/^/, $lei_err)), 1, 'got error on conflict');
 	ok(!-e "$home/x", 'nothing created on conflict');
 	$cleanup->();
 
-	ok($lei->('init', "$home/x"), 'init conflict resolved');
+	lei_ok('init', "$home/x", \'init conflict resolved');
 	$ok_err_info->('init w/ arg');
-	ok($lei->('init', "$home/x"), 'init idempotent w/ path');
+	lei_ok('init', "$home/x", \'init idempotent w/ path');
 	$ok_err_info->('init idempotent w/ arg');
 	ok(-d "$home/x", 'created dir');
 	$cleanup->("$home/x");
 
-	ok(!$lei->('init', "$home/x", "$home/2"), 'too many args fails');
+	ok(!lei('init', "$home/x", "$home/2"), 'too many args fails');
 	like($lei_err, qr/too many/, 'noted excessive');
 	ok(!-e "$home/x", 'x not created on excessive');
 	for my $d (@$home_trash) {
@@ -84,24 +84,24 @@ my $test_init = sub {
 
 my $test_config = sub {
 	$cleanup->();
-	ok($lei->(qw(config a.b c)), 'config set var');
+	lei_ok(qw(config a.b c), \'config set var');
 	is($lei_out.$lei_err, '', 'no output on var set');
-	ok($lei->(qw(config -l)), 'config -l');
+	lei_ok(qw(config -l), \'config -l');
 	is($lei_err, '', 'no errors on listing');
 	is($lei_out, "a.b=c\n", 'got expected output');
-	ok(!$lei->(qw(config -f), "$home/.config/f", qw(x.y z)),
+	ok(!lei(qw(config -f), "$home/.config/f", qw(x.y z)),
 			'config set var with -f fails');
 	like($lei_err, qr/not supported/, 'not supported noted');
 	ok(!-f "$home/config/f", 'no file created');
 };
 
 my $test_completion = sub {
-	ok($lei->(qw(_complete lei)), 'no errors on complete');
+	lei_ok(qw(_complete lei), \'no errors on complete');
 	my %out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	ok($out{'q'}, "`lei q' offered as completion");
 	ok($out{'add-external'}, "`lei add-external' offered as completion");
 
-	ok($lei->(qw(_complete lei q)), 'complete q (no args)');
+	lei_ok(qw(_complete lei q), \'complete q (no args)');
 	%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	for my $sw (qw(-f --format -o --output --mfolder --augment -a
 			--mua --no-local --local --verbose -v
@@ -110,17 +110,17 @@ my $test_completion = sub {
 		ok($out{$sw}, "$sw offered as `lei q' completion");
 	}
 
-	ok($lei->(qw(_complete lei q --form)), 'complete q --format');
+	lei_ok(qw(_complete lei q --form), \'complete q --format');
 	is($lei_out, "--format\n", 'complete lei q --format');
 	for my $sw (qw(-f --format)) {
-		ok($lei->(qw(_complete lei q), $sw), "complete q $sw ARG");
+		lei_ok(qw(_complete lei q), $sw);
 		%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 		for my $f (qw(mboxrd mboxcl2 mboxcl mboxo json jsonl
 				concatjson maildir)) {
 			ok($out{$f}, "got $sw $f as output format");
 		}
 	}
-	ok($lei->(qw(_complete lei import)), 'complete import');
+	lei_ok(qw(_complete lei import));
 	%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	for my $sw (qw(--flags --no-flags --no-kw --kw --no-keywords
 			--keywords)) {
@@ -131,9 +131,9 @@ my $test_completion = sub {
 my $test_fail = sub {
 SKIP: {
 	skip 'no curl', 3 unless which('curl');
-	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
+	lei(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
 	is($? >> 8, 3, 'got curl exit for bogus URL');
-	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m -o), "$home/junk");
+	lei(qw(q --only http://127.0.0.1:99999/bogus/ t:m -o), "$home/junk");
 	is($? >> 8, 3, 'got curl exit for bogus URL with Maildir');
 	is($lei_out, '', 'no output');
 }; # /SKIP

^ permalink raw reply related	[relevance 37%]
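
lei_ok() in the patch above takes its test description as a
trailing scalar ref, which is why the converted call sites pass
\'...' or \"..." as the last argument.  A minimal sketch of that
argument convention, using a hypothetical split_desc helper
(not a TestCommon function):

```perl
use strict;
use warnings;

# Hypothetical helper: a trailing SCALAR ref is the description,
# everything before it is the command.
sub split_desc {
	my @args = @_;
	my $msg = ref($args[-1]) eq 'SCALAR' ? ${pop(@args)} : undef;
	return ($msg, @args);
}

my ($msg, @cmd) = split_desc(qw(q s:boolean), \'search hit after import');
print "desc=$msg cmd=@cmd\n";

# without a scalar ref, no description is extracted
my ($none, @plain) = split_desc(qw(daemon-pid));
print defined($none) ? "desc=$none\n" : "no desc, cmd=@plain\n";
```

A plain string in the last position would be indistinguishable
from a command argument, hence the reference.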

* [SQUASH 2/1] t/lei-externals: squash fix
  2021-02-21 19:59 37% [PATCH] t/lei*: drop $lei->(...) sub Eric Wong
@ 2021-02-21 20:42 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-21 20:42 UTC (permalink / raw)
  To: meta

Use lei_ok() for the call which passes a trailing description.
---
 t/lei-externals.t | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lei-externals.t b/t/lei-externals.t
index edaaa5f8..233f6092 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -24,7 +24,7 @@ SKIP: {
 	is($lei_err, '', "no errors on $url");
 	my $res = json_utf8->decode($lei_out);
 	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
-	lei(@cmd, 'd:..20101002', \'no results, no error');
+	lei_ok(@cmd, 'd:..20101002', \'no results, no error');
 	is($lei_err, '', 'no output on 404, matching local FS behavior');
 	is($lei_out, "[null]\n", 'got null results');
 } # /SKIP

^ permalink raw reply related	[relevance 71%]

* Re: lei stuff that should be in a lei(1) or lei-overview(7)
  2021-02-18 20:28 99% lei stuff that should be in a lei(1) or lei-overview(7) Eric Wong
@ 2021-02-22  3:42 99% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22  3:42 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> More random scattered thoughts

If "lei" conflicts with an existing script or alias on your
system, consider naming it "lorelei"(*).  Of course, the
shorter name is preferred to save keystrokes.

(*) partly named after a well-known instance of public-inbox

^ permalink raw reply	[relevance 99%]

* lei: accessing blob after import requires daemon restart
@ 2021-02-22  5:37 71% Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-22  5:37 UTC (permalink / raw)
  To: meta

When playing around with lei-import, I was unable to display an mbox for
a just-imported message:

  # starting in an uninitialized state with no externals
  $ curl -d'' -fsS 'https://public-inbox.org/meta/?q=cgit+blob+solver&x=m' \
    | zcat >t.mbox
  $ lei import mboxrd:t.mbox
  $ lei q -f ldjson s:remove
  {"blob":"089cdf1af1d738068494a653532ed01b1844407d","docid":11,...}
  $ lei q -f mboxrd s:remove
  missing 089cdf1af1d738068494a653532ed01b1844407d

It's in the local repo though:

  $ git -C ~/.local/share/lei/store/local/0.git \
    rev-parse --verify 089cdf1af1d738068494a653532ed01b1844407d^{blob}
  089cdf1af1d738068494a653532ed01b1844407d

Killing the daemon and trying again resolves the issue:

  $ lei daemon-kill
  $ lei q -f mboxrd s:remove | head -1
  From 089cdf1af1d738068494a653532ed01b1844407d=99@mboxrd Thu Jan  1 00:00:00 1970

Sorry if I'm reporting a known to-do; with a quick search, I didn't spot
anything on the list or in the code, and it feels like enough of a
corner case to be worth mentioning.

^ permalink raw reply	[relevance 71%]

* [PATCH 00/10] lei: avoid wasting IMAP connections
@ 2021-02-22 11:21 69% Eric Wong
    0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-22 11:21 UTC (permalink / raw)
  To: meta

This makes the code a bit less straightforward, unfortunately;
but I've tried to comment it a bit and add some flow notes.
The payoff is that it saves IMAP connection setup costs, which
is noticeable in high-latency and/or metered bandwidth situations.

LeiAuth is significantly rewritten so it uses lei-daemon
to route credentials from the first worker to other workers.

Eric Wong (10):
  lei_auth: rename {nrd} field to {net} for clarity
  lei: keep client {sock} in short-lived workers
  lei: _lei_cfg: return empty hashref if unconfigured
  lei convert: auth directly from worker process
  lei import: no separate auth worker
  lei_auth: migrate common auth code from lei_import
  lei q: reduce wasted IMAP connection for auth
  net_reader: mic_get: reuse connections if cache enabled
  lei convert: inline convert_start
  lei_auth: trim and remove leftover worker code

 lib/PublicInbox/LEI.pm         |  8 ++--
 lib/PublicInbox/LeiAuth.pm     | 76 +++++++++++++---------------------
 lib/PublicInbox/LeiConvert.pm  | 36 +++++++---------
 lib/PublicInbox/LeiExternal.pm |  6 +--
 lib/PublicInbox/LeiImport.pm   | 60 ++++++++++++++++-----------
 lib/PublicInbox/LeiQuery.pm    |  9 ++--
 lib/PublicInbox/LeiToMail.pm   | 53 ++++++++++++++++--------
 lib/PublicInbox/LeiXSearch.pm  | 26 ++++++++----
 lib/PublicInbox/NetReader.pm   | 20 +++++----
 lib/PublicInbox/NetWriter.pm   |  2 +-
 10 files changed, 158 insertions(+), 138 deletions(-)


^ permalink raw reply	[relevance 69%]
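
The series above mentions reusing connections in mic_get when a
cache is enabled.  The real NetReader code is not shown in this
message, so the following is only a sketch of the general idea:
a hash keyed by URL which avoids repeated setup.  conn_get,
$connect and the example URL are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive connect + login
my $nr_connects = 0;
my $connect = sub {
	my ($url) = @_;
	++$nr_connects;
	return { url => $url }; # pretend this is a live connection
};

sub conn_get {
	my ($cache, $url) = @_;
	# reuse a cached connection when a cache is supplied
	return $cache->{$url} //= $connect->($url) if $cache;
	$connect->($url); # no cache: always pay the setup cost
}

my %cache;
my $url = 'imaps://example.com/INBOX';
conn_get(\%cache, $url) for 1..3;
my $cached_connects = $nr_connects;
conn_get(undef, $url) for 1..3;
my $uncached_connects = $nr_connects - $cached_connects;
print "cached=$cached_connects uncached=$uncached_connects\n";
```

Three lookups through the cache cost one connection; three
lookups without it cost three.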

* [PATCH 02/10] lei: keep client {sock} in short-lived workers
  @ 2021-02-22 11:22 68%   ` Eric Wong
  2021-02-22 11:22 65%   ` [PATCH 03/10] lei: _lei_cfg: return empty hashref if unconfigured Eric Wong
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22 11:22 UTC (permalink / raw)
  To: meta

For non-persistent workers, there's no harm in keeping the
client socket open.  This means we can avoid dancing around
closing it in PublicInbox::LeiAuth::ipc_atfork_child.
Eventually, other WQ workers will trigger "git credential"
spawning in script/lei directly.
---
 lib/PublicInbox/LEI.pm     | 4 ++--
 lib/PublicInbox/LeiAuth.pm | 3 ---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8d49b212..73c9e267 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -443,7 +443,7 @@ sub lei_atfork_child {
 	my ($self, $persist) = @_;
 	# we need to explicitly close things which are on stack
 	if ($persist) {
-		my @io = delete @$self{0,1,2};
+		my @io = delete @$self{qw(0 1 2 sock)};
 		unless ($self->{oneshot}) {
 			close($_) for @io;
 		}
@@ -451,7 +451,7 @@ sub lei_atfork_child {
 		delete $self->{0};
 	}
 	delete @$self{qw(cnv)};
-	for (delete @$self{qw(3 sock old_1 au_done)}) {
+	for (delete @$self{qw(3 old_1 au_done)}) {
 		close($_) if defined($_);
 	}
 	if (my $op_c = delete $self->{pkt_op_c}) {
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index c70d8e8f..f2cdb026 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -54,11 +54,8 @@ sub auth_start {
 
 sub ipc_atfork_child {
 	my ($self) = @_;
-	# prevent {sock} from being closed in lei_atfork_child:
-	my $s = delete $self->{lei}->{sock};
 	delete $self->{lei}->{auth}; # drop circular ref
 	$self->{lei}->lei_atfork_child;
-	$self->{lei}->{sock} = $s if $s;
 	$self->SUPER::ipc_atfork_child;
 }
 

^ permalink raw reply related	[relevance 68%]

* [PATCH 03/10] lei: _lei_cfg: return empty hashref if unconfigured
    2021-02-22 11:22 68%   ` [PATCH 02/10] lei: keep client {sock} in short-lived workers Eric Wong
@ 2021-02-22 11:22 65%   ` Eric Wong
  2021-02-22 11:22 60%   ` [PATCH 04/10] lei convert: auth directly from worker process Eric Wong
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22 11:22 UTC (permalink / raw)
  To: meta

Existing callers in LeiExternal actually depend on this,
and LeiAuth shouldn't need to be creating a config file
just to do a conversion against an anonymous IMAP server.
---
 lib/PublicInbox/LEI.pm         | 2 +-
 lib/PublicInbox/LeiAuth.pm     | 1 -
 lib/PublicInbox/LeiExternal.pm | 6 +++---
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 73c9e267..dd34c668 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -592,7 +592,7 @@ sub _lei_cfg ($;$) {
 	if (!@st) {
 		unless ($creat) {
 			delete $self->{cfg};
-			return;
+			return bless {}, 'PublicInbox::Config';
 		}
 		my (undef, $cfg_dir, undef) = File::Spec->splitpath($f);
 		-d $cfg_dir or mkpath($cfg_dir) or die "mkpath($cfg_dir): $!\n";
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index f2cdb026..5d321be2 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -42,7 +42,6 @@ sub auth_eof {
 
 sub auth_start {
 	my ($self, $lei, $post_auth_cb, @args) = @_;
-	$lei->_lei_cfg(1); # workers may need to read config
 	my $op = $lei->workers_start($self, 'auth', 1, {
 		'net_merge' => [ \&net_merge, $lei ],
 		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 6cc2e671..0cc84cca 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -9,7 +9,7 @@ use PublicInbox::Config;
 
 sub externals_each {
 	my ($self, $cb, @arg) = @_;
-	my $cfg = $self->_lei_cfg(0);
+	my $cfg = $self->_lei_cfg;
 	my %boost;
 	for my $sec (grep(/\Aexternal\./, @{$cfg->{-section_order}})) {
 		my $loc = substr($sec, length('external.'));
@@ -234,7 +234,7 @@ sub _complete_url_common ($) {
 # shell completion helper called by lei__complete
 sub _complete_forget_external {
 	my ($self, @argv) = @_;
-	my $cfg = $self->_lei_cfg(0);
+	my $cfg = $self->_lei_cfg;
 	my ($cur, $re) = _complete_url_common(\@argv);
 	# FIXME: bash completion off "http:" or "https:" when the last
 	# character is a colon doesn't work properly even if we're
@@ -250,7 +250,7 @@ sub _complete_forget_external {
 
 sub _complete_add_external { # for bash, this relies on "compopt -o nospace"
 	my ($self, @argv) = @_;
-	my $cfg = $self->_lei_cfg(0);
+	my $cfg = $self->_lei_cfg;
 	my ($cur, $re) = _complete_url_common(\@argv);
 	require URI;
 	map {

^ permalink raw reply related	[relevance 65%]

* [PATCH 04/10] lei convert: auth directly from worker process
    2021-02-22 11:22 68%   ` [PATCH 02/10] lei: keep client {sock} in short-lived workers Eric Wong
  2021-02-22 11:22 65%   ` [PATCH 03/10] lei: _lei_cfg: return empty hashref if unconfigured Eric Wong
@ 2021-02-22 11:22 60%   ` Eric Wong
  2021-02-22 11:22 53%   ` [PATCH 05/10] lei import: no separate auth worker Eric Wong
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22 11:22 UTC (permalink / raw)
  To: meta

This command has only one worker, and that worker now has
access to the script/lei {sock} for running "git credential",
so we can authenticate directly in it.
---
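Note on the mic_get change below: a rough sketch of the "reuse a
cached connection if it is still alive, otherwise rebuild" pattern.
Fake::Conn, conn_get, and the {cached}/{arg} keys are hypothetical
names for illustration only; the real code uses Mail::IMAPClient
objects and the fields shown in the diff.

```perl
use strict;
use warnings;

package Fake::Conn { # hypothetical stand-in for Mail::IMAPClient
	sub new { my ($cls, %arg) = @_; bless { %arg, up => 1 }, $cls }
	sub IsConnected { $_[0]->{up} }
}

# return a live cached connection, else build a new one from saved args
sub conn_get {
	my ($self, $sec) = @_;
	if (my $cached = $self->{cached}) {
		my $conn = $cached->{$sec};
		return $conn if $conn && $conn->IsConnected;
		delete $cached->{$sec}; # missing or stale: forget it
	}
	my $arg = $self->{arg}->{$sec} or die "BUG: no arg for $sec";
	Fake::Conn->new(%$arg);
}

my $first = Fake::Conn->new(Server => 'example');
my $net = {
	cached => { example => $first },
	arg => { example => { Server => 'example' } },
};
my $hit = conn_get($net, 'example'); # same object as $first
$first->{up} = 0; # simulate a dropped connection
my $miss = conn_get($net, 'example'); # freshly built object
```

The stale entry is dropped from the cache before reconnecting, so a
dead connection is never handed out twice.
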
 lib/PublicInbox/LeiConvert.pm | 20 +++++++++++---------
 lib/PublicInbox/NetReader.pm  | 16 +++++++++-------
 lib/PublicInbox/NetWriter.pm  |  2 +-
 3 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index ba375772..3a714502 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -32,6 +32,10 @@ sub do_convert { # via wq_do
 	my ($self) = @_;
 	my $lei = $self->{lei};
 	my $in_fmt = $lei->{opt}->{'in-format'};
+	my $mics;
+	if (my $nrd = $lei->{nrd}) { # may prompt user once
+		$nrd->{mics_cached} = $nrd->imap_common_init($lei);
+	}
 	if (my $stdin = delete $self->{0}) {
 		PublicInbox::MboxReader->$in_fmt($stdin, \&mbox_cb, $self);
 	}
@@ -120,16 +124,14 @@ sub call { # the main "lei convert" method
 		require PublicInbox::MdirReader;
 	}
 	$self->{inputs} = \@inputs;
-	return convert_start($lei) if !$nrd;
-
-	if (my $err = $nrd->errors) {
-		return $lei->fail($err);
+	if ($nrd) {
+		if (my $err = $nrd->errors) {
+			return $lei->fail($err);
+		}
+		$nrd->{quiet} = $opt->{quiet};
+		$lei->{nrd} = $nrd;
 	}
-	$nrd->{quiet} = $opt->{quiet};
-	$lei->{nrd} = $nrd;
-	require PublicInbox::LeiAuth;
-	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
-	$auth->auth_start($lei, \&convert_start, $lei);
+	convert_start($lei);
 }
 
 sub ipc_atfork_child {
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 0956d5da..c29e09c1 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -423,14 +423,16 @@ sub _imap_fetch_all ($$$) {
 
 # uses cached auth info prepared by mic_for
 sub mic_get {
-	my ($self, $sec) = @_;
-	my $mic_arg = $self->{mic_arg}->{$sec};
-	unless ($mic_arg) {
-		my $uri = ref $sec ? $sec : PublicInbox::URIimap->new($sec);
-		$sec = uri_section($uri);
-		$mic_arg = $self->{mic_arg}->{$sec} or
-			die "BUG: no Mail::IMAPClient->new arg for $sec";
+	my ($self, $uri) = @_;
+	my $sec = uri_section($uri);
+	# see if caller saved result of imap_common_init
+	if (my $cached = $self->{mics_cached}) {
+		my $mic = $cached->{$sec};
+		return $mic if $mic && $mic->IsConnected;
+		delete $cached->{$sec};
 	}
+	my $mic_arg = $self->{mic_arg}->{$sec} or
+			die "BUG: no Mail::IMAPClient->new arg for $sec";
 	if (defined(my $cb_name = $mic_arg->{Authcallback})) {
 		if (ref($cb_name) ne 'CODE') {
 			$mic_arg->{Authcallback} = $self->can($cb_name);
diff --git a/lib/PublicInbox/NetWriter.pm b/lib/PublicInbox/NetWriter.pm
index 89f8662e..c68b0669 100644
--- a/lib/PublicInbox/NetWriter.pm
+++ b/lib/PublicInbox/NetWriter.pm
@@ -28,7 +28,7 @@ sub imap_delete_all {
 	my $uri = PublicInbox::URIimap->new($url);
 	my $sec = $self->can('uri_section')->($uri);
 	local $0 = $uri->mailbox." $sec";
-	my $mic = $self->mic_get($sec) or die "E: not connected: $@";
+	my $mic = $self->mic_get($uri) or die "E: not connected: $@";
 	$mic->select($uri->mailbox) or return; # non-existent
 	if ($mic->delete_message('1:*')) {
 		$mic->expunge;

^ permalink raw reply related	[relevance 60%]

* [PATCH 05/10] lei import: no separate auth worker
                       ` (2 preceding siblings ...)
  2021-02-22 11:22 60%   ` [PATCH 04/10] lei convert: auth directly from worker process Eric Wong
@ 2021-02-22 11:22 53%   ` Eric Wong
  2021-02-22 11:22 40%   ` [PATCH 07/10] lei q: reduce wasted IMAP connection for auth Eric Wong
  2021-02-22 11:22 71%   ` [PATCH 09/10] lei convert: inline convert_start Eric Wong
  5 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22 11:22 UTC (permalink / raw)
  To: meta

We'll start sharing auth info from the first worker to the
rest of the workers via wq_broadcast.

This lays the groundwork for getting rid of LeiAuth workers for
authentication work and reducing network round trips required
for IMAP.
---
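Note on net_merge_done1 below: a minimal sketch of the counting
barrier it relies on, assuming a plain hash in place of the lei
and worker objects (the key names here are hypothetical).  Each
worker acknowledgement bumps a counter, and the real work is
only dispatched once the count matches the number of workers.

```perl
use strict;
use warnings;

# hypothetical state: three workers, none acknowledged yet
my $state = { nr_workers => 3, nr_done => 0, log => [] };

sub merge_done1 { # called once per worker acknowledgement
	my ($st) = @_;
	return if ++$st->{nr_done} != $st->{nr_workers};
	push @{$st->{log}}, 'dispatch'; # all workers acked: queue real work
}

merge_done1($state) for 1..3;
print scalar(@{$state->{log}}), "\n";
```

Using != rather than < means the dispatch step fires exactly once,
even if a stray extra acknowledgement were to arrive later.
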
 lib/PublicInbox/LeiImport.pm | 87 ++++++++++++++++++++++++++----------
 1 file changed, 63 insertions(+), 24 deletions(-)

diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 68cab12c..5e2e61af 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -8,6 +8,7 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::PktOp qw(pkt_do);
 
 sub _import_eml { # MboxReader callback
 	my ($eml, $sto, $set_kw) = @_;
@@ -28,24 +29,54 @@ sub import_done { # EOF callback for main daemon
 	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
+sub net_merge_all { # via wq_broadcast
+	my ($self, $net_new) = @_;
+	my $net = $self->{lei}->{net};
+	%$net = (%$net, %$net_new);
+	pkt_do($self->{lei}->{pkt_op_p}, 'net_merge_done1') or
+		die "pkt_op_do net_merge_done1: $!";
+}
+
+sub net_merge_continue { # first worker is done with auth
+	my ($self, $net_new) = @_;
+	$self->wq_broadcast('net_merge_all', $net_new);
+}
+
+sub net_merge_complete {
+	my ($self) = @_;
+	for my $input (@{$self->{inputs}}) {
+		$self->wq_io_do('import_path_url', [], $input);
+	}
+	$self->wq_close(1);
+}
+
+sub net_merge_done1 {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	return if ++$lei->{nr_net_merge_done} != $self->{-wq_nr_workers};
+	net_merge_complete($self);
+}
+
 sub import_start {
 	my ($lei) = @_;
 	my $self = $lei->{imp};
 	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
-	if (my $nrd = $lei->{nrd}) {
-		# $j = $nrd->net_concurrency($j); TODO
+	if (my $net = $lei->{net}) {
+		# $j = $net->net_concurrency($j); TODO
 	} else {
 		my $nproc = $self->detect_nproc;
 		$j = $nproc if $j > $nproc;
 	}
-	my $op = $lei->workers_start($self, 'lei_import', $j, {
-		'' => [ \&import_done, $lei ],
-	});
-	$self->wq_io_do('import_stdin', []) if $self->{0};
-	for my $input (@{$self->{inputs}}) {
-		$self->wq_io_do('import_path_url', [], $input);
+	my $ops = { '' => [ \&import_done, $lei ] };
+	my $auth = $lei->{auth};
+	if ($auth) {
+		$ops->{net_merge} = [ \&net_merge_continue, $self ];
+		$ops->{net_merge_done1} = [ \&net_merge_done1, $self ];
 	}
-	$self->wq_close(1);
+	$self->{-wq_nr_workers} = $j // 1; # locked
+	my $op = $lei->workers_start($self, 'lei_import', undef, $ops);
+	$self->wq_io_do('import_stdin', []) if $self->{0};
+	net_merge_complete($self) if !$auth;
 	while ($op && $op->{sock}) { $op->event_step }
 }
 
@@ -53,7 +84,7 @@ sub call { # the main "lei import" method
 	my ($cls, $lei, @inputs) = @_;
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
-	my ($nrd, @f, @d);
+	my ($net, @f, @d);
 	$lei->{opt}->{kw} //= 1;
 	my $self = $lei->{imp} = bless { inputs => \@inputs }, $cls;
 	if ($lei->{opt}->{stdin}) {
@@ -69,8 +100,8 @@ sub call { # the main "lei import" method
 		my $input_path = $input;
 		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
 			require PublicInbox::NetReader;
-			$nrd //= PublicInbox::NetReader->new;
-			$nrd->add_url($input);
+			$net //= PublicInbox::NetReader->new;
+			$net->add_url($input);
 		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
 			my $ifmt = lc $1;
 			if (($fmt // $ifmt) ne $ifmt) {
@@ -98,23 +129,31 @@ sub call { # the main "lei import" method
 		require PublicInbox::MdirReader;
 	}
 	$self->{inputs} = \@inputs;
-	return import_start($lei) if !$nrd;
-
-	if (my $err = $nrd->errors) {
-		return $lei->fail($err);
+	if ($net) {
+		if (my $err = $net->errors) {
+			return $lei->fail($err);
+		}
+		$net->{quiet} = $lei->{opt}->{quiet};
+		$lei->{net} = $net;
+		require PublicInbox::LeiAuth;
+		$lei->{auth} = PublicInbox::LeiAuth->new($net);
 	}
-	$nrd->{quiet} = $lei->{opt}->{quiet};
-	$lei->{nrd} = $nrd;
-	require PublicInbox::LeiAuth;
-	my $auth = $lei->{auth} = PublicInbox::LeiAuth->new($nrd);
-	$auth->auth_start($lei, \&import_start, $lei);
+	import_start($lei);
 }
 
 sub ipc_atfork_child {
 	my ($self) = @_;
-	delete $self->{lei}->{imp}; # drop circular ref
-	$self->{lei}->lei_atfork_child;
+	my $lei = $self->{lei};
+	delete $lei->{imp}; # drop circular ref
+	$lei->lei_atfork_child;
 	$self->SUPER::ipc_atfork_child;
+	my $net = $lei->{net};
+	if ($net && $self->{-wq_worker_nr} == 0) {
+		my $mics = $net->imap_common_init($lei);
+		PublicInbox::LeiAuth::net_merge($lei, $net);
+		$net->{mics_cached} = $mics;
+	}
+	undef;
 }
 
 sub _import_fh {
@@ -154,7 +193,7 @@ sub import_path_url {
 	my $ifmt = lc($lei->{opt}->{'format'} // '');
 	# TODO auto-detect?
 	if ($input =~ m!\A(imap|nntp)s?://!i) {
-		$lei->{nrd}->imap_each($input, \&_import_imap, $lei->{sto},
+		$lei->{net}->imap_each($input, \&_import_imap, $lei->{sto},
 					$lei->{opt}->{kw});
 		return;
 	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {

^ permalink raw reply related	[relevance 53%]

* [PATCH 09/10] lei convert: inline convert_start
                       ` (4 preceding siblings ...)
  2021-02-22 11:22 40%   ` [PATCH 07/10] lei q: reduce wasted IMAP connection for auth Eric Wong
@ 2021-02-22 11:22 71%   ` Eric Wong
  5 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22 11:22 UTC (permalink / raw)
  To: meta

Since we stopped using LeiAuth as a WQ worker, keeping this
around as a single-use sub makes no sense and wastes several
KB of memory.
---
 lib/PublicInbox/LeiConvert.pm | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index b45de4e0..4839dea4 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -62,17 +62,6 @@ sub do_convert { # via wq_do
 	delete $self->{wcb}; # commit
 }
 
-sub convert_start {
-	my ($lei) = @_;
-	my $self = $lei->{cnv};
-	my $op = $lei->workers_start($self, 'lei_convert', 1, {
-		'' => [ $lei->can('dclose'), $lei ]
-	});
-	$self->wq_io_do('do_convert', []);
-	$self->wq_close(1);
-	while ($op && $op->{sock}) { $op->event_step }
-}
-
 sub call { # the main "lei convert" method
 	my ($cls, $lei, @inputs) = @_;
 	my $opt = $lei->{opt};
@@ -131,7 +120,12 @@ sub call { # the main "lei convert" method
 		$nrd->{quiet} = $opt->{quiet};
 		$lei->{nrd} = $nrd;
 	}
-	convert_start($lei);
+	my $op = $lei->workers_start($self, 'lei_convert', 1, {
+		'' => [ $lei->can('dclose'), $lei ]
+	});
+	$self->wq_io_do('do_convert', []);
+	$self->wq_close(1);
+	while ($op && $op->{sock}) { $op->event_step }
 }
 
 sub ipc_atfork_child {

^ permalink raw reply related	[relevance 71%]

* [PATCH 07/10] lei q: reduce wasted IMAP connection for auth
                       ` (3 preceding siblings ...)
  2021-02-22 11:22 53%   ` [PATCH 05/10] lei import: no separate auth worker Eric Wong
@ 2021-02-22 11:22 40%   ` Eric Wong
  2021-02-22 11:22 71%   ` [PATCH 09/10] lei convert: inline convert_start Eric Wong
  5 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22 11:22 UTC (permalink / raw)
  To: meta

We can rework the first lei2mail worker to authenticate, and
then share auth info with the rest of the lei2mail workers.  As
with "lei import", this uses PktOp and lei-daemon to share
updated credentials between the first and subsequent l2m workers.
---
 lib/PublicInbox/LeiAuth.pm    | 37 ------------------------
 lib/PublicInbox/LeiConvert.pm |  2 +-
 lib/PublicInbox/LeiQuery.pm   |  9 ++----
 lib/PublicInbox/LeiToMail.pm  | 53 ++++++++++++++++++++++++-----------
 lib/PublicInbox/LeiXSearch.pm | 26 ++++++++++++-----
 5 files changed, 59 insertions(+), 68 deletions(-)

diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index d329eadb..b4777114 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -20,13 +20,6 @@ sub net_merge {
 	}
 }
 
-sub do_auth { # called via wq_io_do
-	my ($self) = @_;
-	my ($lei, $net) = @$self{qw(lei net)};
-	$net->imap_common_init($lei);
-	net_merge($lei, $net); # tell lei-daemon updated auth info
-}
-
 sub do_auth_atfork { # used by IPC WQ workers
 	my ($self, $wq) = @_;
 	return if $wq->{-wq_worker_nr} != 0;
@@ -63,36 +56,6 @@ sub op_merge { # prepares PktOp->pair ops
 	$ops->{net_merge_done1} = [ \&net_merge_done1, $wq ];
 }
 
-sub do_finish_auth { # dwaitpid callback
-	my ($arg, $pid) = @_;
-	my ($self, $lei, $post_auth_cb, @args) = @$arg;
-	$? ? $lei->dclose : $post_auth_cb->(@args);
-}
-
-sub auth_eof {
-	my ($lei, $post_auth_cb, @args) = @_;
-	my $self = delete $lei->{auth} or return;
-	$self->wq_wait_old(\&do_finish_auth, $lei, $post_auth_cb, @args);
-}
-
-sub auth_start {
-	my ($self, $lei, $post_auth_cb, @args) = @_;
-	my $op = $lei->workers_start($self, 'auth', 1, {
-		'net_merge' => [ \&net_merge, $lei ],
-		'' => [ \&auth_eof, $lei, $post_auth_cb, @args ],
-	});
-	$self->wq_io_do('do_auth', []);
-	$self->wq_close(1);
-	while ($op && $op->{sock}) { $op->event_step }
-}
-
-sub ipc_atfork_child {
-	my ($self) = @_;
-	delete $self->{lei}->{auth}; # drop circular ref
-	$self->{lei}->lei_atfork_child;
-	$self->SUPER::ipc_atfork_child;
-}
-
 sub new {
 	my ($cls, $net) = @_; # net may be NetReader or descendant (NetWriter)
 	bless { net => $net }, $cls;
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 3a714502..b45de4e0 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -62,7 +62,7 @@ sub do_convert { # via wq_do
 	delete $self->{wcb}; # commit
 }
 
-sub convert_start { # LeiAuth->auth_start callback
+sub convert_start {
 	my ($lei) = @_;
 	my $self = $lei->{cnv};
 	my $op = $lei->workers_start($self, 'lei_convert', 1, {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 398f834f..64c9394c 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -13,14 +13,11 @@ sub prep_ext { # externals_each callback
 
 sub _start_query {
 	my ($self) = @_;
-	if (my $nwr = $self->{nwr}) {
+	if (my $net = $self->{net}) {
 		require PublicInbox::LeiAuth;
-		my $auth = $self->{auth} = PublicInbox::LeiAuth->new($nwr);
-		my $lxs = $self->{lxs};
-		$auth->auth_start($self, $lxs->can('do_query'), $lxs, $self);
-	} else {
-		$self->{lxs}->do_query($self);
+		$self->{auth} = PublicInbox::LeiAuth->new($net);
 	}
+	$self->{lxs}->do_query($self);
 }
 
 sub qstr_add { # PublicInbox::InputPipe::consume callback for --stdin
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 6efd398a..df813064 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -345,8 +345,8 @@ sub _imap_write_cb ($$) {
 	my ($self, $lei) = @_;
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe if $dedupe;
-	my $imap_append = $lei->{nwr}->can('imap_append');
-	my $mic = $lei->{nwr}->mic_get($self->{uri});
+	my $imap_append = $lei->{net}->can('imap_append');
+	my $mic = $lei->{net}->mic_get($self->{uri});
 	my $folder = $self->{uri}->mailbox;
 	sub { # for git_to_mail
 		my ($bref, $smsg, $eml) = @_;
@@ -394,15 +394,15 @@ sub new {
 		$self->{base_type} = 'mbox';
 	} elsif ($fmt =~ /\Aimaps?\z/) { # TODO .onion support
 		require PublicInbox::NetWriter;
-		my $nwr = PublicInbox::NetWriter->new;
-		$nwr->add_url($dst);
-		$nwr->{quiet} = $lei->{opt}->{quiet};
-		my $err = $nwr->errors($dst);
+		my $net = PublicInbox::NetWriter->new;
+		$net->add_url($dst);
+		$net->{quiet} = $lei->{opt}->{quiet};
+		my $err = $net->errors($dst);
 		return $lei->fail($err) if $err;
 		require PublicInbox::URIimap; # TODO: URI cast early
 		$self->{uri} = PublicInbox::URIimap->new($dst);
 		$self->{uri}->mailbox or die "No mailbox: $dst";
-		$lei->{nwr} = $nwr;
+		$lei->{net} = $net;
 		$self->{base_type} = 'imap';
 	} else {
 		die "bad mail --format=$fmt\n";
@@ -447,15 +447,16 @@ sub _augment_imap { # PublicInbox::NetReader::imap_each cb
 
 sub _do_augment_imap {
 	my ($self, $lei) = @_;
-	my $nwr = $lei->{nwr};
+	my $net = $lei->{net};
 	if ($lei->{opt}->{augment}) {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
-			$nwr->imap_each($self->{uri}, \&_augment_imap, $lei);
+			$net->imap_each($self->{uri}, \&_augment_imap, $lei);
 			$dedupe->pause_dedupe;
 		}
-	} else { # clobber existing IMAP folder
-		$nwr->imap_delete_all($self->{uri});
+	} elsif (!$self->{-wq_worker_nr}) { # undef or 0
+		# clobber existing IMAP folder
+		$net->imap_delete_all($self->{uri});
 	}
 }
 
@@ -523,16 +524,18 @@ sub post_augment {
 	$m->($self, $lei, @args);
 }
 
-sub ipc_atfork_child {
+sub do_post_auth {
 	my ($self) = @_;
-	my $lei = delete $self->{lei};
-	$lei->lei_atfork_child;
+	my $lei = $self->{lei};
+	# lei_xsearch can start as soon as all l2m workers get here
+	pkt_do($lei->{pkt_op_p}, 'incr_start_query') or
+		die "incr_start_query: $!";
 	my $aug;
 	if (lock_free($self)) {
 		my $mod = $self->{-wq_nr_workers};
 		my $shard = $self->{-wq_worker_nr};
-		if (my $nwr = $lei->{nwr}) {
-			$nwr->{shard_info} = [ $mod, $shard ];
+		if (my $net = $lei->{net}) {
+			$net->{shard_info} = [ $mod, $shard ];
 		} else { # Maildir (MH?)
 			$self->{shard_info} = [ $mod, $shard ];
 		}
@@ -545,13 +548,20 @@ sub ipc_atfork_child {
 		eval { do_augment($self, $lei) };
 		$lei->fail($@) if $@;
 		pkt_do($lei->{pkt_op_p}, $aug) == 1 or
-					die "do_post_augment trigger: $!";
+				die "do_post_augment trigger: $!";
 	}
 	if (my $zpipe = delete $lei->{zpipe}) {
 		$lei->{1} = $zpipe->[1];
 		close $zpipe->[0];
 	}
 	$self->{wcb} = $self->write_cb($lei);
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	$lei->lei_atfork_child;
+	$lei->{auth}->do_auth_atfork($self) if $lei->{auth};
 	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->SUPER::ipc_atfork_child;
 }
@@ -584,4 +594,13 @@ sub wq_atexit_child {
 	$SIG{__WARN__} = 'DEFAULT';
 }
 
+# called in top-level lei-daemon when LeiAuth is done
+sub net_merge_complete {
+	my ($self) = @_;
+	$self->wq_broadcast('do_post_auth');
+	$self->wq_close(1);
+}
+
+no warnings 'once'; # the following works even when LeiAuth is lazy-loaded
+*net_merge_all = \&PublicInbox::LeiAuth::net_merge_all;
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e982165f..6dcadf0a 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -348,7 +348,7 @@ sub do_post_augment {
 	close(delete $lei->{au_done}); # triggers wait_startq in lei_xsearch
 }
 
-sub incr_post_augment { # called whenever an l2m shard finishes
+sub incr_post_augment { # called whenever an l2m shard finishes augment
 	my ($lei) = @_;
 	my $l2m = $lei->{l2m} or die 'BUG: unexpected incr_post_augment';
 	return if ++$lei->{nr_post_augment} != $l2m->{-wq_nr_workers};
@@ -366,8 +366,8 @@ sub concurrency {
 }
 
 sub start_query { # always runs in main (lei-daemon) process
-	my ($self, $lei) = @_;
-	if ($lei->{opt}->{threads}) {
+	my ($self) = @_;
+	if ($self->{threads}) {
 		for my $ibxish (locals($self)) {
 			$self->wq_io_do('query_thread_mset', [], $ibxish);
 		}
@@ -382,6 +382,13 @@ sub start_query { # always runs in main (lei-daemon) process
 	for my $uris (@$q) {
 		$self->wq_io_do('query_remote_mboxrd', [], $uris);
 	}
+	$self->wq_close(1); # lei_xsearch workers stop when done
+}
+
+sub incr_start_query { # called whenever an l2m shard starts do_post_auth
+	my ($self, $l2m) = @_;
+	return if ++$self->{nr_start_query} != $l2m->{-wq_nr_workers};
+	start_query($self);
 }
 
 sub ipc_atfork_child {
@@ -393,6 +400,7 @@ sub ipc_atfork_child {
 
 sub do_query {
 	my ($self, $lei) = @_;
+	my $l2m = $lei->{l2m};
 	my $ops = {
 		'|' => [ $lei->can('sigpipe_handler'), $lei ],
 		'!' => [ $lei->can('fail_handler'), $lei ],
@@ -402,12 +410,13 @@ sub do_query {
 		'mset_progress' => [ \&mset_progress, $lei ],
 		'x_it' => [ $lei->can('x_it'), $lei ],
 		'child_error' => [ $lei->can('child_error'), $lei ],
+		'incr_start_query' => [ \&incr_start_query, $self, $l2m ],
 	};
+	$lei->{auth}->op_merge($ops, $l2m) if $l2m && $lei->{auth};
 	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	$lei->{1}->autoflush(1);
 	$lei->start_pager if delete $lei->{need_pager};
 	$lei->{ovv}->ovv_begin($lei);
-	my $l2m = $lei->{l2m};
 	if ($l2m) {
 		$l2m->pre_augment($lei);
 		if ($lei->{opt}->{augment} && delete $lei->{early_mua}) {
@@ -428,10 +437,13 @@ sub do_query {
 				$lei->oldset, { lei => $lei });
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
-	$l2m->wq_close(1) if $l2m;
+	$self->{threads} = $lei->{opt}->{threads};
+	if ($l2m) {
+		$l2m->net_merge_complete unless $lei->{auth};
+	} else {
+		start_query($self);
+	}
 	$lei->event_step_init; # wait for shutdowns
-	start_query($self, $lei);
-	$self->wq_close(1); # lei_xsearch workers stop when done
 	if ($lei->{oneshot}) {
 		while ($op->{sock}) { $op->event_step }
 	}

^ permalink raw reply related	[relevance 40%]

* [PATCH 2/2] lei: avoid needless env passing to subcommands
  @ 2021-02-22 21:38 53% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-22 21:38 UTC (permalink / raw)
  To: meta

We already localize %ENV before calling dispatch(), so
it's needless overhead in spawn() to be checking env for
undef values in those cases.
---
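Note on why the env argument is redundant: a small sketch, assuming
a Unix-like system and a made-up variable name (LEI_DEMO).  Once
%ENV is localized, any child process started inside that scope
inherits it without an explicit env argument, and the old
environment comes back when the scope ends.

```perl
use strict;
use warnings;

sub child_sees { # run a child perl, return its view of $ENV{LEI_DEMO}
	my $code = 'print(defined($ENV{LEI_DEMO}) ? $ENV{LEI_DEMO} : "unset")';
	open(my $fh, '-|', $^X, '-e', $code) or die "open: $!";
	my $out = do { local $/; <$fh> };
	close($fh) or die "child failed: \$?=$?";
	$out;
}

delete $ENV{LEI_DEMO};
my $inside;
{
	# localize the whole environment, as done before dispatch()
	local %ENV = (%ENV, LEI_DEMO => 'from-client');
	$inside = child_sees(); # no env argument needed
}
my $outside = child_sees(); # original environment is restored
print "$inside $outside\n";
```

This only illustrates inheritance through a localized %ENV; it does
not use PublicInbox::Spawn.
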
 lib/PublicInbox/LEI.pm        | 4 ++--
 lib/PublicInbox/LeiMirror.pm  | 6 +++---
 lib/PublicInbox/LeiToMail.pm  | 4 ++--
 lib/PublicInbox/LeiXSearch.pm | 4 ++--
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 31dbd01f..019b3152 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -838,8 +838,7 @@ sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
 # caller needs to "-t $self->{1}" to check if tty
 sub start_pager {
 	my ($self) = @_;
-	my $env = $self->{env};
-	my $fh = popen_rd([qw(git var GIT_PAGER)], $env);
+	my $fh = popen_rd([qw(git var GIT_PAGER)]);
 	chomp(my $pager = <$fh> // '');
 	close($fh) or warn "`git var PAGER' error: \$?=$?";
 	return if $pager eq 'cat' || $pager eq '';
@@ -848,6 +847,7 @@ sub start_pager {
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
 	my $pgr = [ undef, @$rdr{1, 2} ];
+	my $env = $self->{env};
 	if ($self->{sock}) { # lei(1) process runs it
 		delete @$new_env{keys %$env}; # only set iff unset
 		send_exec_cmd($self, [ @$rdr{0..2} ], [$pager], $new_env);
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index f8ca1ee5..65818796 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -32,7 +32,7 @@ sub try_scrape {
 	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
 	my $cmd = $curl->for_uri($lei, $uri, '--compressed');
 	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
-	my $fh = popen_rd($cmd, $lei->{env}, $opt);
+	my $fh = popen_rd($cmd, undef, $opt);
 	my $html = do { local $/; <$fh> } // die "read(curl $uri): $!";
 	close($fh) or return $lei->child_error($?, "@$cmd failed");
 
@@ -142,7 +142,7 @@ sub run_reap {
 	my ($lei, $cmd, $opt) = @_;
 	$lei->qerr("# @$cmd");
 	$opt->{pgid} = 0;
-	my $pid = spawn($cmd, $lei->{env}, $opt);
+	my $pid = spawn($cmd, undef, $opt);
 	my $reap = PublicInbox::OnDestroy->new($lei->can('sigint_reap'), $pid);
 	my $err = waitpid($pid, 0) == $pid ? undef : "waitpid @$cmd: $!";
 	@$reap = (); # cancel reap
@@ -205,7 +205,7 @@ sub try_manifest {
 	my $cmd = $curl->for_uri($lei, $uri);
 	$lei->qerr("# @$cmd");
 	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
-	my ($fh, $pid) = popen_rd($cmd, $lei->{env}, $opt);
+	my ($fh, $pid) = popen_rd($cmd, undef, $opt);
 	my $reap = PublicInbox::OnDestroy->new($lei->can('sigint_reap'), $pid);
 	my $gz = do { local $/; <$fh> } // die "read(curl $uri): $!";
 	close $fh;
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index df813064..d77005fa 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -219,7 +219,7 @@ sub _post_augment_mbox { # open a compressor process
 	my $cmd = zsfx2cmd($zsfx, undef, $lei);
 	my ($r, $w) = @{delete $lei->{zpipe}};
 	my $rdr = { 0 => $r, 1 => $lei->{1}, 2 => $lei->{2} };
-	my $pid = spawn($cmd, $lei->{env}, $rdr);
+	my $pid = spawn($cmd, undef, $rdr);
 	my $pp = gensym;
 	my $dup = bless { "pid.$pid" => $cmd }, ref($lei);
 	$dup->{$_} = $lei->{$_} for qw(2 sock);
@@ -232,7 +232,7 @@ sub _post_augment_mbox { # open a compressor process
 sub decompress_src ($$$) {
 	my ($in, $zsfx, $lei) = @_;
 	my $cmd = zsfx2cmd($zsfx, 1, $lei);
-	popen_rd($cmd, $lei->{env}, { 0 => $in, 2 => $lei->{2} });
+	popen_rd($cmd, undef, { 0 => $in, 2 => $lei->{2} });
 }
 
 sub dup_src ($) {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 6dcadf0a..c46aba3b 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -215,7 +215,7 @@ sub query_remote_mboxrd {
 	local $0 = "$0 query_remote_mboxrd";
 	local $SIG{TERM} = sub { exit(0) }; # for DESTROY (File::Temp, $reap)
 	my $lei = $self->{lei};
-	my ($opt, $env) = @$lei{qw(opt env)};
+	my $opt = $lei->{opt};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{threads};
 	my $verbose = $opt->{verbose};
@@ -241,7 +241,7 @@ sub query_remote_mboxrd {
 		$uri->query_form(@qform);
 		my $cmd = $curl->for_uri($lei, $uri);
 		$lei->qerr("# $cmd");
-		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
+		my ($fh, $pid) = popen_rd($cmd, undef, $rdr);
 		$reap_curl = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 		$fh = IO::Uncompress::Gunzip->new($fh);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,

^ permalink raw reply related	[relevance 53%]

* [PATCH] doc: lei: favor "-o format:$PATHNAME" over "-f"
  2021-02-20  8:07 71%             ` Eric Wong
@ 2021-02-23  3:45 51%               ` Kyle Meyer
  2021-02-23  6:03 71%                 ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-23  3:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Maybe not dropped, but probably tweaked for DWIM-ness.
>
> Maybe:
>
>   If somebody wants a Maildir to dump JSON search results in they
>   could use "-o ./json" or "-o json/" or "-o /path/to/json".
>
>   "-o json" (no slashes or colons) would mean JSON output to stdout.
>
> But then, "json" could be the name of an existing directory,
> so if it exists...
>
> Part of me thinks it's too magical...

That's kind of my feeling, though I suspect that would at least
consistently do what I mean and be unsurprising.

> On the other hand, maybe only requiring the colon: "-o json:"
> is enough to disambiguate and isn't too much typing.

Yeah, I don't mind that, but I guess that almost gets us back to "-o
json:-".  Then again, I didn't mind that either or really any of the
options proposed in this thread :)

Anyway, no matter where this lands, the manpages should switch to
using/recommending the <format>: prefix, so here's a patch for that.

-- >8 --
Subject: [PATCH] doc: lei: favor "-o format:$PATHNAME" over "-f"

The --format argument is redundant and may be dropped entirely.
Update the lei manpages to prefer the format prefix.

cf. https://public-inbox.org/meta/20210217044032.GA17934@dcvr/
---
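Note on the <format>: prefix: a minimal sketch of how a location
such as "mboxrd:t.mbox" can be split into a format and a path.
split_location is a hypothetical helper, not part of lei; the
regexp follows the one visible in the LeiImport hunks elsewhere
on this list, and URLs (imap://, nntp://) are assumed to have
been handled before this point.

```perl
use strict;
use warnings;

# split "format:path" into (format, path); format is undef if absent
sub split_location {
	my ($loc) = @_;
	# URLs must be filtered out by the caller before this runs
	return (lc($1), $loc) if $loc =~ s/\A([a-z0-9]+)://is;
	(undef, $loc);
}

for my $loc (qw(mboxrd:t.mbox MBOXCL2:/a/b path/to/Maildir)) {
	my ($fmt, $path) = split_location($loc);
	print(($fmt // '(none)'), " $path\n");
}
```

A location without a colon comes back with no format, which is
where a default (or an explicit --format) would apply.
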
 Documentation/lei-import.pod   | 10 ++++++----
 Documentation/lei-overview.pod |  4 ++--
 Documentation/lei-q.pod        | 20 ++++++++++++++------
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index 14ca2d45d6d8bfa1..2051e6bc86c5fd36 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -11,8 +11,10 @@ lei import [OPTIONS] --stdin
 =head1 DESCRIPTION
 
 Import messages into the local storage of L<lei(1)>.  C<LOCATION> is a
-source of messages: a directory (Maildir) or a file (whose format is
-specified via C<--format>).
+source of messages: a directory (Maildir) or a file.  For a regular
+file, the location must have a C<E<lt>formatE<gt>:> prefix specifying
+one of the following formats: C<eml>, C<mboxrd>, C<mboxcl2>,
+C<mboxcl>, or C<mboxo>.
 
 TODO: Update when URL support is added.
 
@@ -22,8 +24,8 @@ TODO: Update when URL support is added.
 
 =item -f MAIL_FORMAT, --format=MAIL_FORMAT
 
-Message input format: C<eml>, C<mboxrd>, C<mboxcl2>, C<mboxcl>,
-C<mboxo>.
+Message input format.  Unless messages are given on C<stdin>, using a
+format prefix with C<LOCATION> is preferred.
 
 =item --stdin
 
diff --git a/Documentation/lei-overview.pod b/Documentation/lei-overview.pod
index 840d011b27adb088..62b62280ad2ddd69 100644
--- a/Documentation/lei-overview.pod
+++ b/Documentation/lei-overview.pod
@@ -16,7 +16,7 @@ L<public-inbox-v2-format(5)>.
 
 =over
 
-=item $ lei import --format=mboxrd t.mbox
+=item $ lei import mboxrd:t.mbox
 
 Import the messages from an mbox into the local storage.
 
@@ -64,7 +64,7 @@ Search for messages whose subject includes "lei" and "skeleton".
 Do the same, but also report unmatched messages that are in the same
 thread as a matched message.
 
-=item $ lei q -t -o t.mbox -f mboxcl2 --mua=mutt s:lei s:skeleton
+=item $ lei q -t -o mboxcl2:t.mbox --mua=mutt s:lei s:skeleton
 
 Write mboxcl2-formatted results to t.mbox and enter mutt to view the
 file by invoking C<mutt -f %f>.
diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index c8df6fc7244bfae6..75fdc613579cdc18 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -26,17 +26,25 @@ Read search terms from stdin.
 
 =item -o MFOLDER, --output=MFOLDER, --mfolder=MFOLDER
 
-Destination for results (e.g., C<path/to/Maildir> or - for stdout).
+Destination for results (e.g., C<path/to/Maildir> or
+C<mboxcl2:path/to/mbox>).  The format can be specified by adding a
+C<E<lt>formatE<gt>:> prefix with any of these values: C<maildir>,
+C<mboxrd>, C<mboxcl2>, C<mboxcl>, C<mboxo>, C<json>, C<jsonl>, or
+C<concatjson>.
+
+TODO: Provide description of formats?
+
+When a format isn't specified, it's chosen based on the destination.
+C<json> is used for the default destination (stdout), and C<maildir>
+is used for an existing directory or non-existing path.
 
 Default: -
 
 =item -f FORMAT, --format=FORMAT
 
-Format of results: C<maildir>, C<mboxrd>, C<mboxcl2>, C<mboxcl>,
-C<mboxo>, C<json>, C<jsonl>, or C<concatjson>.  The default format
-used depends on C<--output>.
-
-TODO: Provide description of formats?
+Format of results.  This option exists as a convenient way to specify
+the format for the default stdout destination.  Using a C<format:>
+prefix with the C<--output> destination is preferred otherwise.
 
 =item --pretty
 

base-commit: c1ad789a90c274f9912d53bb1c7f1a3cc07cb233
-- 
2.30.1
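
For readers following the "-o" discussion, here is a minimal sketch of how a format prefix and the documented defaults could be resolved. The helper name split_output and the %OUT_FMT table are hypothetical and are not taken from lei itself. Case-insensitive prefix matching and the "undef format" result for an existing regular file are assumptions. Only the format list and the json/maildir defaults come from the lei-q.pod text in the patch above.

```perl
use strict;
use warnings;

# Formats listed in lei-q(1) for the "-o" destination.
my %OUT_FMT = map { $_ => 1 }
	qw(maildir mboxrd mboxcl2 mboxcl mboxo json jsonl concatjson);

# Hypothetical helper (not lei's API): split "-o" into (format, destination).
sub split_output {
	my ($out) = @_;
	$out //= '-'; # documented default destination: stdout
	if ($out =~ /\A([a-z0-9]+):(.*)\z/is && $OUT_FMT{lc $1}) {
		return (lc $1, $2); # explicit "<format>:" prefix wins
	}
	# no prefix: json for stdout, maildir for a directory or new path
	return ('json', '-') if $out eq '-';
	return ('maildir', $out) if -d $out || !-e $out;
	(undef, $out); # assumption: an existing regular file needs a prefix
}
```

With this sketch, split_output('mboxcl2:t.mbox') yields the pair ('mboxcl2', 't.mbox'), while a bare path that does not exist yet falls back to 'maildir'.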



* Re: [PATCH] doc: lei: favor "-o format:$PATHNAME" over "-f"
  2021-02-23  3:45 51%               ` [PATCH] doc: lei: favor "-o format:$PATHNAME" over "-f" Kyle Meyer
@ 2021-02-23  6:03 71%                 ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-23  6:03 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > Maybe not dropped, but probably tweaked for DWIM-ness.
> >
> > Maybe:
> >
> >   If somebody wants a Maildir to dump JSON search results in they
> >   could use "-o ./json" or "-o json/" or "-o /path/to/json".
> >
> >   "-o json" (no slashes or colons) would mean JSON output to stdout.
> >
> > But then, "json" could be the name of an existing directory,
> > so if it exists...
> >
> > Part of me thinks it's too magical...
> 
> That's kind of my feeling, though I suspect that would at least
> consistently do what I mean and be unsurprising.
> 
> > On the other hand, maybe only requiring the colon: "-o json:"
> > is enough to disambiguate and isn't too much typing.
> 
> Yeah, I don't mind that, but I guess that almost gets us back to "-o
> json:-".  Then again, I didn't mind that either or really any of the
> options proposed in this thread :)

I'll ponder it more while I work on some other features...
And bash completion still needs to be better in that area.

> Anyway, no matter where this lands, the manpages should switch to
> using/recommending the <format>: prefix, so here's a patch for that.

Yup, thanks, pushed as commit 56b3493c79087979f10f5a3cae7deedaf4ec9fa3


* [PATCH 0/3] lei -C DIR and more
@ 2021-02-23 10:01 71% Eric Wong
  2021-02-23 10:01 33% ` [PATCH 1/3] lei: support "-C" to chdir in all sub commands Eric Wong
  2021-02-23 10:01 71% ` [PATCH 2/3] lei q: reduce default lei2mail workers Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-02-23 10:01 UTC (permalink / raw)
  To: meta

Like git, make, and tar: "lei -C DIR" now works.

I may add "lei -c config.key=config.val" for IMAP / NNTP
support, too (working on NNTP).

Eric Wong (3):
  lei: support "-C" to chdir in all sub commands
  lei q: reduce default lei2mail workers
  lei_to_mail: remove unused OnDestroy import

 lib/PublicInbox/LEI.pm       | 74 +++++++++++++++++++++---------------
 lib/PublicInbox/LeiQuery.pm  |  6 ++-
 lib/PublicInbox/LeiToMail.pm |  1 -
 t/lei-externals.t            | 22 +++++++++++
 t/lei.t                      |  4 ++
 5 files changed, 74 insertions(+), 33 deletions(-)


* [PATCH 2/3] lei q: reduce default lei2mail workers
  2021-02-23 10:01 71% [PATCH 0/3] lei -C DIR and more Eric Wong
  2021-02-23 10:01 33% ` [PATCH 1/3] lei: support "-C" to chdir in all sub commands Eric Wong
@ 2021-02-23 10:01 71% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-02-23 10:01 UTC (permalink / raw)
  To: meta

While disk I/O is typically buffered for good scheduling,
git blob decoding uses a non-trivial amount of CPU time
and it helps to leave some CPU available for it.
---
 lib/PublicInbox/LeiQuery.pm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 214267ee..743fa3f7 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -5,6 +5,7 @@
 package PublicInbox::LeiQuery;
 use strict;
 use v5.10.1;
+use POSIX ();
 
 sub prep_ext { # externals_each callback
 	my ($lxs, $exclude, $loc) = @_;
@@ -94,7 +95,10 @@ sub lei_q {
 		return $self->fail("`$mj' writer jobs must be >= 1");
 	}
 	PublicInbox::LeiOverview->new($self) or return;
-	$self->{l2m}->{-wq_nr_workers} = ($mj // $nproc) if $self->{l2m};
+	$self->{l2m} and $self->{l2m}->{-wq_nr_workers} = $mj // do {
+		$mj = POSIX::lround($nproc * 3 / 4); # keep some CPU for git
+		$mj <= 0 ? 1 : $mj;
+	};
 
 	my %mset_opt = map { $_ => $opt->{$_} } qw(threads limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
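
The default chosen in the hunk above can be tried in isolation. This is only a sketch: default_l2m_workers is a hypothetical name, and it assumes a Perl new enough to provide POSIX::lround (5.22 or later, as far as I know). The 3/4 factor and the floor of one worker are taken from the patch.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper mirroring the default picked in lei_q.
sub default_l2m_workers {
	my ($nproc, $mj) = @_; # $mj: explicit writer jobs, if given
	return $mj if defined $mj;
	my $n = POSIX::lround($nproc * 3 / 4); # keep some CPU for git
	$n <= 0 ? 1 : $n; # never go below one worker
}

printf "%d CPUs => %d workers\n", $_, default_l2m_workers($_)
	for (1, 2, 4, 8);
```

An explicit job count still takes precedence over the computed default, matching the "$mj // do { ... }" form in the patch.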


* [PATCH 1/3] lei: support "-C" to chdir in all sub commands
  2021-02-23 10:01 71% [PATCH 0/3] lei -C DIR and more Eric Wong
@ 2021-02-23 10:01 33% ` Eric Wong
  2021-02-23 10:01 71% ` [PATCH 2/3] lei q: reduce default lei2mail workers Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-02-23 10:01 UTC (permalink / raw)
  To: meta

We'll also support "-C" at the end of most commands to give
users a little more flexibility when building command-lines.
This conflicts with "lei daemon-kill -CHLD", so that's
special-cased since "-C" makes no sense with daemon-kill,
anyways.

Unlike "git show", the to-be-implemented "lei show" will diverge
and enable "--find-copies[=<n>]" by default, so "-C[<n>]" won't
be necessary.
---
 lib/PublicInbox/LEI.pm | 74 ++++++++++++++++++++++++------------------
 t/lei-externals.t      | 22 +++++++++++++
 t/lei.t                |  4 +++
 3 files changed, 69 insertions(+), 31 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 019b3152..8cd95ac2 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -112,80 +112,81 @@ our %CMD = ( # sorted in order of importance/use:
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
-	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
+	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
-	qw(type=s solve! format|f=s dedupe|d=s threads|t remote local!),
+	qw(type=s solve! format|f=s dedupe|d=s threads|t remote local! C=s@),
 	pass_through('git show') ],
 
 'add-external' => [ 'LOCATION',
 	'add/set priority of a publicinbox|extindex for extra matches',
 	qw(boost=i c=s@ mirror=s no-torsocks torsocks=s inbox-version=i),
-	qw(quiet|q verbose|v+),
+	qw(quiet|q verbose|v+ C=s@),
 	index_opt(), PublicInbox::LeiQuery::curl_opt() ],
 'ls-external' => [ '[FILTER]', 'list publicinbox|extindex locations',
-	qw(format|f=s z|0 globoff|g invert-match|v local remote) ],
+	qw(format|f=s z|0 globoff|g invert-match|v local remote C=s@) ],
 'forget-external' => [ 'LOCATION...|--prune',
 	'exclude further results from a publicinbox|extindex',
-	qw(prune quiet|q) ],
+	qw(prune quiet|q C=s@) ],
 
 'ls-query' => [ '[FILTER...]', 'list saved search queries',
-		qw(name-only format|f=s z) ],
-'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
-'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search' ],
+		qw(name-only format|f=s z C=s@) ],
+'rm-query' => [ 'QUERY_NAME', 'remove a saved search', qw(C=s@) ],
+'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search', qw(C=s@) ],
 
 'plonk' => [ '--threads|--from=IDENT',
 	'exclude mail matching From: or threads from non-Message-ID searches',
-	qw(stdin| threads|t from|f=s mid=s oid=s) ],
+	qw(stdin| threads|t from|f=s mid=s oid=s C=s@) ],
 'mark' => [ 'MESSAGE_FLAGS...',
 	'set/unset keywords on message(s) from stdin',
-	qw(stdin| oid=s exact by-mid|mid:s) ],
+	qw(stdin| oid=s exact by-mid|mid:s C=s@) ],
 'forget' => [ '[--stdin|--oid=OID|--by-mid=MID]',
 	"exclude message(s) on stdin from `q' search results",
-	qw(stdin| oid=s exact by-mid|mid:s quiet|q) ],
+	qw(stdin| oid=s exact by-mid|mid:s quiet|q C=s@) ],
 
 'purge-mailsource' => [ 'LOCATION|--all',
 	'remove imported messages from IMAP, Maildirs, and MH',
-	qw(exact! all jobs:i indexed) ],
+	qw(exact! all jobs:i indexed C=s@) ],
 
 # code repos are used for `show' to solve blobs from patch mails
 'add-coderepo' => [ 'DIRNAME', 'add or set priority of a git code repo',
-	qw(boost=i) ],
+	qw(boost=i C=s@) ],
 'ls-coderepo' => [ '[FILTER_TERMS...]',
-		'list known code repos', qw(format|f=s z) ],
+		'list known code repos', qw(format|f=s z C=s@) ],
 'forget-coderepo' => [ 'DIRNAME',
 	'stop using repo to solve blobs from patches',
-	qw(prune) ],
+	qw(prune C=s@) ],
 
 'add-watch' => [ 'LOCATION', 'watch for new messages and flag changes',
 	qw(import! kw|keywords|flags! interval=s recursive|r
-	exclude=s include=s) ],
+	exclude=s include=s C=s@) ],
 'ls-watch' => [ '[FILTER...]', 'list active watches with numbers and status',
-		qw(format|f=s z) ],
-'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
-'resume-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
+		qw(format|f=s z C=s@) ],
+'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote C=s@) ],
+'resume-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote C=s@) ],
 'forget-watch' => [ '{WATCH_NUMBER|--prune}', 'stop and forget a watch',
-	qw(prune) ],
+	qw(prune C=s@) ],
 
 'import' => [ 'LOCATION...|--stdin',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
-	format|f=s kw|keywords|flags!),
+	format|f=s kw|keywords|flags! C=s@),
 	],
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
 	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s quiet|q
-	kw|keywords|flags!),
+	kw|keywords|flags! C=s@),
 	],
 'config' => [ '[...]', sub {
 		'git-config(1) wrapper for '._config_path($_[0]);
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
-	pass_through('git config') ],
+	 qw(C=s@), pass_through('git config') ],
 'init' => [ '[DIRNAME]', sub {
 	"initialize storage, default: "._store_path($_[0]);
-	}, qw(quiet|q) ],
+	}, qw(quiet|q C=s@) ],
 'daemon-kill' => [ '[-SIGNAL]', 'signal the lei-daemon',
+	# "-C DIR" conflicts with -CHLD, here, and chdir makes no sense, here
 	opt_dash('signal|s=s', '[0-9]+|(?:[A-Z][A-Z0-9]+)') ],
 'daemon-pid' => [ '', 'show the PID of the lei-daemon' ],
 'help' => [ '[SUBCOMMAND]', 'show help' ],
@@ -195,7 +196,7 @@ our %CMD = ( # sorted in order of importance/use:
 
 'reorder-local-store-and-break-history' => [ '[REFNAME]',
 	'rewrite git history in an attempt to improve compression',
-	'gc!' ],
+	qw(gc! C=s@) ],
 
 # internal commands are prefixed with '_'
 '_complete' => [ '[...]', 'internal shell completion helper',
@@ -214,6 +215,7 @@ my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 # we use \x{a0} (non-breaking SP) to avoid wrapping in PublicInbox::LeiHelp
 my %OPTDESC = (
 'help|h' => 'show this built-in help',
+'C=s@' => [ 'DIR', 'chdir to specified directory' ],
 'quiet|q' => 'be quiet',
 'globoff|g' => "do not match locations using '*?' wildcards ".
 		"and\xa0'[]'\x{a0}ranges",
@@ -497,7 +499,7 @@ sub optparse ($$$) {
 	# allow _complete --help to complete, not show help
 	return 1 if substr($cmd, 0, 1) eq '_';
 	$self->{cmd} = $cmd;
-	$OPT = $self->{opt} = {};
+	$OPT = $self->{opt} //= {};
 	my $info = $CMD{$cmd} // [ '[...]' ];
 	my ($proto, undef, @spec) = @$info;
 	my $glp = ref($spec[-1]) eq ref($GLP) ? pop(@spec) : $GLP;
@@ -566,15 +568,25 @@ sub dispatch {
 	local $current_lei = $self; # for __WARN__
 	dump_and_clear_log("from previous run\n");
 	return _help($self, 'no command given') unless defined($cmd);
+	while ($cmd eq '-C') { # do not support Getopt bundling for this
+		my $d = shift(@argv) // return fail($self, '-C DIRECTORY');
+		push @{$self->{opt}->{C}}, $d;
+		$cmd = shift(@argv) // return _help($self, 'no command given');
+	}
 	my $func = "lei_$cmd";
 	$func =~ tr/-/_/;
 	if (my $cb = __PACKAGE__->can($func)) {
 		optparse($self, $cmd, \@argv) or return;
+		if (my $chdir = $self->{opt}->{C}) {
+			for my $d (@$chdir) {
+				next if $d eq ''; # same as git(1)
+				chdir $d or return fail($self, "cd $d: $!");
+			}
+		}
 		$cb->($self, @argv);
 	} elsif (grep(/\A-/, $cmd, @argv)) { # --help or -h only
-		my $opt = {};
-		$GLP->getoptionsfromarray([$cmd, @argv], $opt, qw(help|h)) or
-			return _help($self, 'bad arguments or options');
+		$GLP->getoptionsfromarray([$cmd, @argv], {}, qw(help|h C=s@))
+			or return _help($self, 'bad arguments or options');
 		_help($self);
 	} else {
 		fail($self, "`$cmd' is not an lei command");
@@ -702,7 +714,7 @@ sub lei_help { _help($_[0]) }
 sub lei__complete {
 	my ($self, @argv) = @_; # argv = qw(lei and any other args...)
 	shift @argv; # ignore "lei", the entire command is sent
-	@argv or return puts $self, grep(!/^_/, keys %CMD), qw(--help -h);
+	@argv or return puts $self, grep(!/^_/, keys %CMD), qw(--help -h -C);
 	my $cmd = shift @argv;
 	my $info = $CMD{$cmd} // do { # filter matching commands
 		@argv or puts $self, grep(/\A\Q$cmd\E/, keys %CMD);
@@ -726,7 +738,7 @@ sub lei__complete {
 			# fall-through
 		}
 		# generate short/long names from Getopt::Long specs
-		puts $self, grep(/$re/, qw(--help -h), map {
+		puts $self, grep(/$re/, qw(--help -h -C), map {
 			if (s/[:=].+\z//) { # req/optional args, e.g output|o=i
 			} elsif (s/\+\z//) { # verbose|v+
 			} elsif (s/!\z//) {
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 233f6092..d422a9d1 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -4,6 +4,7 @@
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 use Fcntl qw(SEEK_SET);
 use PublicInbox::Spawn qw(which);
+use PublicInbox::OnDestroy;
 require_git 2.6;
 require_mods(qw(DBD::SQLite Search::Xapian));
 
@@ -206,6 +207,27 @@ SKIP: {
 	ok(!lei(qw(q --no-local s:see)), '--no-local');
 	is($? >> 8, 1, 'proper exit code');
 	like($lei_err, qr/no local or remote.+? to search/, 'no inbox');
+
+	{
+		opendir my $dh, '.' or BAIL_OUT "opendir(.) $!";
+		my $od = PublicInbox::OnDestroy->new($$, sub {
+			chdir $dh or BAIL_OUT "chdir: $!"
+		});
+		my @q = qw(q -o mboxcl2:rel.mboxcl2 bye);
+		lei_ok('-C', $home, @q);
+		is(unlink("$home/rel.mboxcl2"), 1, '-C works before q');
+
+		# we are more flexible than git, here:
+		lei_ok(@q, '-C', $home);
+		is(unlink("$home/rel.mboxcl2"), 1, '-C works after q');
+		mkdir "$home/deep" or BAIL_OUT $!;
+		lei_ok('-C', $home, @q, '-C', 'deep');
+		is(unlink("$home/deep/rel.mboxcl2"), 1, 'multiple -C works');
+
+		lei_ok('-C', '', '-C', $home, @q, '-C', 'deep', '-C', '');
+		is(unlink("$home/deep/rel.mboxcl2"), 1, "-C '' accepted");
+		ok(!-f "$home/rel.mboxcl2", 'wrong path not created');
+	}
 	my %e = (
 		TEST_LEI_EXTERNAL_HTTPS => 'https://public-inbox.org/meta/',
 		TEST_LEI_EXTERNAL_ONION => $onions[int(rand(scalar(@onions)))],
diff --git a/t/lei.t b/t/lei.t
index 2e0b8a1f..ba179b39 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -129,6 +129,10 @@ my $test_completion = sub {
 };
 
 my $test_fail = sub {
+	lei('q', 'whatever', '-C', '/dev/null');
+	is($? >> 8, 1, 'chdir at end fails to /dev/null');
+	lei('-C', '/dev/null', 'q', 'whatever');
+	is($? >> 8, 1, 'chdir at beginning fails to /dev/null');
 SKIP: {
 	skip 'no curl', 3 unless which('curl');
 	lei(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
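
The leading "-C" loop in dispatch and the "multiple -C" behavior exercised by the tests can be sketched without touching the process's working directory. Both function names below are hypothetical, and resolve_C computes the resulting path with File::Spec rather than calling chdir, which is an illustration choice and not what LEI.pm does.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical: peel leading "-C DIR" pairs off argv, as dispatch does.
sub split_leading_C {
	my (@argv) = @_;
	my @dirs;
	my $cmd = shift @argv;
	while (defined($cmd) && $cmd eq '-C') {
		my $d = shift(@argv) // die "-C DIRECTORY\n";
		push @dirs, $d;
		$cmd = shift @argv;
	}
	(\@dirs, $cmd, \@argv);
}

# Hypothetical: where a sequence of "-C" arguments would end up.
# Each relative directory is interpreted relative to the previous one.
sub resolve_C {
	my ($start, @dirs) = @_;
	my $cur = $start;
	for my $d (@dirs) {
		next if $d eq ''; # same as git(1): empty means no-op
		$cur = File::Spec->file_name_is_absolute($d)
			? $d : File::Spec->catdir($cur, $d);
	}
	$cur;
}
```

Trailing "-C" options (the "more flexible than git" case) are collected by Getopt::Long through the "C=s@" spec in the real code, so they end up in the same array that the chdir loop walks.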


* [PATCH 0/4] lei <import|convert> nntp://
@ 2021-02-24 11:31 71% Eric Wong
  2021-02-24 11:31 15% ` [PATCH 2/4] lei <import|convert>: support NNTP sources Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-24 11:31 UTC (permalink / raw)
  To: meta

lib/PublicInbox/ actually gets smaller with this series :>

Some -watch progress messages are prefixed with "#" (TAP-style)
instead of "I:", but they go to stderr, anyways.

t/watch_nntp.t is replaced by t/uri_nntps.t

Eric Wong (4):
  add PublicInbox::URInntps package
  lei <import|convert>: support NNTP sources
  watch: switch IMAP and NNTP fetch loops to NetReader
  net_reader: trim exports and remove unused uri_new

 MANIFEST                      |   4 +-
 lib/PublicInbox/LeiAuth.pm    |   4 +-
 lib/PublicInbox/LeiConvert.pm |  14 +-
 lib/PublicInbox/LeiImport.pm  |  12 +-
 lib/PublicInbox/NetReader.pm  | 231 +++++++++++++++----
 lib/PublicInbox/URInntps.pm   |  17 ++
 lib/PublicInbox/Watch.pm      | 417 +++++++++-------------------------
 t/imapd.t                     |   2 +-
 t/lei-convert.t               |  31 ++-
 t/lei-import-nntp.t           |  30 +++
 t/nntpd.t                     |   2 +-
 t/uri_nntps.t                 |  40 ++++
 t/watch_nntp.t                |  17 --
 13 files changed, 425 insertions(+), 396 deletions(-)
 create mode 100644 lib/PublicInbox/URInntps.pm
 create mode 100644 t/lei-import-nntp.t
 create mode 100644 t/uri_nntps.t
 delete mode 100644 t/watch_nntp.t
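
As a rough guide to which inputs this series routes where, the following sketch classifies a location string. The function classify_input is hypothetical. The three patterns are copied from the LeiImport.pm and LeiConvert.pm hunks in patch 2/4, and the 'path' fallback label is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical classifier; the regexps are the ones used in patch 2/4.
sub classify_input {
	my ($input) = @_;
	return ('imap', $input) if $input =~ m!\Aimaps?://!i;
	return ('nntp', $input) if $input =~ m!\A(?:nntps?|s?news)://!i;
	if ($input =~ s!\A([a-z0-9]+):!!i) { # e.g. mboxrd:/path/to/mbox
		return (lc $1, $input); # format prefix stripped from the path
	}
	('path', $input); # no scheme or prefix: plain filesystem location
}
```

The NNTP pattern accepts nntp://, nntps://, news:// and snews://, which matches the %IS_NNTP table in NetReader.pm.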


* [PATCH 2/4] lei <import|convert>: support NNTP sources
  2021-02-24 11:31 71% [PATCH 0/4] lei <import|convert> nntp:// Eric Wong
@ 2021-02-24 11:31 15% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-24 11:31 UTC (permalink / raw)
  To: meta

We can read NNTP in -watch and Net::NNTP is shipped with Perl5,
so lei import and convert have no excuse not to support NNTP
as a client.

Authentication is not tested, yet; but should be close to what
IMAP is like...
---
 MANIFEST                      |   2 +-
 lib/PublicInbox/LeiAuth.pm    |   4 +-
 lib/PublicInbox/LeiConvert.pm |  14 ++-
 lib/PublicInbox/LeiImport.pm  |  12 +-
 lib/PublicInbox/NetReader.pm  | 209 ++++++++++++++++++++++++++------
 lib/PublicInbox/Watch.pm      | 218 +++++++++++++---------------------
 t/lei-convert.t               |  31 +++--
 t/lei-import-nntp.t           |  30 +++++
 t/watch_nntp.t                |  17 ---
 9 files changed, 331 insertions(+), 206 deletions(-)
 create mode 100644 t/lei-import-nntp.t
 delete mode 100644 t/watch_nntp.t

diff --git a/MANIFEST b/MANIFEST
index 9cf97563..4c04eec8 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -369,6 +369,7 @@ t/lei-daemon.t
 t/lei-externals.t
 t/lei-import-imap.t
 t/lei-import-maildir.t
+t/lei-import-nntp.t
 t/lei-import.t
 t/lei-mirror.t
 t/lei.t
@@ -454,7 +455,6 @@ t/watch_imap.t
 t/watch_maildir.t
 t/watch_maildir_v2.t
 t/watch_multiple_headers.t
-t/watch_nntp.t
 t/www_altid.t
 t/www_listing.t
 t/www_static.t
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 099bdaca..927fe550 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -14,9 +14,11 @@ sub do_auth_atfork { # used by IPC WQ workers
 	my $lei = $wq->{lei};
 	my $net = $lei->{net};
 	my $mics = $net->imap_common_init($lei);
+	my $nn = $net->nntp_common_init($lei);
 	pkt_do($lei->{pkt_op_p}, 'net_merge', $net) or
 			die "pkt_do net_merge: $!";
-	$net->{mics_cached} = $mics;
+	$net->{mics_cached} = $mics if $mics;
+	$net->{nn_cached} = $nn if $nn;
 }
 
 sub net_merge_done1 { # bump merge-count in top-level lei-daemon
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 4839dea4..a7e47871 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -18,8 +18,8 @@ sub mbox_cb {
 	$self->{wcb}->(undef, { kw => \@kw }, $eml);
 }
 
-sub imap_cb { # ->imap_each
-	my ($url, $uid, $kw, $eml, $self) = @_;
+sub net_cb { # callback for ->imap_each, ->nntp_each
+	my (undef, undef, $kw, $eml, $self) = @_; # @_[0,1]: url + uid ignored
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
@@ -35,14 +35,18 @@ sub do_convert { # via wq_do
 	my $mics;
 	if (my $nrd = $lei->{nrd}) { # may prompt user once
 		$nrd->{mics_cached} = $nrd->imap_common_init($lei);
+		$nrd->{nn_cached} = $nrd->nntp_common_init($lei);
 	}
 	if (my $stdin = delete $self->{0}) {
 		PublicInbox::MboxReader->$in_fmt($stdin, \&mbox_cb, $self);
 	}
 	for my $input (@{$self->{inputs}}) {
 		my $ifmt = lc($in_fmt // '');
-		if ($input =~ m!\A(?:imap|nntp)s?://!) { # TODO: nntp
-			$lei->{nrd}->imap_each($input, \&imap_cb, $self);
+		if ($input =~ m!\Aimaps?://!) {
+			$lei->{nrd}->imap_each($input, \&net_cb, $self);
+			next;
+		} elsif ($input =~ m!\A(?:nntps?|s?news)://!) {
+			$lei->{nrd}->nntp_each($input, \&net_cb, $self);
 			next;
 		} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
 			$ifmt = lc $1;
@@ -82,7 +86,7 @@ sub call { # the main "lei convert" method
 	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
 	for my $input (@inputs) {
 		my $input_path = $input;
-		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+		if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
 			require PublicInbox::NetReader;
 			$nrd //= PublicInbox::NetReader->new;
 			$nrd->add_url($input);
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index b85f4d6c..cbfb3127 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -74,7 +74,7 @@ sub call { # the main "lei import" method
 	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
 	for my $input (@inputs) {
 		my $input_path = $input;
-		if ($input =~ m!\A(?:imap|nntp)s?://!i) {
+		if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
 			require PublicInbox::NetReader;
 			$net //= PublicInbox::NetReader->new;
 			$net->add_url($input);
@@ -152,9 +152,8 @@ sub _import_maildir { # maildir_each_file cb
 	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
 }
 
-sub _import_imap { # imap_each cb
+sub _import_net { # imap_each, nntp_each cb
 	my ($url, $uid, $kw, $eml, $sto, $set_kw) = @_;
-	warn "$url $uid";
 	$sto->ipc_do('set_eml', $eml, $set_kw ? @$kw : ());
 }
 
@@ -163,10 +162,13 @@ sub import_path_url {
 	my $lei = $self->{lei};
 	my $ifmt = lc($lei->{opt}->{'format'} // '');
 	# TODO auto-detect?
-	if ($input =~ m!\A(imap|nntp)s?://!i) {
-		$lei->{net}->imap_each($input, \&_import_imap, $lei->{sto},
+	if ($input =~ m!\Aimaps?://!i) {
+		$lei->{net}->imap_each($input, \&_import_net, $lei->{sto},
 					$lei->{opt}->{kw});
 		return;
+	} elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
+		$lei->{net}->nntp_each($input, \&_import_net, $lei->{sto}, 0);
+		return;
 	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
 		$ifmt = lc $1;
 	}
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index ff90468b..2a453217 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -11,26 +11,16 @@ use PublicInbox::Eml;
 our %IMAPflags2kw = map {; "\\\u$_" => $_ } qw(seen answered flagged draft);
 
 # TODO: trim this down, this is huge
-our @EXPORT = qw(uri_new uri_scheme uri_section
-		nn_new nn_for
-		imap_uri nntp_url
-		cfg_bool cfg_intvl imap_common_init
+our @EXPORT = qw(uri_new uri_section
+		nn_new imap_uri nntp_uri
+		cfg_bool cfg_intvl imap_common_init nntp_common_init
 		);
 
-# avoid exposing deprecated "snews" to users.
-my %SCHEME_MAP = ('snews' => 'nntps');
-
-sub uri_scheme ($) {
-	my ($uri) = @_;
-	my $scheme = $uri->scheme;
-	$SCHEME_MAP{$scheme} // $scheme;
-}
-
 # returns the git config section name, e.g [imap "imaps://user@example.com"]
 # without the mailbox, so we can share connections between different inboxes
 sub uri_section ($) {
 	my ($uri) = @_;
-	uri_scheme($uri) . '://' . $uri->authority;
+	$uri->scheme . '://' . $uri->authority;
 }
 
 sub auth_anon_cb { '' }; # for Mail::IMAPClient::Authcallback
@@ -123,8 +113,8 @@ sub try_starttls ($) {
 }
 
 sub nn_new ($$$) {
-	my ($nn_arg, $nntp_opt, $url) = @_;
-	my $nn = Net::NNTP->new(%$nn_arg) or die "E: <$url> new: $!\n";
+	my ($nn_arg, $nntp_opt, $uri) = @_;
+	my $nn = Net::NNTP->new(%$nn_arg) or die "E: <$uri> new: $!\n";
 
 	# default to using STARTTLS if it's available, but allow
 	# it to be disabled for localhost/VPN users
@@ -133,27 +123,26 @@ sub nn_new ($$$) {
 				try_starttls($nn_arg->{Host})) {
 			# soft fail by default
 			$nn->starttls or warn <<"";
-W: <$url> STARTTLS tried and failed (not requested)
+W: <$uri> STARTTLS tried and failed (not requested)
 
 		} elsif ($nntp_opt->{starttls}) {
 			# hard fail if explicitly configured
 			$nn->starttls or die <<"";
-E: <$url> STARTTLS requested and failed
+E: <$uri> STARTTLS requested and failed
 
 		}
 	} elsif ($nntp_opt->{starttls}) {
 		$nn->can('starttls') or
-			die "E: <$url> Net::NNTP too old for STARTTLS\n";
+			die "E: <$uri> Net::NNTP too old for STARTTLS\n";
 		$nn->starttls or die <<"";
-E: <$url> STARTTLS requested and failed
+E: <$uri> STARTTLS requested and failed
 
 	}
 	$nn;
 }
 
 sub nn_for ($$$;$) { # nn = Net::NNTP
-	my ($self, $url, $nn_args, $lei) = @_;
-	my $uri = uri_new($url);
+	my ($self, $uri, $nn_args, $lei) = @_;
 	my $sec = uri_section($uri);
 	my $nntp_opt = $self->{nntp_opt}->{$sec} //= {};
 	my $host = $uri->host;
@@ -165,7 +154,7 @@ sub nn_for ($$$;$) { # nn = Net::NNTP
 		require PublicInbox::GitCredential;
 		$cred = bless {
 			url => $sec,
-			protocol => uri_scheme($uri),
+			protocol => $uri->scheme,
 			host => $host,
 		}, 'PublicInbox::GitCredential';
 		($u, $p) = split(/:/, $ui, 2);
@@ -179,14 +168,13 @@ sub nn_for ($$$;$) { # nn = Net::NNTP
 		SSL => $uri->secure, # snews == nntps
 		%$common, # may Debug ....
 	};
-	my $nn = nn_new($nn_arg, $nntp_opt, $url);
-
+	my $nn = nn_new($nn_arg, $nntp_opt, $uri);
 	if ($cred) {
 		$cred->fill($lei); # may prompt user here
 		if ($nn->authinfo($u, $p)) {
 			push @{$nntp_opt->{-postconn}}, [ 'authinfo', $u, $p ];
 		} else {
-			warn "E: <$url> AUTHINFO $u XXXX failed\n";
+			warn "E: <$uri> AUTHINFO $u XXXX failed\n";
 			$nn = undef;
 		}
 	}
@@ -197,12 +185,12 @@ sub nn_for ($$$;$) { # nn = Net::NNTP
 			if ($nn->compress) {
 				push @{$nntp_opt->{-postconn}}, [ 'compress' ];
 			} else {
-				warn "W: <$url> COMPRESS failed\n";
+				warn "W: <$uri> COMPRESS failed\n";
 			}
 		} else {
 			delete $nntp_opt->{compress};
 			warn <<"";
-W: <$url> COMPRESS not supported by Net::NNTP
+W: <$uri> COMPRESS not supported by Net::NNTP
 W: see https://rt.cpan.org/Ticket/Display.html?id=129967 for updates
 
 		}
@@ -220,15 +208,12 @@ sub imap_uri {
 	$uri ? $uri->canonical : undef;
 }
 
-my %IS_NNTP = (news => 1, snews => 1, nntp => 1);
-sub nntp_url {
+my %IS_NNTP = (news => 1, snews => 1, nntp => 1, nntps => 1);
+sub nntp_uri {
 	my ($url) = @_;
-	my $uri = uri_new($url);
-	return unless $uri && $IS_NNTP{$uri->scheme} && $uri->group;
-	$url = $uri->canonical->as_string;
-	# nntps is IANA registered, snews is deprecated
-	$url =~ s!\Asnews://!nntps://!;
-	$url;
+	require PublicInbox::URInntps;
+	my $uri = PublicInbox::URInntps->new($url);
+	$uri && $IS_NNTP{$uri->scheme} && $uri->group ? $uri->canonical : undef;
 }
 
 sub cfg_intvl ($$$) {
@@ -254,6 +239,7 @@ sub cfg_bool ($$$) {
 # flesh out common IMAP-specific data structures
 sub imap_common_init ($;$) {
 	my ($self, $lei) = @_;
+	return unless $self->{imap_order};
 	$self->{quiet} = 1 if $lei && $lei->{opt}->{quiet};
 	eval { require PublicInbox::IMAPClient } or
 		die "Mail::IMAPClient is required for IMAP:\n$@\n";
@@ -297,10 +283,55 @@ sub imap_common_init ($;$) {
 	$mics;
 }
 
+# flesh out common NNTP-specific data structures
+sub nntp_common_init ($;$) {
+	my ($self, $lei) = @_;
+	return unless $self->{nntp_order};
+	$self->{quiet} = 1 if $lei && $lei->{opt}->{quiet};
+	eval { require Net::NNTP } or
+		die "Net::NNTP is required for NNTP:\n$@\n";
+	eval { require PublicInbox::IMAPTracker } or
+		die "DBD::SQLite is required for NNTP:\n$@\n";
+	my $cfg = $self->{pi_cfg} // $lei->_lei_cfg;
+	my $nn_args = {}; # scheme://authority => Net::NNTP->new arg
+	for my $uri (@{$self->{nntp_order}}) {
+		my $sec = uri_section($uri);
+
+		# Debug and Timeout are passed to Net::NNTP->new
+		my $v = cfg_bool($cfg, 'nntp.Debug', $$uri);
+		$nn_args->{$sec}->{Debug} = $v if defined $v;
+		my $to = cfg_intvl($cfg, 'nntp.Timeout', $$uri);
+		$nn_args->{$sec}->{Timeout} = $to if $to;
+
+		# Net::NNTP post-connect commands
+		for my $k (qw(starttls compress)) {
+			$v = cfg_bool($cfg, "nntp.$k", $$uri) // next;
+			$self->{nntp_opt}->{$sec}->{$k} = $v;
+		}
+
+		# internal option
+		for my $k (qw(pollInterval)) {
+			$to = cfg_intvl($cfg, "nntp.$k", $$uri) // next;
+			$self->{nntp_opt}->{$sec}->{$k} = $to;
+		}
+	}
+	# make sure we can connect and cache the credentials in memory
+	$self->{nn_arg} = {}; # scheme://authority => Net::NNTP->new args
+	my %nn; # scheme://authority => Net::NNTP object
+	for my $uri (@{$self->{nntp_order}}) {
+		my $sec = uri_section($uri);
+		$nn{$sec} //= nn_for($self, $uri, $nn_args, $lei);
+	}
+	\%nn; # for optional {nn_cached}
+}
+
 sub add_url {
 	my ($self, $arg) = @_;
-	if (my $uri = imap_uri($arg)) {
+	my $uri;
+	if ($uri = imap_uri($arg)) {
 		push @{$self->{imap_order}}, $uri;
+	} elsif ($uri = nntp_uri($arg)) {
+		push @{$self->{nntp_order}}, $uri;
 	} else {
 		push @{$self->{unsupported_url}}, $arg;
 	}
@@ -315,6 +346,10 @@ sub errors {
 		eval { require PublicInbox::IMAPClient } or
 			die "Mail::IMAPClient is required for IMAP:\n$@\n";
 	}
+	if ($self->{nntp_order}) {
+		eval { require Net::NNTP } or
+			die "Net::NNTP is required for NNTP:\n$@\n";
+	}
 	undef;
 }
 
@@ -461,6 +496,106 @@ sub imap_each {
 	$mic;
 }
 
+# may use cached auth info prepared by nn_for once
+sub nn_get {
+	my ($self, $uri) = @_;
+	my $sec = uri_section($uri);
+	# see if caller saved result of nntp_common_init
+	my $cached = $self->{nn_cached} // {};
+	my $nn;
+	$nn = delete($cached->{$sec}) and return $nn;
+	my $nn_arg = $self->{nn_arg}->{$sec} or
+			die "BUG: no Net::NNTP->new arg for $sec";
+	my $nntp_opt = $self->{nntp_opt}->{$sec};
+	$nn = nn_new($nn_arg, $nntp_opt, $uri) or return;
+	if (my $postconn = $nntp_opt->{-postconn}) {
+		for my $m_arg (@$postconn) {
+			my ($method, @args) = @$m_arg;
+			$nn->$method(@args) and next;
+			die "E: <$uri> $method failed\n";
+			return;
+		}
+	}
+	$nn;
+}
+
+sub _nntp_fetch_all ($$$) {
+	my ($self, $nn, $uri) = @_;
+	my ($group, $num_a, $num_b) = $uri->group;
+	my $sec = uri_section($uri);
+	my ($nr, $beg, $end) = $nn->group($group);
+	unless (defined($nr)) {
+		chomp(my $msg = $nn->message);
+		return "E: GROUP $group <$sec> $msg";
+	}
+
+	# IMAPTracker is also used for tracking NNTP, UID == article number
+	# LIST.ACTIVE can get the equivalent of UIDVALIDITY, but that's
+	# expensive.  So we assume newsgroups don't change:
+	my $itrk = $self->{incremental} ?
+			PublicInbox::IMAPTracker->new($$uri) : 0;
+	my (undef, $l_art) = $itrk ? $itrk->get_last : ();
+
+	# allow users to specify articles to refetch
+	# cf. https://tools.ietf.org/id/draft-gilman-news-url-01.txt
+	# nntp://example.com/inbox.foo/$num_a-$num_b
+	$beg = $num_a if defined($num_a) && $num_a < $beg;
+	$end = $num_b if defined($num_b) && $num_b < $end;
+	if (defined $l_art) {
+		return if $l_art >= $end; # nothing to do
+		$beg = $l_art + 1;
+	}
+	my ($err, $art);
+	unless ($self->{quiet}) {
+		warn "# $uri fetching ARTICLE $beg..$end\n";
+	}
+	my $last_art;
+	my $n = $self->{max_batch};
+	for ($beg..$end) {
+		last if $self->{quit};
+		$art = $_;
+		if (--$n < 0) {
+			$itrk->update_last(0, $last_art) if $itrk;
+			$n = $self->{max_batch};
+		}
+		my $raw = $nn->article($art);
+		unless (defined($raw)) {
+			my $msg = $nn->message;
+			if ($nn->code == 421) { # pseudo response from Net::Cmd
+				$err = "E: $msg";
+				last;
+			} else { # probably just a deleted message (spam)
+				warn "W: $msg";
+				next;
+			}
+		}
+		$raw = join('', @$raw);
+		$raw =~ s/\r\n/\n/sg;
+		my ($eml_cb, @args) = @{$self->{eml_each}};
+		$eml_cb->($uri, $art, [], PublicInbox::Eml->new(\$raw), @args);
+		$last_art = $art;
+	}
+	$itrk->update_last(0, $last_art) if $itrk;
+	$err;
+}
+
+sub nntp_each {
+	my ($self, $url, $eml_cb, @args) = @_;
+	my $uri = ref($url) ? $url : PublicInbox::URInntps->new($url);
+	my $sec = uri_section($uri);
+	local $0 = $uri->group ." $sec";
+	my $nn = nn_get($self, $uri);
+	my $err;
+	if ($nn) {
+		local $self->{eml_each} = [ $eml_cb, @args ];
+		$err = _nntp_fetch_all($self, $nn, $uri);
+	} else {
+		$err = "E: not connected: $!";
+	}
+	warn $err if $err;
+	$nn;
+}
+
 sub new { bless {}, shift };
 
 1;
diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm
index 8d13ea35..4b009a28 100644
--- a/lib/PublicInbox/Watch.pm
+++ b/lib/PublicInbox/Watch.pm
@@ -56,16 +56,16 @@ sub new {
 		defined(my $dirs = $cfg->{$k}) or next;
 		$dirs = PublicInbox::Config::_array($dirs);
 		for my $dir (@$dirs) {
-			my $url;
+			my $uri;
 			if (is_maildir($dir)) {
 				# skip "new", no MUA has seen it, yet.
 				$mdmap{"$dir/cur"} = 'watchspam';
-			} elsif (my $uri = imap_uri($dir)) {
+			} elsif ($uri = imap_uri($dir)) {
 				$imap{$$uri} = 'watchspam';
 				push @imap, $uri;
-			} elsif ($url = nntp_url($dir)) {
-				$nntp{$url} = 'watchspam';
-				push @nntp, $url;
+			} elsif ($uri = nntp_uri($dir)) {
+				$nntp{$$uri} = 'watchspam';
+				push @nntp, $uri;
 			} else {
 				warn "unsupported $k=$dir\n";
 			}
@@ -84,7 +84,7 @@ sub new {
 		my $watches = $ibx->{watch} or return;
 		$watches = PublicInbox::Config::_array($watches);
 		for my $watch (@$watches) {
-			my $url;
+			my $uri;
 			if (is_maildir($watch)) {
 				compile_watchheaders($ibx);
 				my ($new, $cur) = ("$watch/new", "$watch/cur");
@@ -92,17 +92,16 @@ sub new {
 				return if is_watchspam($cur, $cur_dst, $ibx);
 				push @{$mdmap{$new} //= []}, $ibx;
 				push @$cur_dst, $ibx;
-			} elsif (my $uri = imap_uri($watch)) {
-				my $url = $$uri;
-				return if is_watchspam($url, $imap{$url}, $ibx);
+			} elsif ($uri = imap_uri($watch)) {
+				my $cur_dst = $imap{$$uri} //= [];
+				return if is_watchspam($uri, $cur_dst, $ibx);
 				compile_watchheaders($ibx);
-				my $n = push @{$imap{$url} ||= []}, $ibx;
-				push @imap, $uri if $n == 1;
-			} elsif ($url = nntp_url($watch)) {
-				return if is_watchspam($url, $nntp{$url}, $ibx);
+				push(@imap, $uri) if 1 == push(@$cur_dst, $ibx);
+			} elsif ($uri = nntp_uri($watch)) {
+				my $cur_dst = $nntp{$$uri} //= [];
+				return if is_watchspam($uri, $cur_dst, $ibx);
 				compile_watchheaders($ibx);
-				my $n = push @{$nntp{$url} ||= []}, $ibx;
-				push @nntp, $url if $n == 1;
+				push(@nntp, $uri) if 1 == push(@$cur_dst, $ibx);
 			} else {
 				warn "watch unsupported: $k=$watch\n";
 			}
@@ -289,11 +288,11 @@ sub watch_fs_init ($) {
 }
 
 sub imap_import_msg ($$$$$) {
-	my ($self, $url, $uid, $raw, $flags) = @_;
+	my ($self, $uri, $uid, $raw, $flags) = @_;
 	# our target audience expects LF-only, save storage
 	$$raw =~ s/\r\n/\n/sg;
 
-	my $inboxes = $self->{imap}->{$url};
+	my $inboxes = $self->{imap}->{$$uri};
 	if (ref($inboxes)) {
 		for my $ibx (@$inboxes) {
 			my $eml = PublicInbox::Eml->new($$raw);
@@ -304,15 +303,14 @@ sub imap_import_msg ($$$$$) {
 		local $SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 		my $eml = PublicInbox::Eml->new($raw);
 		$self->{pi_cfg}->each_inbox(\&remove_eml_i,
-						$self, $eml, "$url UID:$uid");
+						$self, $eml, "$uri UID:$uid");
 	} else {
 		die "BUG: destination unknown $inboxes";
 	}
 }
 
 sub imap_fetch_all ($$$) {
-	my ($self, $mic, $url) = @_;
-	my $uri = PublicInbox::URIimap->new($url);
+	my ($self, $mic, $uri) = @_;
 	my $sec = uri_section($uri);
 	my $mbx = $uri->mailbox;
 	$mic->Clear(1); # trim results history
@@ -324,25 +322,25 @@ sub imap_fetch_all ($$$) {
 		last if $r_uidval && $r_uidnext;
 	}
 	$r_uidval //= $mic->uidvalidity($mbx) //
-		return "E: $url cannot get UIDVALIDITY";
+		return "E: $uri cannot get UIDVALIDITY";
 	$r_uidnext //= $mic->uidnext($mbx) //
-		return "E: $url cannot get UIDNEXT";
-	my $itrk = PublicInbox::IMAPTracker->new($url);
+		return "E: $uri cannot get UIDNEXT";
+	my $itrk = PublicInbox::IMAPTracker->new($$uri);
 	my ($l_uidval, $l_uid) = $itrk->get_last;
 	$l_uidval //= $r_uidval; # first time
 	$l_uid //= 1;
 	if ($l_uidval != $r_uidval) {
-		return "E: $url UIDVALIDITY mismatch\n".
+		return "E: $uri UIDVALIDITY mismatch\n".
 			"E: local=$l_uidval != remote=$r_uidval";
 	}
 	my $r_uid = $r_uidnext - 1;
 	if ($l_uid != 1 && $l_uid > $r_uid) {
-		return "E: $url local UID exceeds remote ($l_uid > $r_uid)\n".
-			"E: $url strangely, UIDVALIDLITY matches ($l_uidval)\n";
+		return "E: $uri local UID exceeds remote ($l_uid > $r_uid)\n".
+			"E: $uri strangely, UIDVALIDLITY matches ($l_uidval)\n";
 	}
 	return if $l_uid >= $r_uid; # nothing to do
 
-	warn "I: $url fetching UID $l_uid:$r_uid\n";
+	warn "I: $uri fetching UID $l_uid:$r_uid\n";
 	$mic->Uid(1); # the default, we hope
 	my $bs = $self->{imap_opt}->{$sec}->{batch_size} // 1;
 	my $req = $mic->imap4rev1 ? 'BODY.PEEK[]' : 'RFC822.PEEK';
@@ -355,7 +353,7 @@ sub imap_fetch_all ($$$) {
 	local $SIG{__WARN__} = sub {
 		my $pfx = ($_[0] // '') =~ /^([A-Z]: )/g ? $1 : '';
 		$batch //= '?';
-		$warn_cb->("$pfx$url UID:$batch\n", @_);
+		$warn_cb->("$pfx$uri UID:$batch\n", @_);
 	};
 	my $err;
 	do {
@@ -363,7 +361,7 @@ sub imap_fetch_all ($$$) {
 		# 1) servers do not need to return results in any order
 		# 2) Mail::IMAPClient doesn't offer a streaming API
 		$uids = $mic->search("UID $l_uid:*") or
-			return "E: $url UID SEARCH $l_uid:* error: $!";
+			return "E: $uri UID SEARCH $l_uid:* error: $!";
 		return if scalar(@$uids) == 0;
 
 		# RFC 3501 doesn't seem to indicate order of UID SEARCH
@@ -389,7 +387,7 @@ sub imap_fetch_all ($$$) {
 			local $0 = "UID:$batch $mbx $sec";
 			my $r = $mic->fetch_hash($batch, $req, 'FLAGS');
 			unless ($r) { # network error?
-				$err = "E: $url UID FETCH $batch error: $!";
+				$err = "E: $uri UID FETCH $batch error: $!";
 				last;
 			}
 			for my $uid (@batch) {
@@ -397,7 +395,7 @@ sub imap_fetch_all ($$$) {
 				my $per_uid = delete $r->{$uid} // next;
 				my $raw = delete($per_uid->{$key}) // next;
 				my $fl = $per_uid->{FLAGS} // '';
-				imap_import_msg($self, $url, $uid, \$raw, $fl);
+				imap_import_msg($self, $uri, $uid, \$raw, $fl);
 				$last_uid = $uid;
 				last if $self->{quit};
 			}
@@ -410,14 +408,14 @@ sub imap_fetch_all ($$$) {
 }
 
 sub imap_idle_once ($$$$) {
-	my ($self, $mic, $intvl, $url) = @_;
+	my ($self, $mic, $intvl, $uri) = @_;
 	my $i = $intvl //= (29 * 60);
 	my $end = now() + $intvl;
-	warn "I: $url idling for ${intvl}s\n";
+	warn "I: $uri idling for ${intvl}s\n";
 	local $0 = "IDLE $0";
 	unless ($mic->idle) {
 		return if $self->{quit};
-		return "E: IDLE failed on $url: $!";
+		return "E: IDLE failed on $uri: $!";
 	}
 	$self->{idle_mic} = $mic; # for ->quit
 	my @res;
@@ -428,16 +426,15 @@ sub imap_idle_once ($$$$) {
 	}
 	delete $self->{idle_mic};
 	unless ($self->{quit}) {
-		$mic->IsConnected or return "E: IDLE disconnected on $url";
-		$mic->done or return "E: IDLE DONE failed on $url: $!";
+		$mic->IsConnected or return "E: IDLE disconnected on $uri";
+		$mic->done or return "E: IDLE DONE failed on $uri: $!";
 	}
 	undef;
 }
 
 # idles on a single URI
 sub watch_imap_idle_1 ($$$) {
-	my ($self, $url, $intvl) = @_;
-	my $uri = PublicInbox::URIimap->new($url);
+	my ($self, $uri, $intvl) = @_;
 	my $sec = uri_section($uri);
 	my $mic_arg = $self->{mic_arg}->{$sec} or
 			die "BUG: no Mail::IMAPClient->new arg for $sec";
@@ -447,8 +444,8 @@ sub watch_imap_idle_1 ($$$) {
 		$mic //= PublicInbox::IMAPClient->new(%$mic_arg);
 		my $err;
 		if ($mic && $mic->IsConnected) {
-			$err = imap_fetch_all($self, $mic, $url);
-			$err //= imap_idle_once($self, $mic, $intvl, $url);
+			$err = imap_fetch_all($self, $mic, $uri);
+			$err //= imap_idle_once($self, $mic, $intvl, $uri);
 		} else {
 			$err = "E: not connected: $!";
 		}
@@ -477,21 +474,21 @@ sub watch_atfork_parent ($) {
 }
 
 sub imap_idle_requeue { # DS::add_timer callback
-	my ($self, $url_intvl) = @_;
+	my ($self, $uri_intvl) = @_;
 	return if $self->{quit};
-	push @{$self->{idle_todo}}, $url_intvl;
+	push @{$self->{idle_todo}}, $uri_intvl;
 	event_step($self);
 }
 
 sub imap_idle_reap { # PublicInbox::DS::dwaitpid callback
 	my ($self, $pid) = @_;
-	my $url_intvl = delete $self->{idle_pids}->{$pid} or
+	my $uri_intvl = delete $self->{idle_pids}->{$pid} or
 		die "BUG: PID=$pid (unknown) reaped: \$?=$?\n";
 
-	my ($url, $intvl) = @$url_intvl;
+	my ($uri, $intvl) = @$uri_intvl;
 	return if $self->{quit};
-	warn "W: PID=$pid on $url died: \$?=$?\n" if $?;
-	add_timer(60, \&imap_idle_requeue, $self, $url_intvl);
+	warn "W: PID=$pid on $uri died: \$?=$?\n" if $?;
+	add_timer(60, \&imap_idle_requeue, $self, $uri_intvl);
 }
 
 sub reap { # callback for EOFpipe
@@ -505,8 +502,8 @@ sub reap { # callback for EOFpipe
 }
 
 sub imap_idle_fork ($$) {
-	my ($self, $url_intvl) = @_;
-	my ($url, $intvl) = @$url_intvl;
+	my ($self, $uri_intvl) = @_;
+	my ($uri, $intvl) = @$uri_intvl;
 	pipe(my ($r, $w)) or die "pipe: $!";
 	my $seed = rand(0xffffffff);
 	my $pid = fork // die "fork: $!";
@@ -515,11 +512,11 @@ sub imap_idle_fork ($$) {
 		eval { Net::SSLeay::randomize() };
 		close $r;
 		watch_atfork_child($self);
-		watch_imap_idle_1($self, $url, $intvl);
+		watch_imap_idle_1($self, $uri, $intvl);
 		close $w;
 		_exit(0);
 	}
-	$self->{idle_pids}->{$pid} = $url_intvl;
+	$self->{idle_pids}->{$pid} = $uri_intvl;
 	PublicInbox::EOFpipe->new($r, \&reap, [$pid, \&imap_idle_reap, $self]);
 }
 
@@ -530,8 +527,8 @@ sub event_step {
 	if ($idle_todo && @$idle_todo) {
 		my $oldset = watch_atfork_parent($self);
 		eval {
-			while (my $url_intvl = shift(@$idle_todo)) {
-				imap_idle_fork($self, $url_intvl);
+			while (my $uri_intvl = shift(@$idle_todo)) {
+				imap_idle_fork($self, $uri_intvl);
 			}
 		};
 		PublicInbox::DS::sig_setmask($oldset);
@@ -541,30 +538,28 @@ sub event_step {
 }
 
 sub watch_imap_fetch_all ($$) {
-	my ($self, $urls) = @_;
-	for my $url (@$urls) {
-		my $uri = PublicInbox::URIimap->new($url);
+	my ($self, $uris) = @_;
+	for my $uri (@$uris) {
 		my $sec = uri_section($uri);
 		my $mic_arg = $self->{mic_arg}->{$sec} or
 			die "BUG: no Mail::IMAPClient->new arg for $sec";
 		my $mic = PublicInbox::IMAPClient->new(%$mic_arg) or next;
-		my $err = imap_fetch_all($self, $mic, $url);
+		my $err = imap_fetch_all($self, $mic, $uri);
 		last if $self->{quit};
 		warn $err, "\n" if $err;
 	}
 }
 
 sub watch_nntp_fetch_all ($$) {
-	my ($self, $urls) = @_;
-	for my $url (@$urls) {
-		my $uri = uri_new($url);
+	my ($self, $uris) = @_;
+	for my $uri (@$uris) {
 		my $sec = uri_section($uri);
 		my $nn_arg = $self->{nn_arg}->{$sec} or
 			die "BUG: no Net::NNTP->new arg for $sec";
 		my $nntp_opt = $self->{nntp_opt}->{$sec};
-		my $nn = nn_new($nn_arg, $nntp_opt, $url);
+		my $nn = nn_new($nn_arg, $nntp_opt, $uri);
 		unless ($nn) {
-			warn "E: $url: \$!=$!\n";
+			warn "E: $uri: \$!=$!\n";
 			next;
 		}
 		last if $self->{quit};
@@ -572,21 +567,21 @@ sub watch_nntp_fetch_all ($$) {
 			for my $m_arg (@$postconn) {
 				my ($method, @args) = @$m_arg;
 				$nn->$method(@args) and next;
-				warn "E: <$url> $method failed\n";
+				warn "E: <$uri> $method failed\n";
 				$nn = undef;
 				last;
 			}
 		}
 		last if $self->{quit};
 		if ($nn) {
-			my $err = nntp_fetch_all($self, $nn, $url);
+			my $err = nntp_fetch_all($self, $nn, $uri);
 			warn $err, "\n" if $err;
 		}
 	}
 }
 
 sub poll_fetch_fork { # DS::add_timer callback
-	my ($self, $intvl, $urls) = @_;
+	my ($self, $intvl, $uris) = @_;
 	return if $self->{quit};
 	pipe(my ($r, $w)) or die "pipe: $!";
 	my $oldset = watch_atfork_parent($self);
@@ -597,47 +592,46 @@ sub poll_fetch_fork { # DS::add_timer callback
 		eval { Net::SSLeay::randomize() };
 		close $r;
 		watch_atfork_child($self);
-		if ($urls->[0] =~ m!\Aimaps?://!i) {
-			watch_imap_fetch_all($self, $urls);
+		if ($uris->[0]->scheme =~ m!\Aimaps?!i) {
+			watch_imap_fetch_all($self, $uris);
 		} else {
-			watch_nntp_fetch_all($self, $urls);
+			watch_nntp_fetch_all($self, $uris);
 		}
 		close $w;
 		_exit(0);
 	}
 	PublicInbox::DS::sig_setmask($oldset);
 	die "fork: $!"  unless defined $pid;
-	$self->{poll_pids}->{$pid} = [ $intvl, $urls ];
+	$self->{poll_pids}->{$pid} = [ $intvl, $uris ];
 	PublicInbox::EOFpipe->new($r, \&reap, [$pid, \&poll_fetch_reap, $self]);
 }
 
 sub poll_fetch_reap {
 	my ($self, $pid) = @_;
-	my $intvl_urls = delete $self->{poll_pids}->{$pid} or
+	my $intvl_uris = delete $self->{poll_pids}->{$pid} or
 		die "BUG: PID=$pid (unknown) reaped: \$?=$?\n";
 	return if $self->{quit};
-	my ($intvl, $urls) = @$intvl_urls;
+	my ($intvl, $uris) = @$intvl_uris;
 	if ($?) {
-		warn "W: PID=$pid died: \$?=$?\n", map { "$_\n" } @$urls;
+		warn "W: PID=$pid died: \$?=$?\n", map { "$_\n" } @$uris;
 	}
-	warn("I: will check $_ in ${intvl}s\n") for @$urls;
-	add_timer($intvl, \&poll_fetch_fork, $self, $intvl, $urls);
+	warn("I: will check $_ in ${intvl}s\n") for @$uris;
+	add_timer($intvl, \&poll_fetch_fork, $self, $intvl, $uris);
 }
 
 sub watch_imap_init ($$) {
 	my ($self, $poll) = @_;
 	my $mics = imap_common_init($self); # read args from config
-	my $idle = []; # [ [ url1, intvl1 ], [url2, intvl2] ]
-	for my $url (keys %{$self->{imap}}) {
-		my $uri = PublicInbox::URIimap->new($url);
+	my $idle = []; # [ [ uri1, intvl1 ], [uri2, intvl2] ]
+	for my $uri (@{$self->{imap_order}}) {
 		my $sec = uri_section($uri);
 		my $mic = $mics->{$sec};
 		my $intvl = $self->{imap_opt}->{$sec}->{pollInterval};
 		if ($mic->has_capability('IDLE') && !$intvl) {
 			$intvl = $self->{imap_opt}->{$sec}->{idleInterval};
-			push @$idle, [ $url, $intvl // () ];
+			push @$idle, [ $uri, $intvl // () ];
 		} else {
-			push @{$poll->{$intvl || 120}}, $url;
+			push @{$poll->{$intvl || 120}}, $uri;
 		}
 	}
 	if (scalar @$idle) {
@@ -646,38 +640,8 @@ sub watch_imap_init ($$) {
 	}
 }
 
-# flesh out common NNTP-specific data structures
-sub nntp_common_init ($) {
-	my ($self) = @_;
-	my $cfg = $self->{pi_cfg};
-	my $nn_args = {}; # scheme://authority => Net::NNTP->new arg
-	for my $url (@{$self->{nntp_order}}) {
-		my $sec = uri_section(uri_new($url));
-
-		# Debug and Timeout are passed to Net::NNTP->new
-		my $v = cfg_bool($cfg, 'nntp.Debug', $url);
-		$nn_args->{$sec}->{Debug} = $v if defined $v;
-		my $to = cfg_intvl($cfg, 'nntp.Timeout', $url);
-		$nn_args->{$sec}->{Timeout} = $to if $to;
-
-		# Net::NNTP post-connect commands
-		for my $k (qw(starttls compress)) {
-			$v = cfg_bool($cfg, "nntp.$k", $url) // next;
-			$self->{nntp_opt}->{$sec}->{$k} = $v;
-		}
-
-		# internal option
-		for my $k (qw(pollInterval)) {
-			$to = cfg_intvl($cfg, "nntp.$k", $url) // next;
-			$self->{nntp_opt}->{$sec}->{$k} = $to;
-		}
-	}
-	$nn_args;
-}
-
 sub nntp_fetch_all ($$$) {
-	my ($self, $nn, $url) = @_;
-	my $uri = uri_new($url);
+	my ($self, $nn, $uri) = @_;
 	my ($group, $num_a, $num_b) = $uri->group;
 	my $sec = uri_section($uri);
 	my ($nr, $beg, $end) = $nn->group($group);
@@ -689,7 +653,7 @@ sub nntp_fetch_all ($$$) {
 	# IMAPTracker is also used for tracking NNTP, UID == article number
 	# LIST.ACTIVE can get the equivalent of UIDVALIDITY, but that's
 	# expensive.  So we assume newsgroups don't change:
-	my $itrk = PublicInbox::IMAPTracker->new($url);
+	my $itrk = PublicInbox::IMAPTracker->new($$uri);
 	my (undef, $l_art) = $itrk->get_last;
 	$l_art //= $beg; # initial import
 
@@ -702,14 +666,14 @@ sub nntp_fetch_all ($$$) {
 	return if $l_art >= $end; # nothing to do
 	$beg = $l_art + 1;
 
-	warn "I: $url fetching ARTICLE $beg..$end\n";
+	warn "I: $uri fetching ARTICLE $beg..$end\n";
 	my $warn_cb = $SIG{__WARN__} || \&CORE::warn;
 	my ($err, $art);
 	local $SIG{__WARN__} = sub {
 		my $pfx = ($_[0] // '') =~ /^([A-Z]: )/g ? $1 : '';
-		$warn_cb->("$pfx$url ", $art ? ("ARTICLE $art") : (), "\n", @_);
+		$warn_cb->("$pfx$uri ", $art ? ("ARTICLE $art") : (), "\n", @_);
 	};
-	my $inboxes = $self->{nntp}->{$url};
+	my $inboxes = $self->{nntp}->{$$uri};
 	my $last_art;
 	my $n = $self->{max_batch};
 	for ($beg..$end) {
@@ -741,7 +705,7 @@ sub nntp_fetch_all ($$$) {
 		} elsif ($inboxes eq 'watchspam') {
 			my $eml = PublicInbox::Eml->new(\$raw);
 			$self->{pi_cfg}->each_inbox(\&remove_eml_i,
-					$self, $eml, "$url ARTICLE $art");
+					$self, $eml, "$uri ARTICLE $art");
 		} else {
 			die "BUG: destination unknown $inboxes";
 		}
@@ -754,23 +718,11 @@ sub nntp_fetch_all ($$$) {
 
 sub watch_nntp_init ($$) {
 	my ($self, $poll) = @_;
-	eval { require Net::NNTP } or
-		die "Net::NNTP is required for NNTP:\n$@\n";
-	eval { require PublicInbox::IMAPTracker } or
-		die "DBD::SQLite is required for NNTP\n:$@\n";
-
-	my $nn_args = nntp_common_init($self); # read args from config
-
-	# make sure we can connect and cache the credentials in memory
-	$self->{nn_arg} = {}; # schema://authority => Net::NNTP->new args
-	for my $url (@{$self->{nntp_order}}) {
-		nn_for($self, $url, $nn_args);
-	}
-	for my $url (@{$self->{nntp_order}}) {
-		my $uri = uri_new($url);
+	nntp_common_init($self); # read args from config
+	for my $uri (@{$self->{nntp_order}}) {
 		my $sec = uri_section($uri);
 		my $intvl = $self->{nntp_opt}->{$sec}->{pollInterval};
-		push @{$poll->{$intvl || 120}}, $url;
+		push @{$poll->{$intvl || 120}}, $uri;
 	}
 }
 
@@ -778,12 +730,12 @@ sub watch { # main entry point
 	my ($self, $sig, $oldset) = @_;
 	$self->{oldset} = $oldset;
 	$self->{sig} = $sig;
-	my $poll = {}; # intvl_seconds => [ url1, url2 ]
+	my $poll = {}; # intvl_seconds => [ uri1, uri2 ]
 	watch_imap_init($self, $poll) if $self->{imap};
 	watch_nntp_init($self, $poll) if $self->{nntp};
-	while (my ($intvl, $urls) = each %$poll) {
-		# poll all URLs for a given interval sequentially
-		add_timer(0, \&poll_fetch_fork, $self, $intvl, $urls);
+	while (my ($intvl, $uris) = each %$poll) {
+		# poll all URIs for a given interval sequentially
+		add_timer(0, \&poll_fetch_fork, $self, $intvl, $uris);
 	}
 	watch_fs_init($self) if $self->{mdre};
 	PublicInbox::DS->SetPostLoopCallback(sub { !$self->quit_done });
diff --git a/t/lei-convert.t b/t/lei-convert.t
index 29f8ba75..2ba62db3 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -6,26 +6,43 @@ use PublicInbox::MboxReader;
 use PublicInbox::MdirReader;
 use PublicInbox::NetReader;
 require_git 2.6;
-require_mods(qw(DBD::SQLite Search::Xapian Mail::IMAPClient));
+require_mods(qw(DBD::SQLite Search::Xapian Mail::IMAPClient Net::NNTP));
 my ($tmpdir, $for_destroy) = tmpdir;
 my $sock = tcp_server;
-my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $cmd = [ '-imapd', '-W0', "--stdout=$tmpdir/i1", "--stderr=$tmpdir/i2" ];
 my ($ro_home, $cfg_path) = setup_public_inboxes;
 my $env = { PI_CONFIG => $cfg_path };
-my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
-my $host_port = tcp_host_port($sock);
+my $tdi = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-imapd: $?");
+my $imap_host_port = tcp_host_port($sock);
+$sock = tcp_server;
+$cmd = [ '-nntpd', '-W0', "--stdout=$tmpdir/n1", "--stderr=$tmpdir/n2" ];
+my $tdn = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-nntpd: $?");
+my $nntp_host_port = tcp_host_port($sock);
 undef $sock;
+
 test_lei({ tmpdir => $tmpdir }, sub {
 	my $d = $ENV{HOME};
-	my $dig = Digest::SHA->new(256);
 	lei_ok('convert', '-o', "mboxrd:$d/foo.mboxrd",
-		"imap://$host_port/t.v2.0");
-	ok(-f "$d/foo.mboxrd", 'mboxrd created');
+		"imap://$imap_host_port/t.v2.0");
+	ok(-f "$d/foo.mboxrd", 'mboxrd created from imap://');
+
+	lei_ok('convert', '-o', "mboxrd:$d/nntp.mboxrd",
+		"nntp://$nntp_host_port/t.v2");
+	ok(-f "$d/nntp.mboxrd", 'mboxrd created from nntp://');
+
 	my (@mboxrd, @mboxcl2);
 	open my $fh, '<', "$d/foo.mboxrd" or BAIL_OUT $!;
 	PublicInbox::MboxReader->mboxrd($fh, sub { push @mboxrd, shift });
 	ok(scalar(@mboxrd) > 1, 'got multiple messages');
 
+	open $fh, '<', "$d/nntp.mboxrd" or BAIL_OUT $!;
+	my $i = 0;
+	PublicInbox::MboxReader->mboxrd($fh, sub {
+		my ($eml) = @_;
+		is($eml->body, $mboxrd[$i]->body, "body matches #$i");
+		$i++;
+	});
+
 	lei_ok('convert', '-o', "mboxcl2:$d/cl2", "mboxrd:$d/foo.mboxrd");
 	ok(-s "$d/cl2", 'mboxcl2 non-empty') or diag $lei_err;
 	open $fh, '<', "$d/cl2" or BAIL_OUT $!;
diff --git a/t/lei-import-nntp.t b/t/lei-import-nntp.t
new file mode 100644
index 00000000..3fb78fbc
--- /dev/null
+++ b/t/lei-import-nntp.t
@@ -0,0 +1,30 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(json DBD::SQLite Search::Xapian Net::NNTP));
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my ($tmpdir, $for_destroy) = tmpdir;
+my $sock = tcp_server;
+my $cmd = [ '-nntpd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-nntpd $?");
+my $host_port = tcp_host_port($sock);
+undef $sock;
+test_lei({ tmpdir => $tmpdir }, sub {
+	lei_ok(qw(q bytes:1..));
+	my $out = json_utf8->decode($lei_out);
+	is_deeply($out, [ undef ], 'nothing imported, yet');
+	lei_ok('import', "nntp://$host_port/t.v2");
+	diag $lei_err;
+	lei_ok(qw(q bytes:1..));
+	diag $lei_err;
+	$out = json_utf8->decode($lei_out);
+	ok(scalar(@$out) > 1, 'got imported messages');
+	is(pop @$out, undef, 'trailing JSON null element was null');
+	my %r;
+	for (@$out) { $r{ref($_)}++ }
+	is_deeply(\%r, { 'HASH' => scalar(@$out) }, 'all hashes');
+});
+done_testing;
diff --git a/t/watch_nntp.t b/t/watch_nntp.t
deleted file mode 100644
index c0ad3098..00000000
--- a/t/watch_nntp.t
+++ /dev/null
@@ -1,17 +0,0 @@
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
-# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict;
-use Test::More;
-use PublicInbox::Config;
-# see t/nntpd*.t for tests against a live NNTP server
-
-use_ok 'PublicInbox::Watch';
-my $nntp_url = \&PublicInbox::Watch::nntp_url;
-is('news://example.com/inbox.foo',
-	$nntp_url->('NEWS://examplE.com/inbox.foo'), 'lowercased');
-is('nntps://example.com/inbox.foo',
-	$nntp_url->('nntps://example.com/inbox.foo'), 'nntps:// accepted');
-is('nntps://example.com/inbox.foo',
-	$nntp_url->('SNEWS://example.com/inbox.foo'), 'snews => nntps');
-
-done_testing;


* lei: per-message keywords and externals
@ 2021-02-24 20:49 70% Eric Wong
  2021-02-26  9:26 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-24 20:49 UTC (permalink / raw)
  To: meta

Something I've been pondering for a bit is how to handle
keywords (Seen, Important, Replied, ...) for messages stored in
externals.

I want "kw:" prefix to be a usable search term, like:

	lei q something interesting kw:seen
	lei q something interesting NOT kw:seen

This is no problem for imported messages in ~/.local/share/lei/store.
All the keyword info is stored in line with the rest of the
Xapian index data.

But I also don't want to waste users' space by duplicating
index data if they're already hosting inboxes for public
consumption.  So it looks like parsing out "kw:" ourselves
and doing extra filtering on our end when externals are in play
is going to be a requirement...

Or, just don't support searching using "kw:" with externals, for
now; but still stash keywords somewhere when writing to
traditional mail stores.

And there's also HTTP/HTTPS externals, but those will have
transparent caching/memoization into lei/store by default, soon.
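
A rough sketch of what "parsing out kw: ourselves" could look like,
assuming keywords for each message are available as a plain list.
split_kw and kw_match are hypothetical names for illustration only;
neither exists in lei, and real query parsing would need to handle
more of the Xapian query syntax than a bare "NOT" prefix:

```perl
use strict;
use warnings;

# Hypothetical: pull "kw:" terms (and "NOT kw:" terms) out of the
# query so the remaining terms can be passed to an external as-is.
sub split_kw {
	my @terms = @_;
	my (@rest, %want, %reject);
	while (defined(my $t = shift @terms)) {
		if ($t eq 'NOT' && @terms && $terms[0] =~ /\Akw:(\w+)\z/) {
			$reject{lc $1} = 1;
			shift @terms; # consume the kw: term, too
		} elsif ($t =~ /\Akw:(\w+)\z/) {
			$want{lc $1} = 1;
		} else {
			push @rest, $t;
		}
	}
	(\@rest, \%want, \%reject);
}

# Hypothetical: post-filter one result using locally stored keywords
sub kw_match {
	my ($kw, $want, $reject) = @_; # $kw: arrayref of keywords
	my %have = map { lc($_) => 1 } @$kw;
	for (keys %$want) { return 0 unless $have{$_} }
	for (keys %$reject) { return 0 if $have{$_} }
	1;
}
```

With this, "something interesting NOT kw:seen" would send only
"something interesting" to the external, and results whose local
keywords include "seen" would be dropped afterwards.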


* [PATCH 0/2] "lei q" remote memoization
@ 2021-02-24 23:37 71% Eric Wong
  2021-02-24 23:37 70% ` [PATCH 2/2] lei q: auto-memoize remote messages into lei/store Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-24 23:37 UTC (permalink / raw)
  To: meta

1/2 only happened because I made IPv6 tests the default.
2/2 is a feature I've always wanted.

Eric Wong (2):
  lei_external: don't treat IPv6 URLs as globs
  lei q: auto-memoize remote messages into lei/store

 MANIFEST                       |  1 +
 lib/PublicInbox/LEI.pm         |  2 ++
 lib/PublicInbox/LeiExternal.pm |  8 +++++-
 lib/PublicInbox/LeiQuery.pm    |  1 +
 lib/PublicInbox/LeiXSearch.pm  | 10 +++++--
 t/lei-q-remote-import.t        | 50 ++++++++++++++++++++++++++++++++++
 t/lei_external.t               |  1 +
 7 files changed, 69 insertions(+), 4 deletions(-)
 create mode 100644 t/lei-q-remote-import.t


* [PATCH 2/2] lei q: auto-memoize remote messages into lei/store
  2021-02-24 23:37 71% [PATCH 0/2] "lei q" remote memoization Eric Wong
@ 2021-02-24 23:37 70% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-24 23:37 UTC (permalink / raw)
  To: meta

This lets users avoid network traffic on subsequent searches at
the expense of local disk space.  --no-import-remote may be
specified to reverse this trade-off for users with little
storage.
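
For illustration, the default-on, negatable behavior can be sketched
with Getopt::Long in isolation.  parse_import_remote is a hypothetical
helper and not part of lei; only the 'import-remote!' spec and the
"//= 1" default are taken from this patch:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: returns 1 if remote messages should be
# memoized, 0 if the user passed --no-import-remote
sub parse_import_remote {
	my @argv = @_;
	my $opt = {};
	# the trailing "!" makes the option negatable
	GetOptionsFromArray(\@argv, $opt, 'import-remote!') or
		die "bad arguments\n";
	$opt->{'import-remote'} //= 1; # memoize unless told otherwise
	$opt->{'import-remote'};
}
```

An unset option and an explicit --import-remote both end up enabled;
only --no-import-remote disables memoization.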
---
 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        |  2 ++
 lib/PublicInbox/LeiQuery.pm   |  1 +
 lib/PublicInbox/LeiXSearch.pm | 10 ++++---
 t/lei-q-remote-import.t       | 50 +++++++++++++++++++++++++++++++++++
 5 files changed, 61 insertions(+), 3 deletions(-)
 create mode 100644 t/lei-q-remote-import.t

diff --git a/MANIFEST b/MANIFEST
index 4c04eec8..adbd108f 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -372,6 +372,7 @@ t/lei-import-maildir.t
 t/lei-import-nntp.t
 t/lei-import.t
 t/lei-mirror.t
+t/lei-q-remote-import.t
 t/lei.t
 t/lei_dedupe.t
 t/lei_external.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8cd95ac2..50665b3e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -112,6 +112,7 @@ our %CMD = ( # sorted in order of importance/use:
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
+	import-remote!
 	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
@@ -225,6 +226,7 @@ my %OPTDESC = (
 		'whether or not to wrap git and curl commands with torsocks'],
 'no-torsocks' => 'alias for --torsocks=no',
 'save-as=s' => ['NAME', 'save a search terms by given name'],
+'import-remote!' => 'do not memoize remote messages into local store',
 
 'type=s' => [ 'any|mid|git', 'disambiguate type' ],
 
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 743fa3f7..b57d1cc5 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -51,6 +51,7 @@ sub lei_q {
 	# we'll allow "--only $LOCATION --local"
 	my $sto = $self->_lei_store(1);
 	my $lse = $sto->search;
+	$sto->write_prepare($self) if $opt->{'import-remote'} //= 1;
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
 		$lxs->prepare_external($lse);
 	}
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index c46aba3b..2d399653 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -189,8 +189,9 @@ sub query_mset { # non-parallel for non-"--threads" users
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
-sub each_eml { # callback for MboxReader->mboxrd
+sub each_remote_eml { # callback for MboxReader->mboxrd
 	my ($eml, $self, $lei, $each_smsg) = @_;
+	$lei->{sto}->ipc_do('set_eml', $eml) if $lei->{sto}; # --import-remote
 	my $smsg = bless {}, 'PublicInbox::Smsg';
 	$smsg->populate($eml);
 	$smsg->parse_references($eml, mids($eml));
@@ -244,14 +245,17 @@ sub query_remote_mboxrd {
 		my ($fh, $pid) = popen_rd($cmd, undef, $rdr);
 		$reap_curl = PublicInbox::OnDestroy->new($sigint_reap, $pid);
 		$fh = IO::Uncompress::Gunzip->new($fh);
-		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
+		PublicInbox::MboxReader->mboxrd($fh, \&each_remote_eml, $self,
 						$lei, $each_smsg);
 		my $err = waitpid($pid, 0) == $pid ? undef
 						: "BUG: waitpid($cmd): $!";
 		@$reap_curl = (); # cancel OnDestroy
 		die $err if $err;
+		my $nr = $lei->{-nr_remote_eml};
+		if ($nr && $lei->{sto}) {
+			my $wait = $lei->{sto}->ipc_do('done');
+		}
 		if ($? == 0) {
-			my $nr = $lei->{-nr_remote_eml};
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
 			next;
 		}
diff --git a/t/lei-q-remote-import.t b/t/lei-q-remote-import.t
new file mode 100644
index 00000000..f73524cf
--- /dev/null
+++ b/t/lei-q-remote-import.t
@@ -0,0 +1,50 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(json DBD::SQLite Search::Xapian));
+use PublicInbox::MboxReader;
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $sock = tcp_server;
+my ($tmpdir, $for_destroy) = tmpdir;
+my $cmd = [ '-httpd', '-W0', "--stdout=$tmpdir/1", "--stderr=$tmpdir/2" ];
+my $env = { PI_CONFIG => $cfg_path };
+my $td = start_script($cmd, $env, { 3 => $sock }) or BAIL_OUT("-httpd: $?");
+my $host_port = tcp_host_port($sock);
+my $url = "http://$host_port/t2/";
+my $exp1 = [ eml_load('t/plack-qp.eml') ];
+my $exp2 = [ eml_load('t/iso-2202-jp.eml') ];
+my $slurp_emls = sub {
+	open my $fh, '<', $_[0] or BAIL_OUT "open: $!";
+	my @eml;
+	PublicInbox::MboxReader->mboxrd($fh, sub {
+		my $eml = shift;
+		$eml->header_set('Status');
+		push @eml, $eml;
+	});
+	\@eml;
+};
+
+test_lei({ tmpdir => $tmpdir }, sub {
+	my $o = "$ENV{HOME}/o.mboxrd";
+	my @cmd = ('q', '-o', "mboxrd:$o", 'm:qp@example.com');
+	lei_ok(@cmd);
+	ok(-f $o && !-s _, 'output exists but is empty');
+	unlink $o or BAIL_OUT $!;
+	lei_ok(@cmd, '-I', $url);
+	is_deeply($slurp_emls->($o), $exp1, 'got results after remote search');
+	unlink $o or BAIL_OUT $!;
+	lei_ok(@cmd);
+	ok(-f $o && -s _, 'output exists after import but is not empty');
+	is_deeply($slurp_emls->($o), $exp1, 'got results w/o remote search');
+	unlink $o or BAIL_OUT $!;
+
+	$cmd[-1] = 'm:199707281508.AAA24167@hoyogw.example';
+	lei_ok(@cmd, '-I', $url, '--no-import-remote');
+	is_deeply($slurp_emls->($o), $exp2, 'got another after remote search');
+	unlink $o or BAIL_OUT $!;
+	lei_ok(@cmd);
+	ok(-f $o && !-s _, '--no-import-remote did not memoize');
+});
+done_testing;


* [PATCH 0/4] lei: fleshing out some existing features
@ 2021-02-25 10:11 68% Eric Wong
  2021-02-25 10:11 45% ` [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

Managed to get more stuff done while still pondering keyword
storage with read-only externals(*).

1/4 fleshes out convert, which should be feature-complete as far
as currently supported inputs and outputs go (no MH, JMAP, POP3,
or MMDF, yet).

2/4 is a major incompatibility: it replaces --format/-f with
--in-format/-F in "lei import" for consistency with "lei convert".
This is pre-release software and I discouraged "-f" anyway,
so hopefully nobody's scripts are broken :x

4/4 is another one of the things I've found myself wanting
for a while (it wasn't in mairix).

(*) https://public-inbox.org/meta/20210224204950.GA2076@dcvr/

Eric Wong (4):
  lei convert: support IMAP output and "-F eml" inputs
  lei import: use --in-format/-F for consistency
  test_common: io_modes: always support read/write
  lei q: -tt marks direct hits as "flagged"

 Documentation/lei-import.pod  |  2 +-
 Documentation/lei-q.pod       |  8 ++++++
 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        | 12 ++++-----
 lib/PublicInbox/LeiConvert.pm | 51 ++++++++++++++++++++++-------------
 lib/PublicInbox/LeiImport.pm  |  8 +++---
 lib/PublicInbox/LeiXSearch.pm | 21 ++++++++++++---
 lib/PublicInbox/NetWriter.pm  |  3 ++-
 lib/PublicInbox/TestCommon.pm |  4 +--
 t/lei-convert.t               | 15 +++++++++++
 t/lei-import.t                | 12 ++++-----
 t/lei-q-thread.t              | 47 ++++++++++++++++++++++++++++++++
 t/lei_to_mail.t               |  2 +-
 xt/net_writer-imap.t          |  4 +++
 14 files changed, 146 insertions(+), 44 deletions(-)
 create mode 100644 t/lei-q-thread.t


^ permalink raw reply	[relevance 68%]

* [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs
  2021-02-25 10:11 68% [PATCH 0/4] lei: fleshing out some existing features Eric Wong
@ 2021-02-25 10:11 45% ` Eric Wong
  2021-02-25 10:11 44% ` [PATCH 2/4] lei import: use --in-format/-F for consistency Eric Wong
  2021-02-25 10:11 44% ` [PATCH 4/4] lei q: -tt marks direct hits as "flagged" Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

eml ("message/rfc822" MIME type) is supported by "lei import",
so it probably makes sense to support it via convert, at least
for tests.  IMAP output is already supported in "lei q -o $MFOLDER",
so this only required renaming {nrd} => {net} and initializing
outputs before augment preparation (creating the IMAP folder).
---
 lib/PublicInbox/LeiConvert.pm | 47 +++++++++++++++++++++++------------
 lib/PublicInbox/LeiImport.pm  |  1 -
 lib/PublicInbox/NetWriter.pm  |  3 ++-
 t/lei-convert.t               | 15 +++++++++++
 xt/net_writer-imap.t          |  4 +++
 5 files changed, 52 insertions(+), 18 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index a7e47871..32aa2edb 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -28,25 +28,35 @@ sub mdir_cb {
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
+sub convert_fh ($$$$) {
+	my ($self, $ifmt, $fh, $name) = @_;
+	if ($ifmt eq 'eml') {
+		my $buf = do { local $/; <$fh> } //
+			return $self->{lei}->child_error(1 << 8, <<"");
+error reading $name: $!
+
+		my $eml = PublicInbox::Eml->new(\$buf);
+		$self->{wcb}->(undef, { kw => [] }, $eml);
+	} else {
+		PublicInbox::MboxReader->$ifmt($fh, \&mbox_cb, $self);
+	}
+}
+
 sub do_convert { # via wq_do
 	my ($self) = @_;
 	my $lei = $self->{lei};
 	my $in_fmt = $lei->{opt}->{'in-format'};
 	my $mics;
-	if (my $nrd = $lei->{nrd}) { # may prompt user once
-		$nrd->{mics_cached} = $nrd->imap_common_init($lei);
-		$nrd->{nn_cached} = $nrd->nntp_common_init($lei);
-	}
 	if (my $stdin = delete $self->{0}) {
-		PublicInbox::MboxReader->$in_fmt($stdin, \&mbox_cb, $self);
+		convert_fh($self, $in_fmt, $stdin, '<stdin>');
 	}
 	for my $input (@{$self->{inputs}}) {
 		my $ifmt = lc($in_fmt // '');
 		if ($input =~ m!\Aimaps?://!) {
-			$lei->{nrd}->imap_each($input, \&net_cb, $self);
+			$lei->{net}->imap_each($input, \&net_cb, $self);
 			next;
 		} elsif ($input =~ m!\A(?:nntps?|s?news)://!) {
-			$lei->{nrd}->nntp_each($input, \&net_cb, $self);
+			$lei->{net}->nntp_each($input, \&net_cb, $self);
 			next;
 		} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
 			$ifmt = lc $1;
@@ -54,7 +64,7 @@ sub do_convert { # via wq_do
 		if (-f $input) {
 			open my $fh, '<', $input or
 					return $lei->fail("open $input: $!");
-			PublicInbox::MboxReader->$ifmt($fh, \&mbox_cb, $self);
+			convert_fh($self, $ifmt, $fh, $input);
 		} elsif (-d _) {
 			PublicInbox::MdirReader::maildir_each_eml($input,
 							\&mdir_cb, $self);
@@ -72,11 +82,12 @@ sub call { # the main "lei convert" method
 	$opt->{kw} //= 1;
 	my $self = $lei->{cnv} = bless {}, $cls;
 	my $in_fmt = $opt->{'in-format'};
-	my ($nrd, @f, @d);
+	my (@f, @d);
 	$opt->{dedupe} //= 'none';
 	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
 	$lei->{l2m} or return
 		$lei->fail("output not specified or is not a mail destination");
+	my $net = $lei->{net}; # NetWriter may be created by l2m
 	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
 	if ($opt->{stdin}) {
 		@inputs and return $lei->fail("--stdin and @inputs do not mix");
@@ -88,8 +99,8 @@ sub call { # the main "lei convert" method
 		my $input_path = $input;
 		if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
 			require PublicInbox::NetReader;
-			$nrd //= PublicInbox::NetReader->new;
-			$nrd->add_url($input);
+			$net //= PublicInbox::NetReader->new;
+			$net->add_url($input);
 		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
 			my $ifmt = lc $1;
 			if (($in_fmt // $ifmt) ne $ifmt) {
@@ -117,12 +128,12 @@ sub call { # the main "lei convert" method
 		require PublicInbox::MdirReader;
 	}
 	$self->{inputs} = \@inputs;
-	if ($nrd) {
-		if (my $err = $nrd->errors) {
+	if ($net) {
+		if (my $err = $net->errors) {
 			return $lei->fail($err);
 		}
-		$nrd->{quiet} = $opt->{quiet};
-		$lei->{nrd} = $nrd;
+		$net->{quiet} = $opt->{quiet};
+		$lei->{net} //= $net;
 	}
 	my $op = $lei->workers_start($self, 'lei_convert', 1, {
 		'' => [ $lei->can('dclose'), $lei ]
@@ -137,11 +148,15 @@ sub ipc_atfork_child {
 	my $lei = $self->{lei};
 	$lei->lei_atfork_child;
 	my $l2m = delete $lei->{l2m};
+	if (my $net = $lei->{net}) { # may prompt user once
+		$net->{mics_cached} = $net->imap_common_init($lei);
+		$net->{nn_cached} = $net->nntp_common_init($lei);
+	}
+	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$l2m->pre_augment($lei);
 	$l2m->do_augment($lei);
 	$l2m->post_augment($lei);
 	$self->{wcb} = $l2m->write_cb($lei);
-	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->SUPER::ipc_atfork_child;
 }
 
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index cbfb3127..13e817d0 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -7,7 +7,6 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
-use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::PktOp qw(pkt_do);
 
 sub _import_eml { # MboxReader callback
diff --git a/lib/PublicInbox/NetWriter.pm b/lib/PublicInbox/NetWriter.pm
index c68b0669..e26e9815 100644
--- a/lib/PublicInbox/NetWriter.pm
+++ b/lib/PublicInbox/NetWriter.pm
@@ -16,7 +16,8 @@ my %IMAPkw2flags;
 sub imap_append {
 	my ($mic, $folder, $bref, $smsg, $eml) = @_;
 	$bref //= \($eml->as_string);
-	$smsg //= bless { }, 'PublicInbox::Smsg';
+	$smsg //= bless {}, 'PublicInbox::Smsg';
+	bless($smsg, 'PublicInbox::Smsg') if ref($smsg) eq 'HASH';
 	$smsg->{ts} //= msg_timestamp($eml // PublicInbox::Eml->new($$bref));
 	my @f = map { $IMAPkw2flags{$_} } @{$smsg->{kw}};
 	$mic->append_string($folder, $$bref, "@f", $smsg->internaldate) or
diff --git a/t/lei-convert.t b/t/lei-convert.t
index 2ba62db3..20099f65 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -5,6 +5,7 @@ use strict; use v5.10.1; use PublicInbox::TestCommon;
 use PublicInbox::MboxReader;
 use PublicInbox::MdirReader;
 use PublicInbox::NetReader;
+use PublicInbox::Eml;
 require_git 2.6;
 require_mods(qw(DBD::SQLite Search::Xapian Mail::IMAPClient Net::NNTP));
 my ($tmpdir, $for_destroy) = tmpdir;
@@ -84,5 +85,19 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	open $fh, '<', "$d/foo.mboxrd" or BAIL_OUT;
 	my $exp = do { local $/; <$fh> };
 	is($out, $exp, 'stdin => stdout');
+
+	lei_ok qw(convert -F eml -o mboxcl2:/dev/stdout t/plack-qp.eml);
+	open $fh, '<', \$lei_out or BAIL_OUT;
+	@bar = ();
+	PublicInbox::MboxReader->mboxcl2($fh, sub {
+		my $eml = shift;
+		for my $h (qw(Status Content-Length Lines)) {
+			ok(defined($eml->header_raw($h)),
+				"$h defined for mboxcl2");
+			$eml->header_set($h);
+		}
+		push @bar, $eml;
+	});
+	is_deeply(\@bar, [ eml_load('t/plack-qp.eml') ], 'eml => mboxcl2');
 });
 done_testing;
diff --git a/xt/net_writer-imap.t b/xt/net_writer-imap.t
index 64f822cf..da435926 100644
--- a/xt/net_writer-imap.t
+++ b/xt/net_writer-imap.t
@@ -138,6 +138,10 @@ test_lei(sub {
 	$nwr->imap_each($folder_uri, $imap_slurp_all, my $empty = []);
 	is(scalar(@$empty), 0, 'no results w/o augment');
 
+	lei_ok qw(convert -F eml t/msg_iter-order.eml -o), $$folder_uri;
+	$nwr->imap_each($folder_uri, $imap_slurp_all, $empty = []);
+	is_deeply($empty, [ [ [], eml_load('t/msg_iter-order.eml') ] ],
+		'converted to IMAP destination');
 });
 
 undef $cleanup; # remove temporary folder

^ permalink raw reply related	[relevance 45%]
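
The 'eml' branch of convert_fh in the patch above reads the whole
file handle as a single message.  A minimal sketch of that slurp
idiom, assuming only core Perl: slurp_eml is a hypothetical name
used for illustration (not a public-inbox function), and an
in-memory handle stands in for a real file.

```perl
use strict; use warnings;

# hypothetical helper modelled on the 'eml' branch of convert_fh:
# read everything remaining on $fh as one message
sub slurp_eml {
	my ($fh, $name) = @_;
	# undef $/ makes <$fh> return the rest of the input in one read;
	# an undef result means the read itself failed
	my $buf = do { local $/; <$fh> } // die "error reading $name: $!\n";
	\$buf; # scalar ref, the form passed to PublicInbox::Eml->new in the patch
}

my $raw = "From: a\@example.com\nSubject: hi\n\nbody\n";
open my $fh, '<', \$raw or die "open: $!";
my $bref = slurp_eml($fh, '<in-memory>');
print length($$bref), " bytes slurped\n";
```

An mbox, by contrast, may hold many messages, which is why the
other formats still go through PublicInbox::MboxReader in the patch.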

* [PATCH 2/4] lei import: use --in-format/-F for consistency
  2021-02-25 10:11 68% [PATCH 0/4] lei: fleshing out some existing features Eric Wong
  2021-02-25 10:11 45% ` [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs Eric Wong
@ 2021-02-25 10:11 44% ` Eric Wong
  2021-02-25 10:11 44% ` [PATCH 4/4] lei q: -tt marks direct hits as "flagged" Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

Since we recommend $IN_FORMAT:$LOCATION, this is hopefully not
intrusive (not that this is released software, yet).  This is
to be consistent with "lei convert" usage.

We'll keep "-f" only for output formats, since that is what
"lei q" and "lei convert" use for outputs.
---
 Documentation/lei-import.pod  |  2 +-
 lib/PublicInbox/LEI.pm        |  8 ++++----
 lib/PublicInbox/LeiConvert.pm |  4 ++--
 lib/PublicInbox/LeiImport.pm  |  7 +++----
 t/lei-import.t                | 12 ++++++------
 t/lei_to_mail.t               |  2 +-
 6 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index 2051e6bc..ef20e2f6 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -22,7 +22,7 @@ TODO: Update when URL support is added.
 
 =over
 
-=item -f MAIL_FORMAT, --format=MAIL_FORMAT
+=item -F MAIL_FORMAT, --in-format=MAIL_FORMAT
 
 Message input format.  Unless messages are given on C<stdin>, using a
 format prefix with C<LOCATION> is preferred.
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 50665b3e..8eb96e78 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -172,7 +172,7 @@ our %CMD = ( # sorted in order of importance/use:
 'import' => [ 'LOCATION...|--stdin',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
-	format|f=s kw|keywords|flags! C=s@),
+	in-format|F=s kw|keywords|flags! C=s@),
 	],
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
@@ -399,9 +399,9 @@ sub fail ($$;$) {
 	undef;
 }
 
-sub check_input_format ($;$$) {
-	my ($self, $files, $opt_key) = @_;
-	$opt_key //= 'format';
+sub check_input_format ($;$) {
+	my ($self, $files) = @_;
+	my $opt_key = 'in-format';
 	my $fmt = $self->{opt}->{$opt_key};
 	if (!$fmt) {
 		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 32aa2edb..45d42c9c 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -91,7 +91,7 @@ sub call { # the main "lei convert" method
 	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
 	if ($opt->{stdin}) {
 		@inputs and return $lei->fail("--stdin and @inputs do not mix");
-		$lei->check_input_format(undef, 'in-format') or return;
+		$lei->check_input_format(undef) or return;
 		$self->{0} = $lei->{0};
 	}
 	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
@@ -123,7 +123,7 @@ sub call { # the main "lei convert" method
 		elsif (-d _) { push @d, $input }
 		else { return $lei->fail("Unable to handle $input") }
 	}
-	if (@f) { $lei->check_input_format(\@f, 'in-format') or return }
+	if (@f) { $lei->check_input_format(\@f) or return }
 	if (@d) { # TODO: check for MH vs Maildir, here
 		require PublicInbox::MdirReader;
 	}
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 13e817d0..7f247b64 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -68,8 +68,7 @@ sub call { # the main "lei import" method
 		$self->{0} = $lei->{0};
 	}
 
-	# TODO: do we need --format for non-stdin?
-	my $fmt = $lei->{opt}->{'format'};
+	my $fmt = $lei->{opt}->{'in-format'};
 	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
 	for my $input (@inputs) {
 		my $input_path = $input;
@@ -159,7 +158,7 @@ sub _import_net { # imap_each, nntp_each cb
 sub import_path_url {
 	my ($self, $input) = @_;
 	my $lei = $self->{lei};
-	my $ifmt = lc($lei->{opt}->{'format'} // '');
+	my $ifmt = lc($lei->{opt}->{'in-format'} // '');
 	# TODO auto-detect?
 	if ($input =~ m!\Aimaps?://!i) {
 		$lei->{net}->imap_each($input, \&_import_net, $lei->{sto},
@@ -191,7 +190,7 @@ EOM
 sub import_stdin {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'format'});
+	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'in-format'});
 }
 
 no warnings 'once'; # the following works even when LeiAuth is lazy-loaded
diff --git a/t/lei-import.t b/t/lei-import.t
index fa4fc504..edb0cd20 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -3,13 +3,13 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 test_lei(sub {
-ok(!lei(qw(import -f bogus), 't/plack-qp.eml'), 'fails with bogus format');
+ok(!lei(qw(import -F bogus), 't/plack-qp.eml'), 'fails with bogus format');
 like($lei_err, qr/\bbogus unrecognized/, 'gave error message');
 
 lei_ok(qw(q s:boolean), \'search miss before import');
 unlike($lei_out, qr/boolean/i, 'no results, yet');
 open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
-lei_ok([qw(import -f eml -)], undef, { %$lei_opt, 0 => $fh },
+lei_ok([qw(import -F eml -)], undef, { %$lei_opt, 0 => $fh },
 	\'import single file from stdin') or diag $lei_err;
 close $fh;
 lei_ok(qw(q s:boolean), \'search hit after import');
@@ -26,7 +26,7 @@ lei_ok(qw(q s:boolean -f mboxrd), \'blob accessible after import');
 	});
 	is_deeply(\@cmp, $expect, 'got expected message in mboxrd');
 }
-lei_ok(qw(import -f eml), 't/data/message_embed.eml',
+lei_ok(qw(import -F eml), 't/data/message_embed.eml',
 	\'import single file by path');
 
 my $str = <<'';
@@ -35,7 +35,7 @@ Message-ID: <x@y>
 Status: RO
 
 my $opt = { %$lei_opt, 0 => \$str };
-lei_ok([qw(import -f eml -)], undef, $opt,
+lei_ok([qw(import -F eml -)], undef, $opt,
 	\'import single file with keywords from stdin');
 lei_ok(qw(q m:x@y));
 my $res = json_utf8->decode($lei_out);
@@ -43,13 +43,13 @@ is($res->[1], undef, 'only one result');
 is_deeply($res->[0]->{kw}, ['seen'], "message `seen' keyword set");
 
 $str =~ tr/x/v/; # v@y
-lei_ok([qw(import --no-kw -f eml -)], undef, $opt,
+lei_ok([qw(import --no-kw -F eml -)], undef, $opt,
 	\'import single file with --no-kw from stdin');
 lei(qw(q m:v@y));
 $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
 is_deeply($res->[0]->{kw}, [], 'no keywords set');
 
-# see t/lei_to_mail.t for "import -f mbox*"
+# see t/lei_to_mail.t for "import -F mbox*"
 });
 done_testing;
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index 72b90700..7898cc48 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -130,7 +130,7 @@ my $orig = do {
 };
 
 test_lei(sub {
-	ok(lei(qw(import -f), $mbox, $fn), 'imported mbox');
+	ok(lei(qw(import -F), $mbox, $fn), 'imported mbox');
 	ok(lei(qw(q s:x)), 'lei q works') or diag $lei_err;
 	my $res = json_utf8->decode($lei_out);
 	my $x = $res->[0];

^ permalink raw reply related	[relevance 44%]
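
To illustrate the $IN_FORMAT:$LOCATION convention recommended above,
here is a rough sketch of how a format prefix can be separated from
a location, and how it interacts with -F/--in-format.  It reuses the
regexps visible in the LeiConvert.pm hunks, but split_input is a
hypothetical helper written for this example; the real code paths in
lei differ in structure and error reporting.

```perl
use strict; use warnings;

# hypothetical helper: split "$IN_FORMAT:$LOCATION" into (format, location).
# $in_fmt is the value of -F/--in-format, if any.
sub split_input {
	my ($input, $in_fmt) = @_;
	# network URLs carry no format prefix
	return (undef, $input) if $input =~ m!\A(?:imaps?|nntps?|s?news)://!i;
	my $loc = $input;
	if ($loc =~ s!\A([a-z0-9]+):!!i) {
		my $ifmt = lc $1;
		# an explicit -F must agree with the prefix
		die "--in-format=$in_fmt conflicts with prefix ${ifmt}:\n"
			if defined($in_fmt) && lc($in_fmt) ne $ifmt;
		return ($ifmt, $loc);
	}
	# no prefix: fall back to -F (may be empty)
	(lc($in_fmt // ''), $loc);
}

for my $arg ('mboxrd:/tmp/o.mboxrd', 'Maildir:/home/user/Mail/',
		'imaps://example.com/INBOX') {
	my ($fmt, $loc) = split_input($arg);
	printf "%-8s %s\n", $fmt // '(net)', $loc;
}
my ($fmt, $loc) = split_input('t/plack-qp.eml', 'eml');
print "$fmt $loc\n";
```

With a prefix on every location, -F is only needed for --stdin,
which matches the preference stated in lei-import.pod.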

* [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-25 10:11 68% [PATCH 0/4] lei: fleshing out some existing features Eric Wong
  2021-02-25 10:11 45% ` [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs Eric Wong
  2021-02-25 10:11 44% ` [PATCH 2/4] lei import: use --in-format/-F for consistency Eric Wong
@ 2021-02-25 10:11 44% ` Eric Wong
  2021-02-26  3:38 71%   ` Kyle Meyer
  2 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

This can be used to quickly distinguish messages which were
direct hits when doing thread expansion vs messages that
were merely part of the same thread.

This is NOT mairix-derived behavior, but I occasionally found
it useful when looking at results in an MUA to know whether
a message was a direct hit or not.

This makes "-t" consistent with non-"-t" cases as far as keyword
reading goes.
---
 Documentation/lei-q.pod       |  8 ++++++
 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        |  4 +--
 lib/PublicInbox/LeiXSearch.pm | 21 +++++++++++++---
 t/lei-q-thread.t              | 47 +++++++++++++++++++++++++++++++++++
 5 files changed, 75 insertions(+), 6 deletions(-)
 create mode 100644 t/lei-q-thread.t

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 75fdc613..0959beac 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -79,6 +79,14 @@ Augment output destination instead of clobbering it.
 
 Return all messages in the same thread as the actual match(es).
 
+Using this twice (C<-tt>) sets the C<flagged> (AKA "important") keyword
+on messages which were actual matches.  This is useful to distinguish
+messages which were direct hits from messages which were merely part
+of the same thread.
+
+TODO: Warning: this flag may become persistent and saved in
+lei/store unless an MUA unflags it!  (Behavior undecided)
+
 =item -d STRATEGY, --dedupe=STRATEGY
 
 Strategy for deduplicating messages: C<content>, C<oid>, C<mid>, or
diff --git a/MANIFEST b/MANIFEST
index adbd108f..9cf33d48 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -373,6 +373,7 @@ t/lei-import-nntp.t
 t/lei-import.t
 t/lei-mirror.t
 t/lei-q-remote-import.t
+t/lei-q-thread.t
 t/lei.t
 t/lei_dedupe.t
 t/lei_external.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8eb96e78..8825fa43 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -109,7 +109,7 @@ sub index_opt {
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms', qw(
-	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t augment|a
+	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+ augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
 	import-remote!
@@ -233,7 +233,7 @@ my %OPTDESC = (
 'dedupe|d=s' => ['STRATEGY|content|oid|mid|none',
 		'deduplication strategy'],
 'show	threads|t' => 'display entire thread a message belongs to',
-'q	threads|t' =>
+'q	threads|t+' =>
 	'return all messages in the same threads as the actual match(es)',
 'alert=s@' => ['CMD,:WINCH,:bell,<any command>',
 	'run command(s) or perform ops when done writing to output ' .
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 2d399653..eb015978 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -66,6 +66,13 @@ sub remotes { @{$_[0]->{remotes} // []} }
 # called by PublicInbox::Search::xdb
 sub xdb_shards_flat { @{$_[0]->{shards_flat} // []} }
 
+sub mitem_kw ($$;$) {
+	my ($smsg, $mitem, $flagged) = @_;
+	my $kw = xap_terms('K', $mitem->get_document);
+	$kw->{flagged} = 1 if $flagged;
+	$smsg->{kw} = [ sort keys %$kw ];
+}
+
 # like over->get_art
 sub smsg_for {
 	my ($self, $mitem) = @_;
@@ -76,10 +83,7 @@ sub smsg_for {
 	my $num = int(($docid - 1) / $nshard) + 1;
 	my $ibx = $self->{shard2ibx}->[$shard];
 	my $smsg = $ibx->over->get_art($num);
-	if (ref($ibx->can('msg_keywords'))) {
-		my $kw = xap_terms('K', $mitem->get_document);
-		$smsg->{kw} = [ sort keys %$kw ];
-	}
+	mitem_kw($smsg, $mitem) if $ibx->can('msg_keywords');
 	$smsg->{docid} = $docid;
 	$smsg;
 }
@@ -143,6 +147,8 @@ sub query_thread_mset { # for --threads
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
+	my $can_kw = !!$ibxish->can('msg_keywords');
+	my $fl = $lei->{opt}->{threads} > 1;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		mset_progress($lei, $desc, $mset->size,
@@ -156,6 +162,13 @@ sub query_thread_mset { # for --threads
 				my $smsg = $over->get_art($n) or next;
 				wait_startq($lei);
 				my $mitem = delete $n2item{$smsg->{num}};
+				if ($mitem) {
+					if ($can_kw) {
+						mitem_kw($smsg, $mitem, $fl);
+					} else {
+						$smsg->{kw} = [ 'flagged' ];
+					}
+				}
 				$each_smsg->($smsg, $mitem);
 			}
 			@{$ctx->{xids}} = ();
diff --git a/t/lei-q-thread.t b/t/lei-q-thread.t
new file mode 100644
index 00000000..66db28a9
--- /dev/null
+++ b/t/lei-q-thread.t
@@ -0,0 +1,47 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(json DBD::SQLite Search::Xapian));
+use PublicInbox::LeiToMail;
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+test_lei(sub {
+	my $eml = eml_load('t/utf8.eml');
+	my $buf = PublicInbox::LeiToMail::eml2mboxrd($eml, { kw => ['seen'] });
+	lei_ok([qw(import -F mboxrd -)], undef, { 0 => $buf, %$lei_opt });
+
+	lei_ok qw(q -t m:testmessage@example.com);
+	my $res = json_utf8->decode($lei_out);
+	is_deeply($res->[0]->{kw}, [ 'seen' ], 'q -t sets keywords');
+
+	$eml = eml_load('t/utf8.eml');
+	$eml->header_set('References', $eml->header('Message-ID'));
+	$eml->header_set('Message-ID', '<a-reply@miss>');
+	$buf = PublicInbox::LeiToMail::eml2mboxrd($eml, { kw => ['draft'] });
+	lei_ok([qw(import -F mboxrd -)], undef, { 0 => $buf, %$lei_opt });
+
+	lei_ok qw(q -t m:testmessage@example.com);
+	$res = json_utf8->decode($lei_out);
+	is(scalar(@$res), 3, 'got 2 results');
+	pop @$res;
+	my %m = map { $_->{'m'} => $_ } @$res;
+	is_deeply($m{'<testmessage@example.com>'}->{kw}, ['seen'],
+		'flag set in direct hit');
+	'TODO' or is_deeply($m{'<a-reply@miss>'}->{kw}, ['draft'],
+		'flag set in thread hit');
+
+	lei_ok qw(q -t -t m:testmessage@example.com);
+	$res = json_utf8->decode($lei_out);
+	is(scalar(@$res), 3, 'got 2 results with -t -t');
+	pop @$res;
+	%m = map { $_->{'m'} => $_ } @$res;
+	is_deeply($m{'<testmessage@example.com>'}->{kw}, ['flagged', 'seen'],
+		'flagged set in direct hit');
+	'TODO' or is_deeply($m{'<testmessage@example.com>'}->{kw}, ['draft'],
+		'flagged set in direct hit');
+	lei_ok qw(q -t -t m:testmessage@example.com --only), "$ro_home/t2";
+	$res = json_utf8->decode($lei_out);
+	is_deeply($res->[0]->{kw}, [ 'flagged' ], 'flagged set on external');
+});
+done_testing;

^ permalink raw reply related	[relevance 44%]
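
The "threads|t+" spec in this patch turns --threads into a counter,
which is what lets "-t" and "-tt" be told apart.  A minimal sketch,
assuming only core Getopt::Long: parse_threads and hit_keywords are
hypothetical names for this example (not lei functions), and
keywords are modelled as a plain hash instead of Xapian terms.

```perl
use strict; use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# same parser flags lei configures, minus auto_abbrev
Getopt::Long::Configure(qw(gnu_getopt no_ignore_case));

# hypothetical helper: return how many times -t/--threads was given
sub parse_threads {
	my @argv = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt, 'threads|t+') or
		die "bad options\n";
	$opt{threads} // 0;
}

# hypothetical helper: keywords for one message in the result set.
# $kw is a hashref of stored keywords, $direct is true for direct hits
sub hit_keywords {
	my ($kw, $direct, $nthreads) = @_;
	my %kw = %$kw; # copy, leave the stored keywords untouched
	$kw{flagged} = 1 if $direct && $nthreads > 1;
	[ sort keys %kw ];
}

my $n = parse_threads(qw(-tt m:testmessage@example.com));
print join(' ', @{hit_keywords({ seen => 1 }, 1, $n)}), "\n"; # flagged seen
print join(' ', @{hit_keywords({ draft => 1 }, 0, $n)}), "\n"; # draft
```

Bundling (enabled by gnu_getopt) is what makes "-tt" equivalent to
"-t -t", the form used in t/lei-q-thread.t.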

* better "compopt -o nospace" ideas? [was: lei: completion: bash: generalize nospace usage]
  2021-02-18 12:27 71% [PATCH] lei: completion: bash: generalize nospace usage Eric Wong
@ 2021-02-25 10:33 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-25 10:33 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> We'll be completing more options with ':', '//' and '=' in the
> future, so make it easier to disable trailing spaces on
> completions.

<snip>

> +++ b/contrib/completion/lei-completion.bash
> @@ -4,14 +4,12 @@
>  # preliminary bash completion support for lei (Local Email Interface)
>  # Needs a lot of work, see `lei__complete' in lib/PublicInbox::LEI.pm
>  _lei() {
> -	case ${COMP_WORDS[@]} in
> -	*' add-external h'* | *' --mirror h'*)
> -		compopt -o nospace
> -		;;
> +	local wordlist="$(lei _complete ${COMP_WORDS[@]})"
> +	case $wordlist in
> +	*':'* | *'='* | '//'*) compopt -o nospace ;;

While this is nicer than before, I'm still wondering if there's
a better way for lei to communicate "-o nospace" to bash...
(or similar options for other shells)

Thanks in advance for any ideas you might provide.

>  	*) compopt +o nospace ;; # the default
>  	esac
> -	COMPREPLY=($(compgen -W "$(lei _complete ${COMP_WORDS[@]})" \
> -			-- "${COMP_WORDS[COMP_CWORD]}"))
> +	COMPREPLY=($(compgen -W "$wordlist" -- "${COMP_WORDS[COMP_CWORD]}"))
>  	return 0
>  }
>  complete -o default -o bashdefault -F _lei lei

^ permalink raw reply	[relevance 71%]

* Re: [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-25 10:11 44% ` [PATCH 4/4] lei q: -tt marks direct hits as "flagged" Eric Wong
@ 2021-02-26  3:38 71%   ` Kyle Meyer
  2021-02-26  4:13 71%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-26  3:38 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> This can be used to quickly distinguish messages which were
> direct hits when doing thread expansion vs messages that
> were merely part of the same thread.

Ah, that's very useful.

> +Using this twice (C<-tt>) sets the C<flagged> (AKA "important")
> +on messages which were actual messages.  This is useful to distinguish
> +messages which were direct hits from messages which were merely part
> +of the same thread.
> +
> +TODO: Warning: this flag may become persistent and saved in
> +lei/store unless an MUA unflags it!  (Behavior undecided)

Oy, I understand even less than I thought I did.  How does the
information about what the MUA unflags get back into the store?  Is
there an implicit additional step (`lei import ...')?

^ permalink raw reply	[relevance 71%]

* Re: [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-26  3:38 71%   ` Kyle Meyer
@ 2021-02-26  4:13 71%     ` Eric Wong
  2021-02-26  4:38 71%       ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-26  4:13 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> > +TODO: Warning: this flag may become persistent and saved in
> > +lei/store unless an MUA unflags it!  (Behavior undecided)
> 
> Oy, I understand even less than I thought I did.  How does the
> information about what the MUA unflags get back into the store?  Is
> there an implicit additional step (`lei import ...')?

lei will watch (via inotify/EVFILT_VNODE) mail stores it knows
about for flag updates.  At least that's the plan...

Also, when overwriting an existing output, I think it would be
wise to do an implicit import of any messages that aren't
already in lei/store or an external.  That would save users
from accidentally trashing their data.

^ permalink raw reply	[relevance 71%]

* Re: [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-26  4:13 71%     ` Eric Wong
@ 2021-02-26  4:38 71%       ` Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-26  4:38 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Kyle Meyer <kyle@kyleam.com> wrote:

>> Oy, I understand even less than I thought I did.  How does the
>> information about what the MUA unflags get back into the store?  Is
>> there an implicit additional step (`lei import ...')?
>
> lei will watch (via inotify/EVFILT_VNODE) mail stores it knows
> about for flag updates.  At least that's the plan...
>
> Also, when overwriting an existing output, I think it would be
> wise to do an implicit import of any messages that aren't
> already in lei/store or an external.  That would save users
> from accidentally trashing their data.

Makes sense.  Thanks for the details (especially if you're repeating
yourself).

^ permalink raw reply	[relevance 71%]

* Re: lei: per-message keywords and externals
  2021-02-24 20:49 70% lei: per-message keywords and externals Eric Wong
@ 2021-02-26  9:26 71% ` Eric Wong
  2021-03-02  9:28 71%   ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-26  9:26 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> Something I've been pondering for a bit is how to handle
> keywords (Seen, Important, Replied, ...) for messages stored in
> externals.
> 
> I want "kw:" prefix to be a usable search term, like:
> 
> 	lei q something interesting kw:seen
> 	lei q something interesting NOT kw:seen
> 
> This is no problem for imported messages in ~/.local/share/lei/store.
> All the keyword info is stored in line with the rest of the
> Xapian index data.
> 
> But, I also don't want to be wasting users' space by duplicating
> index data if they're already hosting inboxes for public
> consumption.  So, it's looking like parsing out kw: ourselves
> and do extra filtering on our end when externals are in play is
> going to be a requirement...

Something I considered a few weeks ago and decided against, but
am coming around to again, is indexing just the overview header
info in lei/store.  In other words:

	$sto->set_eml($eml->header_obj, @kw)

instead of:

	$sto->set_eml($eml, @kw)

> Or, just don't support searching using "kw:" with externals, for
> now; but still stash keywords somewhere when writing to
> traditional mail stores.

Maybe it'll be another instance of LeiStore in a separate dir
for external keywords: ~/.local/share/lei/xkw-store

> And there's also HTTP/HTTPS externals, but those will have
> transparent caching/memoization into lei/store by default, soon.

Done: https://public-inbox.org/meta/20210224233718.19007-3-e@80x24.org/

^ permalink raw reply	[relevance 71%]

* [PATCH 1/5] lei: style fix for $oldset declaration
  2021-02-26  9:41 71% [PATCH 0/5] lei mbox locking Eric Wong
@ 2021-02-26  9:41 71% ` Eric Wong
  2021-02-26  9:41 36% ` [PATCH 2/5] lei q: support mbox locking by default Eric Wong
  2021-02-26  9:41 54% ` [PATCH 3/5] lei import|convert: support mbox locking on reads Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-26  9:41 UTC (permalink / raw)
  To: meta

We want /^sub oldset/ to match to keep editors and
things like ctags happy.
---
 lib/PublicInbox/LEI.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8825fa43..5cdaabc6 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -27,7 +27,7 @@ use Time::HiRes qw(stat); # ctime comparisons for config cache
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;
-our ($current_lei, $errors_log, $listener);
+our ($current_lei, $errors_log, $listener, $oldset);
 my ($recv_cmd, $send_cmd);
 my $GLP = Getopt::Long::Parser->new;
 $GLP->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
@@ -976,7 +976,7 @@ sub event_step_init {
 
 sub noop {}
 
-our $oldset; sub oldset { $oldset }
+sub oldset { $oldset }
 
 sub dump_and_clear_log {
 	if (defined($errors_log) && -s STDIN && seek(STDIN, 0, SEEK_SET)) {

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/5] lei mbox locking
@ 2021-02-26  9:41 71% Eric Wong
  2021-02-26  9:41 71% ` [PATCH 1/5] lei: style fix for $oldset declaration Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2021-02-26  9:41 UTC (permalink / raw)
  To: meta

mbox locking is in preparation for inotify/EVFILT_VNODE
mbox monitoring and keyword storage updating.  And some
other odds and ends...

Anyways, still not sure how I want to store keywords
for read-only externals:
https://public-inbox.org/meta/20210224204950.GA2076@dcvr/

Eric Wong (5):
  lei: style fix for $oldset declaration
  lei q: support mbox locking by default
  lei import|convert: support mbox locking on reads
  t/lei_store: rename $lst to $sto
  lei_xsearch: more detail about ->xdb call chain

 MANIFEST                      |   2 +
 lib/PublicInbox/LEI.pm        |  19 ++++--
 lib/PublicInbox/LeiConvert.pm |   9 ++-
 lib/PublicInbox/LeiImport.pm  |  13 ++--
 lib/PublicInbox/LeiToMail.pm  |  16 +++--
 lib/PublicInbox/LeiXSearch.pm |   3 +-
 lib/PublicInbox/MboxLock.pm   | 121 ++++++++++++++++++++++++++++++++++
 t/lei-q-remote-import.t       |  12 ++++
 t/lei_store.t                 | 102 ++++++++++++++--------------
 t/mbox_lock.t                 |  90 +++++++++++++++++++++++++
 10 files changed, 315 insertions(+), 72 deletions(-)
 create mode 100644 lib/PublicInbox/MboxLock.pm
 create mode 100644 t/mbox_lock.t


^ permalink raw reply	[relevance 71%]

* [PATCH 3/5] lei import|convert: support mbox locking on reads
  2021-02-26  9:41 71% [PATCH 0/5] lei mbox locking Eric Wong
  2021-02-26  9:41 71% ` [PATCH 1/5] lei: style fix for $oldset declaration Eric Wong
  2021-02-26  9:41 36% ` [PATCH 2/5] lei q: support mbox locking by default Eric Wong
@ 2021-02-26  9:41 54% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-26  9:41 UTC (permalink / raw)
  To: meta

In case somebody is writing non-atomically, ensure we
take read locks when opening mbox files for reading.
---
 lib/PublicInbox/LEI.pm        | 13 +++++++++----
 lib/PublicInbox/LeiConvert.pm |  9 ++++++---
 lib/PublicInbox/LeiImport.pm  | 13 +++++++------
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b5bdda21..e133b357 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -172,12 +172,12 @@ our %CMD = ( # sorted in order of importance/use:
 'import' => [ 'LOCATION...|--stdin',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
-	in-format|F=s kw|keywords|flags! C=s@),
+	lock=s@ in-format|F=s kw|keywords|flags! C=s@),
 	],
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
 	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s quiet|q
-	kw|keywords|flags! C=s@),
+	lock=s@ kw|keywords|flags! C=s@),
 	],
 'config' => [ '[...]', sub {
 		'git-config(1) wrapper for '._config_path($_[0]);
@@ -218,6 +218,9 @@ my %OPTDESC = (
 'help|h' => 'show this built-in help',
 'C=s@' => [ 'DIR', 'chdir to specify to directory' ],
 'quiet|q' => 'be quiet',
+'lock=s@' => [ 'METHOD|dotlock|fcntl|flock|none',
+	'mbox(5) locking method(s) to use (default: fcntl,dotlock)' ],
+
 'globoff|g' => "do not match locations using '*?' wildcards ".
 		"and\xa0'[]'\x{a0}ranges",
 'verbose|v+' => 'be more verbose',
@@ -410,8 +413,10 @@ sub check_input_format ($;$) {
 	return 1 if $fmt eq 'eml';
 	# XXX: should this handle {gz,bz2,xz}? that's currently in LeiToMail
 	require PublicInbox::MboxReader;
-	PublicInbox::MboxReader->can($fmt) ||
-				fail($self, "--$opt_key=$fmt unrecognized");
+	PublicInbox::MboxReader->can($fmt) or
+		return fail($self, "--$opt_key=$fmt unrecognized");
+	require PublicInbox::MboxLock if $files;
+	1;
 }
 
 sub out ($;@) {
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 45d42c9c..4c0bbd88 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -62,9 +62,11 @@ sub do_convert { # via wq_do
 			$ifmt = lc $1;
 		}
 		if (-f $input) {
-			open my $fh, '<', $input or
-					return $lei->fail("open $input: $!");
-			convert_fh($self, $ifmt, $fh, $input);
+			my $m = $lei->{opt}->{'lock'} //
+					($ifmt eq 'eml' ? ['none'] :
+					PublicInbox::MboxLock->defaults);
+			my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
+			convert_fh($self, $ifmt, $mbl->{fh}, $input);
 		} elsif (-d _) {
 			PublicInbox::MdirReader::maildir_each_eml($input,
 							\&mdir_cb, $self);
@@ -109,6 +111,7 @@ sub call { # the main "lei convert" method
 
 			}
 			if (-f $input_path) {
+				require PublicInbox::MboxLock;
 				require PublicInbox::MboxReader;
 				PublicInbox::MboxReader->can($ifmt) or return
 					$lei->fail("$ifmt not supported");
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 7f247b64..c2c98030 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -80,10 +80,11 @@ sub call { # the main "lei import" method
 			my $ifmt = lc $1;
 			if (($fmt // $ifmt) ne $ifmt) {
 				return $lei->fail(<<"");
---format=$fmt and `$ifmt:' conflict
+--in-format=$fmt and `$ifmt:' conflict
 
 			}
 			if (-f $input_path) {
+				require PublicInbox::MboxLock;
 				require PublicInbox::MboxReader;
 				PublicInbox::MboxReader->can($ifmt) or return
 					$lei->fail("$ifmt not supported");
@@ -142,7 +143,7 @@ error reading $input: $!
 			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
 		}
 	};
-	$lei->child_error(1 << 8, "<stdin>: $@") if $@;
+	$lei->child_error(1 << 8, "$input: $@") if $@;
 }
 
 sub _import_maildir { # maildir_each_file cb
@@ -171,10 +172,10 @@ sub import_path_url {
 		$ifmt = lc $1;
 	}
 	if (-f $input) {
-		open my $fh, '<', $input or return $lei->child_error(1 << 8, <<"");
-unable to open $input: $!
-
-		_import_fh($lei, $fh, $input, $ifmt);
+		my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
+				PublicInbox::MboxLock->defaults);
+		my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
+		_import_fh($lei, $mbl->{fh}, $input, $ifmt);
 	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
 		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
 $input appears to a be a maildir, not $ifmt

^ permalink raw reply related	[relevance 54%]
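
Editorial sketch of the fallback used in both LeiConvert and LeiImport
above: an explicit --lock value wins, otherwise "eml" input skips
locking and mbox variants get the defaults.  pick_lock_methods is a
hypothetical name, and the default list is copied from
PublicInbox::MboxLock->defaults in patch 2/5 instead of calling it, so
the snippet runs without public-inbox installed.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the `//' fallback in the patch.
# $opt_lock is the arrayref from --lock (undef when not given),
# $ifmt is the input format ("eml", "mboxrd", ...).
sub pick_lock_methods {
	my ($opt_lock, $ifmt) = @_;
	$opt_lock // ($ifmt eq 'eml' ? ['none'] : [qw(fcntl dotlock)]);
}

print join(',', @{pick_lock_methods(undef, 'mboxrd')}), "\n";
print join(',', @{pick_lock_methods(undef, 'eml')}), "\n";
print join(',', @{pick_lock_methods(['flock'], 'mboxrd')}), "\n";
```

A lone "eml" file holds a single message, which is presumably why the
patch does not bother locking it by default.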

* [PATCH 2/5] lei q: support mbox locking by default
  2021-02-26  9:41 71% [PATCH 0/5] lei mbox locking Eric Wong
  2021-02-26  9:41 71% ` [PATCH 1/5] lei: style fix for $oldset declaration Eric Wong
@ 2021-02-26  9:41 36% ` Eric Wong
  2021-02-26  9:41 54% ` [PATCH 3/5] lei import|convert: support mbox locking on reads Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-26  9:41 UTC (permalink / raw)
  To: meta

While this diverges from mairix(1) behavior, it's the safer
option.  We'll follow Debian policy by supporting fcntl and
dotlocks by default (in that order).  Users who do not want
locking can use "--lock=none".

This will be used in a read-only capacity for watching
mailboxes for keyword updates via inotify or EVFILT_VNODE.
---
 MANIFEST                      |   2 +
 lib/PublicInbox/LEI.pm        |   2 +-
 lib/PublicInbox/LeiToMail.pm  |  16 +++--
 lib/PublicInbox/LeiXSearch.pm |   1 +
 lib/PublicInbox/MboxLock.pm   | 121 ++++++++++++++++++++++++++++++++++
 t/lei-q-remote-import.t       |  12 ++++
 t/mbox_lock.t                 |  90 +++++++++++++++++++++++++
 7 files changed, 239 insertions(+), 5 deletions(-)
 create mode 100644 lib/PublicInbox/MboxLock.pm
 create mode 100644 t/mbox_lock.t

diff --git a/MANIFEST b/MANIFEST
index 9cf33d48..11ec5c01 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -201,6 +201,7 @@ lib/PublicInbox/MIME.pm
 lib/PublicInbox/ManifestJsGz.pm
 lib/PublicInbox/Mbox.pm
 lib/PublicInbox/MboxGz.pm
+lib/PublicInbox/MboxLock.pm
 lib/PublicInbox/MboxReader.pm
 lib/PublicInbox/MdirReader.pm
 lib/PublicInbox/MiscIdx.pm
@@ -383,6 +384,7 @@ t/lei_to_mail.t
 t/lei_xsearch.t
 t/linkify.t
 t/main-bin/spamc
+t/mbox_lock.t
 t/mbox_reader.t
 t/mda-mime.eml
 t/mda.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 5cdaabc6..b5bdda21 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -112,7 +112,7 @@ our %CMD = ( # sorted in order of importance/use:
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+ augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
-	import-remote!
+	import-remote! lock=s@
 	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 630da67c..de640657 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -463,11 +463,19 @@ sub _pre_augment_mbox {
 	my ($self, $lei) = @_;
 	my $dst = $lei->{ovv}->{dst};
 	if ($dst ne '/dev/stdout') {
-		my $mode = -p $dst ? '>' : '+>>';
-		if (-f _ && !$lei->{opt}->{augment} and !unlink($dst)) {
-			$! == ENOENT or die "unlink($dst): $!";
+		my $out;
+		if (-p $dst) {
+			open $out, '>', $dst or die "open($dst): $!";
+		} elsif (-f _ || !-e _) {
+			require PublicInbox::MboxLock;
+			my $m = $lei->{opt}->{'lock'} //
+					PublicInbox::MboxLock->defaults;
+			$self->{mbl} = PublicInbox::MboxLock->acq($dst, 1, $m);
+			$out = $self->{mbl}->{fh};
+			if (!$lei->{opt}->{augment} and !truncate($out, 0)) {
+				die "truncate($dst): $!";
+			}
 		}
-		open my $out, $mode, $dst or die "open($dst): $!";
 		$lei->{old_1} = $lei->{1}; # keep for spawning MUA
 		$lei->{1} = $out;
 	}
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index eb015978..7ec696f4 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -338,6 +338,7 @@ Error closing $lei->{ovv}->{dst}: $!
 			$l2m->poke_dst;
 			$lei->poke_mua;
 		} else { # mbox users
+			delete $l2m->{mbl}; # drop dotlock
 			$lei->start_mua;
 		}
 	}
diff --git a/lib/PublicInbox/MboxLock.pm b/lib/PublicInbox/MboxLock.pm
new file mode 100644
index 00000000..4e2a2d9a
--- /dev/null
+++ b/lib/PublicInbox/MboxLock.pm
@@ -0,0 +1,121 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Various mbox locking methods
+package PublicInbox::MboxLock;
+use strict;
+use v5.10.1;
+use PublicInbox::OnDestroy;
+use Fcntl qw(:flock F_SETLK F_SETLKW F_RDLCK F_WRLCK
+			O_CREAT O_EXCL O_WRONLY SEEK_SET);
+use Carp qw(croak);
+use PublicInbox::DS qw(now); # ugh...
+
+our $TMPL = do {
+	if ($^O eq 'linux') { \'s @32' }
+	elsif ($^O =~ /bsd/) { \'@20 s @256' } # n.b. @32 may be enough...
+	else { eval { require File::FcntlLock; 1 } }
+};
+
+# This order matches Debian policy on Linux systems.
+# See policy/ch-customized-programs.rst in
+# https://salsa.debian.org/dbnpolicy/policy.git
+sub defaults { [ qw(fcntl dotlock) ] }
+
+sub acq_fcntl {
+	my ($self) = @_;
+	my $op = $self->{nb} ? F_SETLK : F_SETLKW;
+	my $t = $self->{rw} ? F_WRLCK : F_RDLCK;
+	my $end = now + $self->{timeout};
+	$TMPL or die <<EOF;
+"struct flock" layout not available on $^O, install File::FcntlLock?
+EOF
+	do {
+		if (ref $TMPL) {
+			return if fcntl($self->{fh}, $op, pack($$TMPL, $t));
+		} else {
+			my $fl = File::FcntlLock->new;
+			$fl->l_type($t);
+			$fl->l_whence(SEEK_SET);
+			$fl->l_start(0);
+			$fl->l_len(0);
+			return if $fl->lock($self->{fh}, $op);
+		}
+		select(undef, undef, undef, $self->{delay});
+	} while (now < $end);
+	croak "fcntl lock $self->{f}: $!";
+}
+
+sub acq_dotlock {
+	my ($self) = @_;
+	my $dot_lock = "$self->{f}.lock";
+	my ($pfx, $base) = ($self->{f} =~ m!(\A.*?/)([^/]+)\z!);
+	$pfx //= '';
+	my $pid = $$;
+	my $end = now + $self->{timeout};
+	do {
+		my $tmp = "$pfx.$base-".sprintf('%x,%x,%x',
+					rand(0xffffffff), $pid, time);
+		if (sysopen(my $fh, $tmp, O_CREAT|O_EXCL|O_WRONLY)) {
+			if (link($tmp, $dot_lock)) {
+				unlink($tmp) or die "unlink($tmp): $!";
+				$self->{".lock$pid"} = $dot_lock;
+				return;
+			}
+			unlink($tmp) or die "unlink($tmp): $!";
+			select(undef, undef, undef, $self->{delay});
+		} else {
+			croak "open $tmp (for $dot_lock): $!" if !$!{EXIST};
+		}
+	} while (now < $end);
+	croak "dotlock $dot_lock";
+}
+
+sub acq_flock {
+	my ($self) = @_;
+	my $op = $self->{rw} ? LOCK_EX : LOCK_SH;
+	$op |= LOCK_NB if $self->{nb};
+	my $end = now + $self->{timeout};
+	do {
+		return if flock($self->{fh}, $op);
+		select(undef, undef, undef, $self->{delay});
+	} while (now < $end);
+	croak "flock $self->{f}: $!";
+}
+
+sub acq {
+	my ($cls, $f, $rw, $methods) = @_;
+	my $fh;
+	unless (open $fh, $rw ? '+>>' : '<', $f) {
+		croak "open($f): $!" if $rw || !$!{ENOENT};
+	}
+	my $self = bless { f => $f, fh => $fh, rw => $rw }, $cls;
+	my $m = "@$methods";
+	if ($m ne 'none') {
+		my @m = map {
+			if (/\A(timeout|delay)=([0-9\.]+)s?\z/) {
+				$self->{$1} = $2 + 0;
+				();
+			} else {
+				$cls->can("acq_$_") // $_
+			}
+		} split(/[, ]/, $m);
+		my @bad = grep { !ref } @m;
+		croak "Unsupported lock methods: @bad\n" if @bad;
+		croak "No lock methods supplied with $m\n" if !@m;
+		$self->{nb} = $#m || defined($self->{timeout});
+		$self->{delay} //= 0.1;
+		$self->{timeout} //= 5;
+		$_->($self) for @m;
+	}
+	$self;
+}
+
+sub DESTROY {
+	my ($self) = @_;
+	if (my $f = $self->{".lock$$"}) {
+		unlink($f) or die "unlink($f): $! (lock stolen?)";
+	}
+}
+
+1;
diff --git a/t/lei-q-remote-import.t b/t/lei-q-remote-import.t
index f73524cf..4088b6ad 100644
--- a/t/lei-q-remote-import.t
+++ b/t/lei-q-remote-import.t
@@ -46,5 +46,17 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	unlink $o or BAIL_OUT $!;
 	lei_ok(@cmd);
 	ok(-f $o && !-s _, '--no-import-remote did not memoize');
+
+	open my $fh, '>', "$o.lock";
+	$cmd[-1] = 'm:qp@example.com';
+	unlink $o or BAIL_OUT $!;
+	lei_ok(@cmd, '--lock=none');
+	ok(-f $o && -s _, '--lock=none respected');
+	unlink $o or BAIL_OUT $!;
+	ok(!lei(@cmd, '--lock=dotlock,timeout=0.000001'), 'dotlock fails');
+	ok(-f $o && !-s _, 'nothing output on lock failure');
+	unlink "$o.lock" or BAIL_OUT $!;
+	lei_ok(@cmd, '--lock=dotlock,timeout=0.000001',
+		\'succeeds after lock removal');
 });
 done_testing;
diff --git a/t/mbox_lock.t b/t/mbox_lock.t
new file mode 100644
index 00000000..3dc3b449
--- /dev/null
+++ b/t/mbox_lock.t
@@ -0,0 +1,90 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use POSIX qw(_exit);
+use PublicInbox::DS qw(now);
+use Errno qw(EAGAIN);
+use_ok 'PublicInbox::MboxLock';
+my ($tmpdir, $for_destroy) = tmpdir();
+my $f = "$tmpdir/f";
+my $mbl = PublicInbox::MboxLock->acq($f, 1, ['dotlock']);
+ok(-f "$f.lock", 'dotlock created');
+undef $mbl;
+ok(!-f "$f.lock", 'dotlock gone');
+$mbl = PublicInbox::MboxLock->acq($f, 1, ['none']);
+ok(!-f "$f.lock", 'no dotlock with none');
+undef $mbl;
+
+eval {
+	PublicInbox::MboxLock->acq($f, 1, ['bogus']);
+        fail "should not succeed with `bogus'";
+};
+ok($@, "fails on `bogus' lock method");
+eval {
+	PublicInbox::MboxLock->acq($f, 1, ['timeout=1']);
+        fail "should not succeed with only timeout";
+};
+ok($@, "fails with only `timeout=' and no lock method");
+
+my $defaults = PublicInbox::MboxLock->defaults;
+is(ref($defaults), 'ARRAY', 'default lock methods');
+my $test_rw_lock = sub {
+	my ($func) = @_;
+	my $m = ["$func,timeout=0.000001"];
+	for my $i (1..2) {
+		pipe(my ($r, $w)) or BAIL_OUT "pipe: $!";
+		my $t0 = now;
+		my $pid = fork // BAIL_OUT "fork $!";
+		if ($pid == 0) {
+			eval { PublicInbox::MboxLock->acq($f, 1, $m) };
+			my $err = $@;
+			syswrite $w, "E: $err";
+			_exit($err ? 0 : 1);
+		}
+		undef $w;
+		waitpid($pid, 0);
+		is($?, 0, "$func r/w lock behaved as expected #$i");
+		my $d = now - $t0;
+		ok($d < 1, "$func r/w timeout #$i") or diag "elapsed=$d";
+		my $err = do { local $/; <$r> };
+		$! = EAGAIN;
+		my $msg = "$!";
+		like($err, qr/\Q$msg\E/, "got EAGAIN in child #$i");
+	}
+};
+
+my $test_ro_lock = sub {
+	my ($func) = @_;
+	for my $i (1..2) {
+		my $t0 = now;
+		my $pid = fork // BAIL_OUT "fork $!";
+		if ($pid == 0) {
+			eval { PublicInbox::MboxLock->acq($f, 0, [ $func ]) };
+			_exit($@ ? 1 : 0);
+		}
+		waitpid($pid, 0);
+		is($?, 0, "$func ro lock behaved as expected #$i");
+		my $d = now - $t0;
+		ok($d < 1, "$func timeout respected #$i") or diag "elapsed=$d";
+	}
+};
+
+SKIP: {
+	grep(/fcntl/, @$defaults) or skip 'File::FcntlLock not available', 1;
+	my $top = PublicInbox::MboxLock->acq($f, 1, $defaults);
+	ok($top, 'fcntl lock acquired');
+	$test_rw_lock->('fcntl');
+	undef $top;
+	$top = PublicInbox::MboxLock->acq($f, 0, $defaults);
+	ok($top, 'fcntl read lock acquired');
+	$test_ro_lock->('fcntl');
+}
+$mbl = PublicInbox::MboxLock->acq($f, 1, ['flock']);
+ok($mbl, 'flock acquired');
+$test_rw_lock->('flock');
+undef $mbl;
+$mbl = PublicInbox::MboxLock->acq($f, 0, ['flock']);
+$test_ro_lock->('flock');
+
+done_testing;

^ permalink raw reply related	[relevance 36%]
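
Editorial sketch of the dotlock idea in acq_dotlock above, reduced to a
single attempt: create a unique temporary file, then link(2) it to
"$f.lock".  link fails if the lock already exists, which makes it the
atomic step.  try_dotlock and its temporary file naming are
hypothetical simplifications; the real code retries until a timeout and
removes the lock in DESTROY.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_WRONLY);
use File::Temp qw(tempdir);

# Hypothetical, simplified dotlock: returns the lock path on success,
# undef if "$f.lock" is already held by somebody else.
sub try_dotlock {
	my ($f) = @_;
	my $dot_lock = "$f.lock";
	my $tmp = sprintf('%s.%x.%x.tmp', $f, $$, int(rand(0xffffffff)));
	sysopen(my $fh, $tmp, O_CREAT|O_EXCL|O_WRONLY) or
		die "open $tmp: $!";
	close $fh;
	# link(2) refuses to replace an existing name
	my $ok = link($tmp, $dot_lock);
	unlink($tmp) or die "unlink($tmp): $!";
	$ok ? $dot_lock : undef;
}

my $dir = tempdir(CLEANUP => 1);
my $f = "$dir/mbox";
my $held = try_dotlock($f);
print 'first attempt: ', ($held ? 'locked' : 'busy'), "\n";
print 'second attempt: ', (try_dotlock($f) ? 'locked' : 'busy'), "\n";
unlink($held) or die "unlink($held): $!";
print 'after release: ', (try_dotlock($f) ? 'locked' : 'busy'), "\n";
```

This is the same situation t/lei-q-remote-import.t sets up by creating
"$o.lock" by hand before running with --lock=dotlock and a tiny timeout.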

* [PATCH 0/3] doc: lei manpages, round 3
@ 2021-02-27 18:03 71% Kyle Meyer
  2021-02-27 18:03 48% ` [PATCH 1/3] doc: lei: update manpages Kyle Meyer
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Kyle Meyer @ 2021-02-27 18:03 UTC (permalink / raw)
  To: meta

This series updates the lei manpages, continuing from
<20210211040415.28557-1-kyle@kyleam.com>.  It covers changes up to the
current tip of master (f310a5054fb8e215..903eac79aa86d17c).

I didn't add a manpage for lei-convert, as I wasn't sure if that
should be considered mostly an internal tool for testing purposes.
Thoughts?

  [1/3] doc: lei: update manpages
  [2/3] doc: lei-import: drop markup of "stdin"
  [3/3] doc: lei-overview: add performance and bash completion sections

 Documentation/lei-import.pod   | 21 +++++++++++++++------
 Documentation/lei-overview.pod | 17 +++++++++++++++++
 Documentation/lei-q.pod        | 32 ++++++++++++++++++++++++--------
 Documentation/lei.pod          | 15 ++++++++++++++-
 Documentation/txt2pre          |  1 +
 5 files changed, 71 insertions(+), 15 deletions(-)


base-commit: 903eac79aa86d17c0b8f888d160d44977899515b
-- 
2.30.1


^ permalink raw reply	[relevance 71%]

* [PATCH 1/3] doc: lei: update manpages
  2021-02-27 18:03 71% [PATCH 0/3] doc: lei manpages, round 3 Kyle Meyer
@ 2021-02-27 18:03 48% ` Kyle Meyer
  2021-02-27 18:03 71% ` [PATCH 2/3] doc: lei-import: drop markup of "stdin" Kyle Meyer
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-27 18:03 UTC (permalink / raw)
  To: meta

Catch up with recent developments.
---
 Documentation/lei-import.pod | 19 ++++++++++++++-----
 Documentation/lei-q.pod      | 32 ++++++++++++++++++++++++--------
 Documentation/lei.pod        | 15 ++++++++++++++-
 Documentation/txt2pre        |  1 +
 4 files changed, 53 insertions(+), 14 deletions(-)

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index ef20e2f6305cf6ff..7d5b2576808fdb61 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -11,12 +11,14 @@ lei import [OPTIONS] --stdin
 =head1 DESCRIPTION
 
 Import messages into the local storage of L<lei(1)>.  C<LOCATION> is a
-source of messages: a directory (Maildir) or a file.  For a regular
-file, the location must have a C<E<lt>formatE<gt>:> prefix specifying
-one of the following formats: C<eml>, C<mboxrd>, C<mboxcl2>,
-C<mboxcl>, or C<mboxo>.
+source of messages: a directory (Maildir), a file, or a URL
+(C<imap://>, C<imaps://>, C<nntp://>, or C<nntps://>).  URLs requiring
+authentication must use L<netrc(5)> and/or L<git-credential(1)> to
+fill in the username and password.
 
-TODO: Update when URL support is added.
+For a regular file, the location must have a C<E<lt>formatE<gt>:>
+prefix specifying one of the following formats: C<eml>, C<mboxrd>,
+C<mboxcl2>, C<mboxcl>, or C<mboxo>.
 
 =head1 OPTIONS
 
@@ -31,6 +33,13 @@ format prefix with C<LOCATION> is preferred.
 
 Read messages from stdin.
 
+=item --lock
+
+L<mbox(5)> locking method(s) to use: C<dotlock>, C<fcntl>, C<flock> or
+C<none>.
+
+Default: fcntl,dotlock
+
 =item --no-kw, --no-keywords, --no-flags
 
 Don't import message keywords (or "flags" in IMAP terminology).
diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 0959beac38504841..e878157d93e2f7c5 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -26,18 +26,22 @@ Read search terms from stdin.
 
 =item -o MFOLDER, --output=MFOLDER, --mfolder=MFOLDER
 
-Destination for results (e.g., C<path/to/Maildir> or
-C<mboxcl2:path/to/mbox>).  The format can be specified by adding a
-C<E<lt>formatE<gt>:> prefix with any of these values: C<maildir>,
+Destination for results (e.g., C<path/to/Maildir>,
+C<imaps://user@mail.example.com/INBOX.test>, or
+C<mboxcl2:path/to/mbox>).  The prefix may be a supported protocol:
+C<imap://>, C<imaps://>, C<nntp://>, or C<nntps://>.  URLs requiring
+authentication must use L<netrc(5)> and/or L<git-credential(1)> to
+fill in the username and password.
+
+The prefix can instead specify the format of the output: C<maildir>,
 C<mboxrd>, C<mboxcl2>, C<mboxcl>, C<mboxo>, C<json>, C<jsonl>, or
-C<concatjson>.
+C<concatjson>.  When a format isn't specified, it's chosen based on
+the destination.  C<json> is used for the default destination
+(stdout), and C<maildir> is used for an existing directory or
+non-existing path.
 
 TODO: Provide description of formats?
 
-When a format isn't specified, it's chosen based on the destination.
-C<json> is used for the default destination (stdout), and C<maildir>
-is used for an existing directory or non-existing path.
-
 Default: -
 
 =item -f FORMAT, --format=FORMAT
@@ -130,6 +134,18 @@ multiple times, in which case the search uses only the specified set.
 Do not match locations using C<*?> wildcards and C<[]> ranges.  This
 option applies to C<--include>, C<--exclude>, and C<--only>.
 
+=item --no-import-remote
+
+Disable the default behavior of memoizing remote messages into the
+local store.
+
+=item --lock
+
+L<mbox(5)> locking method(s) to use: C<dotlock>, C<fcntl>, C<flock> or
+C<none>.
+
+Default: fcntl,dotlock
+
 =item -NUMBER, -n NUMBER, --limit=NUMBER
 
 Limit the number of matches.
diff --git a/Documentation/lei.pod b/Documentation/lei.pod
index 9ce9e9a4dc6cde81..e1502122571ad521 100644
--- a/Documentation/lei.pod
+++ b/Documentation/lei.pod
@@ -4,7 +4,7 @@ lei - local email interface for public-inbox
 
 =head1 SYNOPSIS
 
-lei COMMAND
+lei [OPTIONS] COMMAND
 
 =head1 DESCRIPTION
 
@@ -19,6 +19,19 @@ indices).
 
 Available in public-inbox 1.7.0+.
 
+=head1 OPTIONS
+
+=over
+
+=item -C DIR
+
+Change current working directory to the specified directory before
+running the command.  This option can be given before or after
+C<COMMAND> and is accepted by all lei subcommands except
+L<lei-daemon-kill(1)>.
+
+=back
+
 =head1 COMMANDS
 
 Subcommands for initializing and managing local, writable storage:
diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index 8421cad74e7b4321..3277531f9122b8d6 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -102,6 +102,7 @@ $xurls{'git-filter-repo(1)'} = 'https://github.com/newren/git-filter-repo'.
 $xurls{'ssoma(1)'} = 'https://ssoma.public-inbox.org/ssoma.txt';
 $xurls{'cgitrc(5)'} = 'https://git.zx2c4.com/cgit/tree/cgitrc.5.txt';
 $xurls{'prove(1)'} = 'https://perldoc.perl.org/prove.html';
+$xurls{'mbox(5)'} = 'https://manpages.debian.org/stable/mutt/mbox.5.en.html';
 
 my $str = do { local $/; <STDIN> };
 my ($title) = ($str =~ /\A([^\n]+)/);
-- 
2.30.1


^ permalink raw reply related	[relevance 48%]

* [PATCH 2/3] doc: lei-import: drop markup of "stdin"
  2021-02-27 18:03 71% [PATCH 0/3] doc: lei manpages, round 3 Kyle Meyer
  2021-02-27 18:03 48% ` [PATCH 1/3] doc: lei: update manpages Kyle Meyer
@ 2021-02-27 18:03 71% ` Kyle Meyer
  2021-02-27 18:03 70% ` [PATCH 3/3] doc: lei-overview: add performance and bash completion sections Kyle Meyer
  2021-02-27 20:20 71% ` [PATCH 0/3] doc: lei manpages, round 3 Eric Wong
  3 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-27 18:03 UTC (permalink / raw)
  To: meta

stdin isn't placed in C<> elsewhere.
---
 Documentation/lei-import.pod | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index 7d5b2576808fdb61..7f4e37452dcb0671 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -26,7 +26,7 @@ C<mboxcl2>, C<mboxcl>, or C<mboxo>.
 
 =item -F MAIL_FORMAT, --in-format=MAIL_FORMAT
 
-Message input format.  Unless messages are given on C<stdin>, using a
+Message input format.  Unless messages are given on stdin, using a
 format prefix with C<LOCATION> is preferred.
 
 =item --stdin
-- 
2.30.1


^ permalink raw reply related	[relevance 71%]

* [PATCH 3/3] doc: lei-overview: add performance and bash completion sections
  2021-02-27 18:03 71% [PATCH 0/3] doc: lei manpages, round 3 Kyle Meyer
  2021-02-27 18:03 48% ` [PATCH 1/3] doc: lei: update manpages Kyle Meyer
  2021-02-27 18:03 71% ` [PATCH 2/3] doc: lei-import: drop markup of "stdin" Kyle Meyer
@ 2021-02-27 18:03 70% ` Kyle Meyer
  2021-02-27 20:20 71% ` [PATCH 0/3] doc: lei manpages, round 3 Eric Wong
  3 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-27 18:03 UTC (permalink / raw)
  To: meta

Take care of a couple of the items mentioned at
<https://public-inbox.org/meta/20210218202818.GA19443@dcvr>.
---
 Documentation/lei-overview.pod | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/Documentation/lei-overview.pod b/Documentation/lei-overview.pod
index 62b62280ad2ddd69..c3379caa1bd15d69 100644
--- a/Documentation/lei-overview.pod
+++ b/Documentation/lei-overview.pod
@@ -71,6 +71,23 @@ file by invoking C<mutt -f %f>.
 
 =back
 
+=head1 PERFORMANCE NOTES
+
+L<Inline::C> is recommended for performance.  To enable it, create
+C<~/.cache/public-inbox/inline-c/>.
+
+If Socket::MsgHdr is installed (libsocket-msghdr-perl in Debian), the
+first invocation of lei starts a daemon, reducing the startup cost of
+for future invocations (which is particularly important for Bash
+completion).
+
+=head1 BASH COMPLETION
+
+Preliminary Bash completion for lei is provided in
+C<contrib/completion/>.  Contributions adding support for other
+shells, as well as improvements to the existing Bash completion, are
+welcome.
+
 =head1 CONTACT
 
 Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
-- 
2.30.1


^ permalink raw reply related	[relevance 70%]

* Re: [PATCH 0/3] doc: lei manpages, round 3
  2021-02-27 18:03 71% [PATCH 0/3] doc: lei manpages, round 3 Kyle Meyer
                   ` (2 preceding siblings ...)
  2021-02-27 18:03 70% ` [PATCH 3/3] doc: lei-overview: add performance and bash completion sections Kyle Meyer
@ 2021-02-27 20:20 71% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-27 20:20 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Thanks, pushed as commit b4603182d50456dc410093efd4d921078559fd42

^ permalink raw reply	[relevance 71%]

* [PATCH 0/3] lei p2q (patch-to-query)
@ 2021-02-28 12:25 71% Eric Wong
  2021-02-28 12:25 51% ` [PATCH 1/3] lei p2q: patch-to-query generator for "lei q --stdin" Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2021-02-28 12:25 UTC (permalink / raw)
  To: meta

Pipes (the *nix kind) are good.

Eric Wong (3):
  lei p2q: patch-to-query generator for "lei q --stdin"
  lei q: fix "-" shortcut for --stdin
  lei q: improve early aborts w/ remote externals

 MANIFEST                      |   2 +
 lib/PublicInbox/LEI.pm        |  40 ++++++-
 lib/PublicInbox/LeiImport.pm  |   3 +-
 lib/PublicInbox/LeiP2q.pm     | 197 ++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiXSearch.pm |   4 +-
 t/lei-externals.t             |  49 +++++++--
 t/lei-p2q.t                   |  29 +++++
 t/lei-q-thread.t              |   3 +-
 8 files changed, 313 insertions(+), 14 deletions(-)
 create mode 100644 lib/PublicInbox/LeiP2q.pm
 create mode 100644 t/lei-p2q.t

^ permalink raw reply	[relevance 71%]

* [PATCH 2/3] lei q: fix "-" shortcut for --stdin
  2021-02-28 12:25 71% [PATCH 0/3] lei p2q (patch-to-query) Eric Wong
  2021-02-28 12:25 51% ` [PATCH 1/3] lei p2q: patch-to-query generator for "lei q --stdin" Eric Wong
@ 2021-02-28 12:25 64% ` Eric Wong
  2021-02-28 12:25 52% ` [PATCH 3/3] lei q: improve early aborts w/ remote externals Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-28 12:25 UTC (permalink / raw)
  To: meta

Due to the way our option parser handles this special case, it
must be the first option spec.  This even helps us document
things better, since many commands accept either a pathname or
--stdin|-.
---
 lib/PublicInbox/LEI.pm | 7 ++++---
 t/lei-q-thread.t       | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index a2f8ffe7..f5e42869 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -108,10 +108,11 @@ sub index_opt {
 # see lei__complete() and PublicInbox::LeiHelp
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
-'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms', qw(
-	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+ augment|a
+'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms',
+	'stdin|', # /|\z/ must be first for lone dash
+	qw(save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+
 	sort|s=s reverse|r offset=i remote! local! external! pretty
-	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
+	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g augment|a
 	import-remote! lock=s@
 	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
diff --git a/t/lei-q-thread.t b/t/lei-q-thread.t
index 66db28a9..0ddf47a6 100644
--- a/t/lei-q-thread.t
+++ b/t/lei-q-thread.t
@@ -21,7 +21,8 @@ test_lei(sub {
 	$buf = PublicInbox::LeiToMail::eml2mboxrd($eml, { kw => ['draft'] });
 	lei_ok([qw(import -F mboxrd -)], undef, { 0 => $buf, %$lei_opt });
 
-	lei_ok qw(q -t m:testmessage@example.com);
+	lei_ok([qw(q - -t)], undef,
+		{ 0 => \'m:testmessage@example.com', %$lei_opt });
 	$res = json_utf8->decode($lei_out);
 	is(scalar(@$res), 3, 'got 2 results');
 	pop @$res;

^ permalink raw reply related	[relevance 64%]
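
Editorial sketch of accepting a lone "-" as a shortcut for --stdin.  It
uses the empty option name ('') that Getopt::Long documents for the
lone dash; how lei's own parser turns the trailing "|" in 'stdin|' into
that behavior is not shown in this thread, so treat the mapping as an
assumption.  parse_q_args is a hypothetical name.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical parser: "--stdin" and a lone "-" share one variable.
sub parse_q_args {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv,
		'stdin' => \$opt{stdin},
		'' => \$opt{stdin}, # lone dash
		'threads|t+' => \$opt{threads},
	) or die "bad options\n";
	(\%opt, \@argv); # parsed options and leftover arguments
}

# Same argument order as the updated t/lei-q-thread.t: "q - -t"
my ($opt, $rest) = parse_q_args(qw(- -t));
print 'stdin=', ($opt->{stdin} ? 1 : 0),
	' threads=', ($opt->{threads} // 0), "\n";
```

With the dash accepted as an option, search terms can then be read
from standard input instead of the command line.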

* [PATCH 3/3] lei q: improve early aborts w/ remote externals
  2021-02-28 12:25 71% [PATCH 0/3] lei p2q (patch-to-query) Eric Wong
  2021-02-28 12:25 51% ` [PATCH 1/3] lei p2q: patch-to-query generator for "lei q --stdin" Eric Wong
  2021-02-28 12:25 64% ` [PATCH 2/3] lei q: fix "-" shortcut for --stdin Eric Wong
@ 2021-02-28 12:25 52% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-28 12:25 UTC (permalink / raw)
  To: meta

We must issue LeiStore->done if a client disconnects
while we're streaming from a remote external.  This
can happen via SIGPIPE, or if a client process is
interrupted by any other means.
---
 lib/PublicInbox/LEI.pm        |  3 +++
 lib/PublicInbox/LeiImport.pm  |  3 ++-
 lib/PublicInbox/LeiXSearch.pm |  4 +--
 t/lei-externals.t             | 49 ++++++++++++++++++++++++++++++-----
 4 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f5e42869..834e399f 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -970,6 +970,9 @@ sub dclose {
 		}
 	}
 	close(delete $self->{1}) if $self->{1}; # may reap_compress
+	if (my $sto = delete $self->{sto}) {
+		$sto->ipc_do('done');
+	}
 	$self->close if $self->{sock}; # PublicInbox::DS::close
 }
 
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index c2c98030..23cecd53 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -18,7 +18,8 @@ sub import_done_wait { # dwaitpid callback
 	my ($arg, $pid) = @_;
 	my ($imp, $lei) = @$arg;
 	$lei->child_error($?, 'non-fatal errors during import') if $?;
-	my $ign = $lei->{sto}->ipc_do('done'); # PublicInbox::LeiStore::done
+	my $sto = delete $lei->{sto};
+	my $wait = $sto->ipc_do('done') if $sto; # PublicInbox::LeiStore::done
 	$lei->dclose;
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 9a6457d7..d4607e16 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -349,7 +349,7 @@ Error closing $lei->{ovv}->{dst}: $!
 
 sub do_post_augment {
 	my ($lei) = @_;
-	my $l2m = $lei->{l2m} or die 'BUG: unexpected do_post_augment';
+	my $l2m = $lei->{l2m} or return; # client disconnected
 	my $err;
 	eval { $l2m->post_augment($lei) };
 	$err = $@;
@@ -368,7 +368,7 @@ sub do_post_augment {
 
 sub incr_post_augment { # called whenever an l2m shard finishes augment
 	my ($lei) = @_;
-	my $l2m = $lei->{l2m} or die 'BUG: unexpected incr_post_augment';
+	my $l2m = $lei->{l2m} or return; # client disconnected
 	return if ++$lei->{nr_post_augment} != $l2m->{-wq_nr_workers};
 	do_post_augment($lei);
 }
diff --git a/t/lei-externals.t b/t/lei-externals.t
index d422a9d1..b78b5580 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -6,7 +6,8 @@ use Fcntl qw(SEEK_SET);
 use PublicInbox::Spawn qw(which);
 use PublicInbox::OnDestroy;
 require_git 2.6;
-require_mods(qw(DBD::SQLite Search::Xapian));
+require_mods(qw(json DBD::SQLite Search::Xapian));
+use POSIX qw(WTERMSIG WIFSIGNALED SIGPIPE);
 
 my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
 	http://czquwvybam4bgbro.onion/meta/
@@ -15,19 +16,55 @@ my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
 my $test_external_remote = sub {
 	my ($url, $k) = @_;
 SKIP: {
-	my $nr = 5;
-	skip "$k unset", $nr if !$url;
-	which('curl') or skip 'no curl', $nr;
-	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
+	skip "$k unset", 1 if !$url;
+	state $curl = which('curl');
+	$curl or skip 'no curl', 1;
+	which('torsocks') or skip 'no torsocks', 1 if $url =~ m!\.onion/!;
 	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
 	my @cmd = ('q', '--only', $url, '-q', "m:$mid");
 	lei_ok(@cmd, \"query $url");
 	is($lei_err, '', "no errors on $url");
 	my $res = json_utf8->decode($lei_out);
-	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
+	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url") or
+		skip 'further remote tests', 1;
 	lei_ok(@cmd, 'd:..20101002', \'no results, no error');
 	is($lei_err, '', 'no output on 404, matching local FS behavior');
 	is($lei_out, "[null]\n", 'got null results');
+	my ($pid_before, $pid_after);
+	if (-d $ENV{XDG_RUNTIME_DIR} && -w _) {
+		lei_ok 'daemon-pid';
+		chomp($pid_before = $lei_out);
+		ok($pid_before, 'daemon is live');
+	}
+	for my $out ([], [qw(-f mboxcl2)]) {
+		pipe(my ($r, $w)) or BAIL_OUT $!;
+		open my $err, '+>', undef or BAIL_OUT $!;
+		my $opt = { run_mode => 0, 1 => $w, 2 => $err };
+		my $cmd = [qw(lei q -qt), @$out, 'bytes:1..'];
+		my $tp = start_script($cmd, undef, $opt);
+		close $w;
+		sysread($r, my $buf, 1);
+		close $r; # trigger SIGPIPE
+		$tp->join;
+		ok(WIFSIGNALED($?), "signaled @$out");
+		is(WTERMSIG($?), SIGPIPE, "got SIGPIPE @$out");
+		seek($err, 0, 0);
+		my @err = grep(!m{mkdir .*sun_path\b}, <$err>);
+		is_deeply(\@err, [], "no errors @$out");
+	}
+	if (-d $ENV{XDG_RUNTIME_DIR} && -w _) {
+		lei_ok 'daemon-pid';
+		chomp(my $pid_after = $lei_out);
+		is($pid_after, $pid_before, 'pid unchanged') or
+			skip 'daemon died', 1;
+		lei_ok 'daemon-kill';
+		my $alive = 1;
+		for (1..100) {
+			$alive = kill(0, $pid_after) or last;
+			tick();
+		}
+		ok(!$alive, 'daemon-kill worked');
+	}
 } # /SKIP
 }; # /sub
 

^ permalink raw reply related	[relevance 52%]

* [PATCH 1/3] lei p2q: patch-to-query generator for "lei q --stdin"
  2021-02-28 12:25 71% [PATCH 0/3] lei p2q (patch-to-query) Eric Wong
@ 2021-02-28 12:25 51% ` Eric Wong
  2021-02-28 21:40 90%   ` Kyle Meyer
  2021-02-28 12:25 64% ` [PATCH 2/3] lei q: fix "-" shortcut for --stdin Eric Wong
  2021-02-28 12:25 52% ` [PATCH 3/3] lei q: improve early aborts w/ remote externals Eric Wong
  2 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-28 12:25 UTC (permalink / raw)
  To: meta

Instead of teaching the to-be-implemented "lei show" to search
threads/messages based on commits, this orthogonal sub-command is
designed to generate queries for use with "lei q --stdin".

URI-escaped query parameters may be generated with --uri for
HTTP(S) public-inbox instances, but otherwise the output is
designed for "lei q --stdin".

To find threads for a given git commit from a git worktree:

	lei p2q $COMMIT_OID | lei q --stdin -t ...

It can also read via --stdin|-:

	curl $INBOX_URL/$MSGID/raw | lei p2q - | lei q --stdin -t

Or from the filesystem:

	lei p2q $(git format-patch -1) | lei q --stdin -t

This defaults to only generating "dfpost:"-prefixed terms since
I've found those most useful for finding messages relating to a
commit.  This is subject to change.

--want=s@ is a comma-separated or multi-value list of prefixes
that defaults to "dfpost7".  Not all are implemented, yet, but
s, dfn, dfpre, and dfpost all seem to mostly work.  Phrase
handling may need to be tweaked to work with Xapian.

OR, XOR, AND, NOT, NEAR, and ADJ may be used with --want
(e.g. --want=dfpost,OR,dfn)

Prefixing the field prefix with '+' or '-' (e.g. --want=+dfpost)
generates "+dfpost:$EXTRACTED_OID" for Xapian.  For non-boolean
search prefixes, wildcard (*) may also be supplied: (--want=dfn*)
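
As a rough illustration of how a single --want entry breaks down, the sketch below separates operators, the optional '+'/'-' sign, the field prefix, a trailing wildcard, and the numeric suffix described in the next paragraph.  parse_want and its hash keys are names invented for this example, and it does not suppress wildcards for boolean prefixes the way the real code does.

```perl
use strict;
use warnings;

# Split one --want entry into its parts.  Returns a hash ref; the key
# names (op, sign, pfx, star, min_len) are made up for this sketch.
sub parse_want {
	my ($want) = @_;
	# Xapian query operators pass through untouched
	if ($want =~ m!\A(?:OR|XOR|AND|NOT)\z! ||
			$want =~ m!\A(?:ADJ|NEAR)(?:/[0-9]+)?\z!) {
		return { op => $want };
	}
	my $sign = ($want =~ s/\A([\+\-])//) ? $1 : '';
	my $end = ($want =~ s/([0-9\*]+)\z//) ? $1 : '';
	my $star = ($end =~ tr/*//d) ? '*' : ''; # also strips '*' from $end
	my $min_len = length($end) ? $end + 0 : 0;
	return { sign => $sign, pfx => $want, star => $star,
		min_len => $min_len };
}

for my $w (qw(+dfpost dfn* dfpost7 OR)) {
	my $p = parse_want($w);
	print "$w => ", join(' ', map { "$_=$p->{$_}" } sort keys %$p), "\n";
}
```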

For boolean search prefixes, suffixing the field prefix with a
digit (e.g. --want=dfpost7) provides a minimum length, allowing
truncated variations to be searched.  This is helpful for
finding older messages as git chooses longer dfpost|dfpre
abbreviations as repos get larger.
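
The truncation described above can be sketched in isolation as follows.  expand_min_len is a hypothetical helper name; the OID is the one from t/data/0001.patch used in the test below.

```perl
use strict;
use warnings;

# Expand one abbreviated OID into "PFX:OID OR PFX:SHORTER ..." terms,
# stopping once the abbreviation is down to $min_len characters.
sub expand_min_len {
	my ($pfx, $oid, $min_len) = @_;
	my @terms = ("$pfx:$oid");
	while (length($oid) > $min_len) {
		chop $oid; # drop the last hex digit
		push @terms, 'OR', "$pfx:$oid";
	}
	return @terms;
}

print join(' ', expand_min_len('dfpost', '6e006fd73b1d', 9)), "\n";
```

With a minimum length of 9, the 12-character OID yields four OR-ed terms, which should line up with the --want=dfpost9 expectation in t/lei-p2q.t.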

Automatic date range generation is not implemented, yet.
---
 MANIFEST                  |   2 +
 lib/PublicInbox/LEI.pm    |  30 +++++-
 lib/PublicInbox/LeiP2q.pm | 197 ++++++++++++++++++++++++++++++++++++++
 t/lei-p2q.t               |  29 ++++++
 4 files changed, 257 insertions(+), 1 deletion(-)
 create mode 100644 lib/PublicInbox/LeiP2q.pm
 create mode 100644 t/lei-p2q.t

diff --git a/MANIFEST b/MANIFEST
index 11ec5c01..5044e21c 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -187,6 +187,7 @@ lib/PublicInbox/LeiHelp.pm
 lib/PublicInbox/LeiImport.pm
 lib/PublicInbox/LeiMirror.pm
 lib/PublicInbox/LeiOverview.pm
+lib/PublicInbox/LeiP2q.pm
 lib/PublicInbox/LeiQuery.pm
 lib/PublicInbox/LeiSearch.pm
 lib/PublicInbox/LeiStore.pm
@@ -373,6 +374,7 @@ t/lei-import-maildir.t
 t/lei-import-nntp.t
 t/lei-import.t
 t/lei-mirror.t
+t/lei-p2q.t
 t/lei-q-remote-import.t
 t/lei-q-thread.t
 t/lei.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0da24499..a2f8ffe7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -179,6 +179,9 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s quiet|q
 	lock=s@ kw|keywords|flags! C=s@),
 	],
+'p2q' => [ 'FILE|COMMIT_OID|--stdin',
+	"use a patch to generate a query for `lei q --stdin'",
+	qw(stdin| want|w=s@ uri debug) ],
 'config' => [ '[...]', sub {
 		'git-config(1) wrapper for '._config_path($_[0]);
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
@@ -238,6 +241,10 @@ my %OPTDESC = (
 'show	threads|t' => 'display entire thread a message belongs to',
 'q	threads|t+' =>
 	'return all messages in the same threads as the actual match(es)',
+
+'want|w=s@' => [ 'PREFIX|dfpost|dfn', # common ones in help...
+		'search prefixes to extract (default: dfpost7)' ],
+
 'alert=s@' => ['CMD,:WINCH,:bell,<any command>',
 	'run command(s) or perform ops when done writing to output ' .
 	'(default: ":WINCH,:bell" with --mua and Maildir/IMAP output, ' .
@@ -331,7 +338,7 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
-my @WQ_KEYS = qw(lxs l2m imp mrr cnv); # internal workers
+my @WQ_KEYS = qw(lxs l2m imp mrr cnv p2q); # internal workers
 
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
@@ -673,6 +680,11 @@ sub lei_convert {
 	PublicInbox::LeiConvert->call(@_);
 }
 
+sub lei_p2q {
+	require PublicInbox::LeiP2q;
+	PublicInbox::LeiP2q->call(@_);
+}
+
 sub lei_init {
 	my ($self, $dir) = @_;
 	my $cfg = _lei_cfg($self, 1);
@@ -854,6 +866,22 @@ sub poke_mua { # forces terminal MUAs to wake up and hopefully notice new mail
 	}
 }
 
+my %path_to_fd = ('/dev/stdin' => 0, '/dev/stdout' => 1, '/dev/stderr' => 2);
+$path_to_fd{"/dev/fd/$_"} = $path_to_fd{"/proc/self/fd/$_"} for (0..2);
+sub fopen {
+	my ($self, $mode, $path) = @_;
+	rel2abs($self, $path);
+	$path =~ tr!/!/!s;
+	if (defined(my $fd = $path_to_fd{$path})) {
+		return $self->{$fd};
+	}
+	if ($path =~ m!\A/(?:dev|proc/self)/fd/[0-9]+\z!) {
+		return fail($self, "cannot open $path from daemon");
+	}
+	open my $fh, $mode, $path or return;
+	$fh;
+}
+
 # caller needs to "-t $self->{1}" to check if tty
 sub start_pager {
 	my ($self) = @_;
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
new file mode 100644
index 00000000..d1dd125e
--- /dev/null
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -0,0 +1,197 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# front-end for the "lei patch-to-query" sub-command
+package PublicInbox::LeiP2q;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::Eml;
+use PublicInbox::Smsg;
+use PublicInbox::MsgIter qw(msg_part_text);
+use PublicInbox::Git qw(git_unquote);
+use PublicInbox::Spawn qw(popen_rd);
+use URI::Escape qw(uri_escape_utf8);
+
+sub xphrase ($) {
+	my ($s) = @_;
+	return () unless $s =~ /\S/;
+	# cf. xapian-core/queryparser/queryparser.lemony
+	# [\./:\\\@] - is_phrase_generator (implicit phrase search)
+	# FIXME not really sure about these..., we basically want to
+	# extract the longest phrase possible that Xapian can handle
+	map {
+		s/\A\s*//;
+		s/\s+\z//;
+		/[\|=><,\sA-Z]/ && !m![\./:\\\@]! ? qq("$_") : $_;
+	} ($s =~ m!(\w[\|=><,\./:\\\@\-\w\s]+)!g);
+}
+
+sub extract_terms { # eml->each_part callback
+	my ($p, $lei) = @_;
+	my $part = $p->[0]; # ignore $depth and @idx;
+	my $ct = $part->content_type || 'text/plain';
+	my ($s, undef) = msg_part_text($part, $ct);
+	defined $s or return;
+	my $in_diff;
+	# TODO: b: nq: q:
+	for (split(/\n/, $s)) {
+		if ($in_diff && s/^ //) { # diff context
+			push @{$lei->{qterms}->{dfctx}}, xphrase($_);
+		} elsif (/^-- $/) { # email signature begins
+			$in_diff = undef;
+		} elsif (m!^diff --git "?[^/]+/.+ "?[^/]+/.+\z!) {
+			# wait until "---" and "+++" to capture filenames
+			$in_diff = 1;
+		} elsif (/^index ([a-f0-9]+)\.\.([a-f0-9]+)\b/) {
+			my ($oa, $ob) = ($1, $2);
+			push @{$lei->{qterms}->{dfpre}}, $oa;
+			push @{$lei->{qterms}->{dfpost}}, $ob;
+			# who uses dfblob?
+		} elsif (m!^(?:---|\+{3}) ("?[^/]+/.+)!) {
+			my $fn = (split(m!/!, git_unquote($1.''), 2))[1];
+			push @{$lei->{qterms}->{dfn}}, xphrase($fn);
+		} elsif ($in_diff && s/^\+//) { # diff added
+			push @{$lei->{qterms}->{dfb}}, xphrase($_);
+		} elsif ($in_diff && s/^-//) { # diff removed
+			push @{$lei->{qterms}->{dfa}}, xphrase($_);
+		} elsif (/^@@ (?:\S+) (?:\S+) @@\s*(\S+.*)/) {
+			push @{$lei->{qterms}->{dfhh}}, xphrase($1);
+		} elsif (/^(?:dis)similarity index/ ||
+				/^(?:old|new) mode/ ||
+				/^(?:deleted|new) file mode/ ||
+				/^(?:copy|rename) (?:from|to) / ||
+				/^(?:dis)?similarity index / ||
+				/^\\ No newline at end of file/ ||
+				/^Binary files .* differ/) {
+		} elsif ($_ eq '') {
+			# possible to be in diff context, some mail may be
+			# stripped by MUA or even GNU diff(1).  "git apply"
+			# treats a bare "\n" as diff context, too
+		} else {
+			$in_diff = undef;
+		}
+	}
+}
+
+my %pfx2smsg = (
+	t => [ qw(to) ],
+	c => [ qw(cc) ],
+	f => [ qw(from) ],
+	tc => [ qw(to cc) ],
+	tcf => [ qw(to cc from) ],
+	a => [ qw(to cc from) ],
+	s => [ qw(subject) ],
+	bs => [ qw(subject) ], # body handled elsewhere
+	d => [ qw(ds) ], # nonsense?
+	dt => [ qw(ds) ], # ditto...
+	rt => [ qw(ts) ], # ditto...
+);
+
+sub do_p2q { # via wq_do
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	my $want = $lei->{opt}->{want} // [ qw(dfpost7) ];
+	my @want = split(/[, ]+/, "@$want");
+	for (@want) {
+		/\A(?:(d|dt|rt):)?([0-9]+)(\.(?:day|weeks)s?)?\z/ or next;
+		my ($pfx, $n, $unit) = ($1, $2, $3);
+		$n *= 86400 * ($unit =~ /week/i ? 7 : 1);
+		$_ = [ $pfx, $n ];
+	}
+	my $smsg = bless {}, 'PublicInbox::Smsg';
+	my $in = $self->{0};
+	unless ($in) {
+		my $input = $self->{input};
+		if (-e $input) {
+			$in = $lei->fopen('<', $input) or
+				return $lei->fail("open < $input: $!");
+		} else {
+			my @cmd = (qw(git format-patch --stdout -1), $input);
+			$in = popen_rd(\@cmd, undef, { 2 => $lei->{2} });
+		}
+	};
+	my $eml = PublicInbox::Eml->new(\(do { local $/; <$in> }));
+	$lei->{diff_want} = +{ map { $_ => 1 } @want };
+	$smsg->populate($eml);
+	while (my ($pfx, $fields) = each %pfx2smsg) {
+		next unless $lei->{diff_want}->{$pfx};
+		for my $f (@$fields) {
+			my $v = $smsg->{$f} // next;
+			push @{$lei->{qterms}->{$pfx}}, xphrase($v);
+		}
+	}
+	$eml->each_part(\&extract_terms, $lei, 1);
+	if ($lei->{opt}->{debug}) {
+		my $json = ref(PublicInbox::Config->json)->new;
+		$json->utf8->canonical->pretty;
+		$lei->err($json->encode($lei->{qterms}));
+	}
+	my (@q, %seen);
+	for my $pfx (@want) {
+		if (ref($pfx) eq 'ARRAY') {
+			my ($p, $t_range) = @$pfx; # TODO
+
+		} elsif ($pfx =~ m!\A(?:OR|XOR|AND|NOT)\z! ||
+				$pfx =~ m!\A(?:ADJ|NEAR)(?:/[0-9]+)?\z!) {
+			push @q, $pfx;
+		} else {
+			my $plusminus = ($pfx =~ s/\A([\+\-])//) ? $1 : '';
+			my $end = ($pfx =~ s/([0-9\*]+)\z//) ? $1 : '';
+			my $x = delete($lei->{qterms}->{$pfx}) or next;
+			my $star = $end =~ tr/*//d ? '*' : '';
+			my $min_len = ($end // 0) + 0;
+
+			# no wildcards for bool_pfx_external
+			$star = '' if $pfx =~ /\A(dfpre|dfpost|mid)\z/;
+			$pfx = "$plusminus$pfx:";
+			if ($min_len) {
+				push @q, map {
+					my @t = ($pfx.$_.$star);
+					while (length > $min_len) {
+						chop $_;
+						push @t, 'OR', $pfx.$_.$star;
+					}
+					@t;
+				} @$x;
+			} else {
+				push @q, map {
+					my $k = $pfx.$_.$star;
+					$seen{$k}++ ? () : $k
+				} @$x;
+			}
+		}
+	}
+	if ($lei->{opt}->{uri}) {
+		@q = (join('+', map { uri_escape_utf8($_) } @q));
+	} else {
+		@q = (join(' ', @q));
+	}
+	$lei->out(@q, "\n");
+}
+
+sub call { # the "lei patch-to-query" entry point
+	my ($cls, $lei, $input) = @_;
+	my $self = $lei->{p2q} = bless {}, $cls;
+	if ($lei->{opt}->{stdin}) {
+		$self->{0} = delete $lei->{0}; # guard from lei_atfork_child
+	} else {
+		$self->{input} = $input;
+	}
+	my $op = $lei->workers_start($self, 'lei patch2query', 1, {
+		'' => [ $lei->{p2q_done} // $lei->can('dclose'), $lei ]
+	});
+	$self->wq_io_do('do_p2q', []);
+	$self->wq_close(1);
+	while ($op && $op->{sock}) { $op->event_step }
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	$lei->lei_atfork_child;
+	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
+	$self->SUPER::ipc_atfork_child;
+}
+
+1;
diff --git a/t/lei-p2q.t b/t/lei-p2q.t
new file mode 100644
index 00000000..1a2c2e4f
--- /dev/null
+++ b/t/lei-p2q.t
@@ -0,0 +1,29 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(json DBD::SQLite Search::Xapian));
+
+test_lei(sub {
+	lei_ok(qw(p2q -w dfpost t/data/0001.patch));
+	is($lei_out, "dfpost:6e006fd73b1d\n", 'pathname');
+	open my $fh, '+<', 't/data/0001.patch';
+	lei_ok([qw(p2q -w dfpost -)], undef, { %$lei_opt, 0 => $fh });
+	is($lei_out, "dfpost:6e006fd73b1d\n", '--stdin');
+
+	lei_ok(qw(p2q --uri t/data/0001.patch -w), 'dfpost,dfn');
+	is($lei_out, "dfpost%3A6e006fd73b1d+".
+		"dfn%3Alib%2FPublicInbox%2FSearch.pm\n",
+		'--uri -w dfpost,dfn');
+	lei_ok(qw(p2q t/data/0001.patch), '--want=dfpost,OR,dfn');
+	is($lei_out, "dfpost:6e006fd73b1d OR dfn:lib/PublicInbox/Search.pm\n",
+		'--want=OR');
+	lei_ok(qw(p2q t/data/0001.patch --want=dfpost9));
+	is($lei_out, "dfpost:6e006fd73b1d OR " .
+			"dfpost:6e006fd73b1 OR " .
+			"dfpost:6e006fd73b OR " .
+			"dfpost:6e006fd73\n",
+		'3-byte chop');
+});
+done_testing;

^ permalink raw reply related	[relevance 51%]

* Re: [PATCH 1/3] lei p2q: patch-to-query generator for "lei q --stdin"
  2021-02-28 12:25 51% ` [PATCH 1/3] lei p2q: patch-to-query generator for "lei q --stdin" Eric Wong
@ 2021-02-28 21:40 90%   ` Kyle Meyer
  2021-03-01  5:47 58%     ` [PATCH 4/3] lei p2q: fix /dev/null filenames, fix phrase quoting rules Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-28 21:40 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Instead of teaching the to-be-implemented "lei show" to search
> threads/messages based on commits, this orthogonal sub-command is
> designed to generate queries for use with "lei q --stdin".
>
> URI-escaped query parameters may be generated with --uri for
> HTTP(S) public-inbox instances, but otherwise the output is
> designed for "lei q --stdin".
>
> To find threads for a given git commit from a git worktree:
>
> 	lei p2q $COMMIT_OID | lei q --stdin -t ...
>
> It can also read via --stdin|-
>
> 	curl $INBOX_URL/$MSGID/raw | lei p2q - | lei q --stdin -t
>
> Or from the filesystem:
>
> 	lei p2q $(git format-patch -1) | lei q --stdin -t

Very nice :)

> diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
[...]
> +		} elsif (m!^(?:---|\+{3}) ("?[^/]+/.+)!) {
> +			my $fn = (split(m!/!, git_unquote($1.''), 2))[1];
> +			push @{$lei->{qterms}->{dfn}}, xphrase($fn);
> +		} elsif ($in_diff && s/^\+//) { # diff added
> +			push @{$lei->{qterms}->{dfb}}, xphrase($_);
> +		} elsif ($in_diff && s/^-//) { # diff removed
> +			push @{$lei->{qterms}->{dfa}}, xphrase($_);

I noticed an unexpected term when trying dfa:

  $ curl -fSs \
    https://public-inbox.org/meta/20210228122528.18552-2-e@80x24.org/raw >msg
  $ lei p2q --want=dfa msg
  dfa:my @WQ_KEYS = qw dfa:"lxs l2m imp mrr cnv" dfa:"internal workers" dfa:dev/null

So I think the upstream "--- " filename regexp needs to be adjusted to
account for "/dev/null".

^ permalink raw reply	[relevance 90%]

* [PATCH 4/3] lei p2q: fix /dev/null filenames, fix phrase quoting rules
  2021-02-28 21:40 90%   ` Kyle Meyer
@ 2021-03-01  5:47 58%     ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-01  5:47 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> I noticed an unexpected term when trying dfa:
> 
>   $ curl -fSs \
>     https://public-inbox.org/meta/20210228122528.18552-2-e@80x24.org/raw >msg
>   $ lei p2q --want=dfa msg
>   dfa:my @WQ_KEYS = qw dfa:"lxs l2m imp mrr cnv" dfa:"internal workers" dfa:dev/null
> 
> So I think the upstream "--- " filename regexp needs to be adjusted to
> account for "/dev/null".

Thanks.  Also, "my @WQ_KEYS = qw" needs to be quoted, at least
(and maybe '(' and ')', too; I need to check Xapian more closely)...

And I'll have to fix them in SearchIdx (and probably switch to
using a common parser for indexing + term generation).

On a side note: I find myself mega-confused using public-inbox
patches as test data.  I thought Perl was choking and spitting
code back out at me :x

---8<---
Subject: [PATCH] lei p2q: fix /dev/null filenames, fix phrase quoting rules

/dev/null mis-handling was reported by Kyle Meyer.
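
A standalone sketch of the filename handling, for illustration: it reuses the $FN pattern from the diff below, but diff_header_filename is a made-up wrapper, and git_unquote is left out, so quoted filenames are not handled here.

```perl
use strict;
use warnings;

# same pattern as $FN in the patch: "a/path"-style names or /dev/null
my $FN = qr!((?:"?[^/\n]+/[^\r\n]+)|/dev/null)!;

# Return the filename from a "--- " or "+++ " diff header line with
# its leading "a/" or "b/" component stripped, or undef when the line
# is not such a header or names /dev/null.
sub diff_header_filename {
	my ($line) = @_;
	$line =~ m!\A(?:---|\+{3}) $FN! or return;
	my $fn = $1;
	return if $fn eq '/dev/null'; # new or deleted file, nothing to index
	my (undef, $rest) = split(m!/!, $fn, 2);
	return $rest;
}

for my $l ('--- /dev/null', '+++ b/lib/PublicInbox/LeiP2q.pm') {
	my $fn = diff_header_filename($l);
	print "$l => ", $fn // '(skipped)', "\n";
}
```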

Phrase quoting rules are also refined to avoid leaving spaces
unquoted when "phrase generator" characters exist.  Also,
context-free hunk headers no longer clobber the in_diff
state of the parser, since git can still generate those.
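
The refined quoting rule, pulled out of xphrase as a sketch: quote_if_needed is a hypothetical name, and only the character class is taken from the diff below.

```perl
use strict;
use warnings;

# Wrap a candidate phrase in double quotes when it contains anything
# other than word characters, '-', or the Xapian "phrase generator"
# characters . / : \ @ (so spaces, '=', ',' and so on force quoting).
sub quote_if_needed {
	my ($s) = @_;
	$s =~ s/\A\s+//;
	$s =~ s/\s+\z//;
	return $s =~ m![^\./:\\\@\-\w]! ? qq("$s") : $s;
}

print quote_if_needed('my @WQ_KEYS = qw'), "\n";
print quote_if_needed('lib/PublicInbox/Search.pm'), "\n";
```

Under the old rule, the first string stayed unquoted because it contains '@'; here the spaces alone are enough to force quoting.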

Link: https://public-inbox.org/meta/87k0qrrhve.fsf@kyleam.com/
---
 lib/PublicInbox/LeiP2q.pm | 10 +++++++---
 t/lei-p2q.t               |  3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index d1dd125e..e7ddc852 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -12,6 +12,7 @@ use PublicInbox::MsgIter qw(msg_part_text);
 use PublicInbox::Git qw(git_unquote);
 use PublicInbox::Spawn qw(popen_rd);
 use URI::Escape qw(uri_escape_utf8);
+my $FN = qr!((?:"?[^/\n]+/[^\r\n]+)|/dev/null)!;
 
 sub xphrase ($) {
 	my ($s) = @_;
@@ -23,7 +24,7 @@ sub xphrase ($) {
 	map {
 		s/\A\s*//;
 		s/\s+\z//;
-		/[\|=><,\sA-Z]/ && !m![\./:\\\@]! ? qq("$_") : $_;
+		m![^\./:\\\@\-\w]! ? qq("$_") : $_ ;
 	} ($s =~ m!(\w[\|=><,\./:\\\@\-\w\s]+)!g);
 }
 
@@ -40,7 +41,7 @@ sub extract_terms { # eml->each_part callback
 			push @{$lei->{qterms}->{dfctx}}, xphrase($_);
 		} elsif (/^-- $/) { # email signature begins
 			$in_diff = undef;
-		} elsif (m!^diff --git "?[^/]+/.+ "?[^/]+/.+\z!) {
+		} elsif (m!^diff --git $FN $FN!) {
 			# wait until "---" and "+++" to capture filenames
 			$in_diff = 1;
 		} elsif (/^index ([a-f0-9]+)\.\.([a-f0-9]+)\b/) {
@@ -48,13 +49,16 @@ sub extract_terms { # eml->each_part callback
 			push @{$lei->{qterms}->{dfpre}}, $oa;
 			push @{$lei->{qterms}->{dfpost}}, $ob;
 			# who uses dfblob?
-		} elsif (m!^(?:---|\+{3}) ("?[^/]+/.+)!) {
+		} elsif (m!^(?:---|\+{3}) ($FN)!) {
+			next if $1 eq '/dev/null';
 			my $fn = (split(m!/!, git_unquote($1.''), 2))[1];
 			push @{$lei->{qterms}->{dfn}}, xphrase($fn);
 		} elsif ($in_diff && s/^\+//) { # diff added
 			push @{$lei->{qterms}->{dfb}}, xphrase($_);
 		} elsif ($in_diff && s/^-//) { # diff removed
 			push @{$lei->{qterms}->{dfa}}, xphrase($_);
+		} elsif (/^@@ (?:\S+) (?:\S+) @@\s*$/) {
+			# traditional diff w/o -p
 		} elsif (/^@@ (?:\S+) (?:\S+) @@\s*(\S+.*)/) {
 			push @{$lei->{qterms}->{dfhh}}, xphrase($1);
 		} elsif (/^(?:dis)similarity index/ ||
diff --git a/t/lei-p2q.t b/t/lei-p2q.t
index 1a2c2e4f..87cf9fa7 100644
--- a/t/lei-p2q.t
+++ b/t/lei-p2q.t
@@ -25,5 +25,8 @@ test_lei(sub {
 			"dfpost:6e006fd73b OR " .
 			"dfpost:6e006fd73\n",
 		'3-byte chop');
+
+	lei_ok(qw(p2q t/data/message_embed.eml --want=dfb));
+	like($lei_out, qr/\bdfb:\S+/, 'got dfb off /dev/null file');
 });
 done_testing;

^ permalink raw reply related	[relevance 58%]

* Re: lei: per-message keywords and externals
  2021-02-26  9:26 71% ` Eric Wong
@ 2021-03-02  9:28 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-02  9:28 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> Eric Wong <e@80x24.org> wrote:
> > Something I've been pondering for a bit is how to handle
> > keywords (Seen, Important, Replied, ...) for messages stored in
> > externals.
> > 
> > I want "kw:" prefix to be a usable search term, like:
> > 
> > 	lei q something interesting kw:seen
> > 	lei q something interesting NOT kw:seen
> > 
> > This is no problem for imported messages in ~/.local/share/lei/store.
> > All the keyword info is stored in line with the rest of the
> > Xapian index data.
> > 
> > But, I also don't want to be wasting users' space by duplicating
> > index data if they're already hosting inboxes for public
> > consumption.  So, it's looking like parsing out kw: ourselves
> > and doing extra filtering on our end when externals are in play is
> > going to be a requirement...
> 
> Something I considered a few weeks ago and decided against, but
> am now coming around to again, is indexing just the overview header
> info in lei/store.  In other words:
> 
> 	$sto->set_eml($eml->header_obj, @kw)
> 
> instead of:
> 
> 	$sto->set_eml($eml, @kw)
> 
> > Or, just don't support searching using "kw:" with externals, for
> > now; but still stash keywords somewhere when writing to
> > traditional mail stores.
> 
> Maybe it'll be another instance of LeiStore in a separate dir
> for external keywords: ~/.local/share/lei/xkw-store

I'm leaning that way.  For deduplication purposes (that is:
merging keywords from cross-posted messages), OID will be
indexed as a boolean term for repeat lookups (along with
Message-ID).  I'm not 100% sure if I want this to be SQLite or
Xapian, yet.  Leaning towards Xapian since that would give us
more flexibility w.r.t. keyword searches and also let us do
filtering on common headers without too much cost.
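
To make the deduplication idea concrete, here is a toy sketch where a plain hash stands in for whichever backend (SQLite or Xapian) ends up being used.  merge_keywords, the hash layout, and the OID string are all assumptions for illustration, not lei code.

```perl
use strict;
use warnings;
use v5.10.1;

# $xkw: hash ref standing in for the proposed external keyword store,
# keyed by git blob OID.  Returns the merged, sorted keyword list.
sub merge_keywords {
	my ($xkw, $oid, @kw) = @_;
	my $cur = $xkw->{$oid} //= {};
	$cur->{$_} = 1 for @kw;
	return [ sort keys %$cur ];
}

my %xkw;
my $oid = 'd1dd125e'; # made-up abbreviated OID
merge_keywords(\%xkw, $oid, 'seen'); # copy seen via one inbox
my $kw = merge_keywords(\%xkw, $oid, qw(flagged seen)); # cross-post
print "@$kw\n"; # prints "flagged seen"
```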

^ permalink raw reply	[relevance 71%]

* read-write JMAP for lei?
@ 2021-03-02 23:04 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-02 23:04 UTC (permalink / raw)
  To: meta

I'm already planning on supporting read-only JMAP in PublicInbox::WWW;
but lei may be able to run via .cgi/PSGI for read-write JMAP
off individual user accounts...

^ permalink raw reply	[relevance 71%]

* should lei attempt to index mail outside of git?
@ 2021-03-03  3:53 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-03  3:53 UTC (permalink / raw)
  To: meta

Currently, every mail lei indexes has a git blob associated with it.

I understand some folks might want to keep using their existing
storage and not have a redundant, expensive-to-erase copy of the
mail in git; but just want an indexing-only solution like mairix.

So, is this a feature worth implementing?

^ permalink raw reply	[relevance 71%]

* [PATCH 0/4] lei q: avoiding accidental data loss
@ 2021-03-03 13:48 71% Eric Wong
  2021-03-03 13:48 55% ` [PATCH 3/4] lei: use maildir_each_eml in more places Eric Wong
  2021-03-03 13:48 33% ` [PATCH 4/4] lei q: import flags when clobbering/augmenting Maildirs Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-03-03 13:48 UTC (permalink / raw)
  To: meta

"lei q" mimicking mairix(1) could cause some grief if somebody
accidentally sets the output destination to one containing
precious mail or keyword changes.  Start by stashing keyword
changes into our store during the augment/unlink phase.

Eric Wong (4):
  eml: each_part: document IMAP user of the $all parameter
  lei_xsearch: add_eml for remote mboxrd, not set_eml
  lei: use maildir_each_eml in more places
  lei q: import flags when clobbering/augmenting Maildirs

 MANIFEST                        |  1 +
 lib/PublicInbox/Eml.pm          |  1 +
 lib/PublicInbox/ExtSearchIdx.pm |  1 +
 lib/PublicInbox/LEI.pm          |  2 +-
 lib/PublicInbox/LeiConvert.pm   |  3 +--
 lib/PublicInbox/LeiQuery.pm     |  5 +++-
 lib/PublicInbox/LeiSearch.pm    | 47 +++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiStore.pm     | 27 +++++++++----------
 lib/PublicInbox/LeiToMail.pm    | 45 ++++++++++++++++++++-----------
 lib/PublicInbox/LeiXSearch.pm   |  2 +-
 lib/PublicInbox/MdirReader.pm   | 10 ++++---
 t/lei-convert.t                 |  2 +-
 t/lei-q-kw.t                    | 33 +++++++++++++++++++++++
 t/lei.t                         |  3 ++-
 t/lei_store.t                   | 10 ++++++-
 15 files changed, 149 insertions(+), 43 deletions(-)
 create mode 100644 t/lei-q-kw.t


^ permalink raw reply	[relevance 71%]

* [PATCH 3/4] lei: use maildir_each_eml in more places
  2021-03-03 13:48 71% [PATCH 0/4] lei q: avoiding accidental data loss Eric Wong
@ 2021-03-03 13:48 55% ` Eric Wong
  2021-03-03 13:48 33% ` [PATCH 4/4] lei q: import flags when clobbering/augmenting Maildirs Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-03-03 13:48 UTC (permalink / raw)
  To: meta

This saves us some code and redundant callsites for
eml_from_path.  We'll change maildir_each_eml to include the
filename to facilitate an upcoming change to "lei q" without
--augment
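
A sketch of what a callback looks like under the new argument order ($f, $kw, $eml, @arg).  It does not load PublicInbox::MdirReader: basename_keywords is a made-up stand-in for the flag parsing, and the letter-to-keyword table is an assumption based on the standard Maildir info flags rather than a copy of %c2kw.

```perl
use strict;
use warnings;
use v5.10.1;

# assumed mapping of Maildir ":2," info flags to keyword names
my %c2kw = (D => 'draft', F => 'flagged', R => 'answered', S => 'seen');

# Return an array ref of keywords for a cur/ basename, or undef if the
# name carries no ":2," info suffix.
sub basename_keywords {
	my ($bn) = @_;
	$bn =~ /:2,([A-Za-z]*)\z/ or return;
	return [ sort(map { $c2kw{$_} // () } split(//, $1)) ];
}

my %kw_by_file;
my $cb = sub { # filename now comes first, then keywords, then the Eml
	my ($f, $kw, $eml, @arg) = @_;
	$kw_by_file{$f} = $kw;
};

for my $bn ('1614779000.M1P2.host:2,RS', '1614779001.M3P4.host:2,') {
	my $kw = basename_keywords($bn) // next;
	$cb->("cur/$bn", $kw, undef); # no Eml object in this sketch
}
print "$_ => [@{$kw_by_file{$_}}]\n" for sort keys %kw_by_file;
```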
---
 lib/PublicInbox/LeiConvert.pm |  3 +--
 lib/PublicInbox/LeiToMail.pm  | 18 ++++++------------
 lib/PublicInbox/MdirReader.pm | 10 ++++++----
 t/lei-convert.t               |  2 +-
 4 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 4c0bbd88..0c705ba4 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -7,7 +7,6 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
-use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::LeiStore;
 use PublicInbox::LeiOverview;
 
@@ -24,7 +23,7 @@ sub net_cb { # callback for ->imap_each, ->nntp_each
 }
 
 sub mdir_cb {
-	my ($kw, $eml, $self) = @_;
+	my ($f, $kw, $eml, $self) = @_;
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index de640657..31b8aba8 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -19,7 +19,6 @@ use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
 use Errno qw(EEXIST ESPIPE ENOENT EPIPE);
 use Digest::SHA qw(sha256_hex);
-my ($maildir_each_file);
 
 # struggles with short-lived repos, Gcf2Client makes little sense with lei;
 # but we may use in-process libgit2 in the future.
@@ -268,8 +267,8 @@ sub _mbox_write_cb ($$) {
 	}
 }
 
-sub _augment_file { # maildir_each_file cb
-	my ($f, $lei, $mod, $shard) = @_;
+sub _augment_file { # maildir_each_eml cb
+	my ($f, undef, $eml, $lei, $mod, $shard) = @_;
 	if ($mod) {
 		# can't get dirent.d_ino w/ pure Perl, so we extract the OID
 		# if it looks like one:
@@ -278,7 +277,6 @@ sub _augment_file { # maildir_each_file cb
 		my $recno = hex(substr($hex, 0, 8));
 		return if ($recno % $mod) != $shard;
 	}
-	my $eml = PublicInbox::InboxWritable::eml_from_path($f) or return;
 	_augment($eml, $lei);
 }
 
@@ -375,12 +373,7 @@ sub new {
 	my $dst = $lei->{ovv}->{dst};
 	my $self = bless {}, $cls;
 	if ($fmt eq 'maildir') {
-		$maildir_each_file //= do {
-			require PublicInbox::MdirReader;
-			PublicInbox::MdirReader->can('maildir_each_file');
-		};
-		$lei->{opt}->{augment} and
-			require PublicInbox::InboxWritable; # eml_from_path
+		require PublicInbox::MdirReader;
 		$self->{base_type} = 'maildir';
 		-e $dst && !-d _ and die
 				"$dst exists and is not a directory\n";
@@ -430,12 +423,13 @@ sub _do_augment_maildir {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
 			my ($mod, $shard) = @{$self->{shard_info} // []};
-			$maildir_each_file->($dst, \&_augment_file,
+			PublicInbox::MdirReader::maildir_each_eml($dst,
+						\&_augment_file,
 						$lei, $mod, $shard);
 			$dedupe->pause_dedupe;
 		}
 	} else { # clobber existing Maildir
-		$maildir_each_file->($dst, \&_unlink);
+		PublicInbox::MdirReader::maildir_each_file($dst, \&_unlink);
 	}
 }
 
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index 5fa534f5..44724af1 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -48,17 +48,19 @@ sub maildir_each_eml ($$;@) {
 			next if substr($bn, 0, 1) eq '.';
 			my @f = split(/:/, $bn, -1);
 			next if scalar(@f) != 1;
-			my $eml = eml_from_path($pfx.$bn) or next;
-			$cb->([], $eml, @arg);
+			my $f = $pfx.$bn;
+			my $eml = eml_from_path($f) or next;
+			$cb->($f, [], $eml, @arg);
 		}
 	}
 	$pfx = "$dir/cur/";
 	opendir my $dh, $pfx or return;
 	while (defined(my $bn = readdir($dh))) {
 		my $fl = maildir_basename_flags($bn) // next;
-		my $eml = eml_from_path($pfx.$bn) or next;
+		my $f = $pfx.$bn;
+		my $eml = eml_from_path($f) or next;
 		my @kw = sort(map { $c2kw{$_} // () } split(//, $fl));
-		$cb->(\@kw, $eml, @arg);
+		$cb->($f, \@kw, $eml, @arg);
 	}
 }
 
diff --git a/t/lei-convert.t b/t/lei-convert.t
index 20099f65..186cfb13 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -58,7 +58,7 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	ok(-d "$d/md", 'Maildir created');
 	my @md;
 	PublicInbox::MdirReader::maildir_each_eml("$d/md", sub {
-		push @md, $_[1];
+		push @md, $_[2];
 	});
 	is(scalar(@md), scalar(@mboxrd), 'got expected emails in Maildir');
 	@md = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @md;

^ permalink raw reply related	[relevance 55%]

* [PATCH 4/4] lei q: import flags when clobbering/augmenting Maildirs
  2021-03-03 13:48 71% [PATCH 0/4] lei q: avoiding accidental data loss Eric Wong
  2021-03-03 13:48 55% ` [PATCH 3/4] lei: use maildir_each_eml in more places Eric Wong
@ 2021-03-03 13:48 33% ` Eric Wong
  2021-03-03 22:29 71%   ` RFH: --import-augment naming [was: lei q: import flags when clobbering/augmenting] Eric Wong
  1 sibling, 1 reply; 200+ results
From: Eric Wong @ 2021-03-03 13:48 UTC (permalink / raw)
  To: meta

This will eventually be supported for other mail stores,
but Maildir is the easiest to test and support, here.

This lets us avoid a situation where flag changes get
lost between search results.
---
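For readers skimming the diff, here is a minimal standalone sketch of
the keyword comparison this patch relies on.  It assumes only the
Maildir flag mapping (%c2kw) shown in LeiStore.pm; kw_from_basename
and kw_differs are hypothetical names for illustration, not functions
from the tree.  The tri-state return (undef, 0, 1) mirrors what the
new t/lei_store.t assertions expect from kw_changed.

```perl
use strict;
use warnings;

# Maildir info flags => keywords, same mapping as %c2kw in the patch
my %c2kw = (D => 'draft', F => 'flagged', R => 'answered', S => 'seen');

# hypothetical helper: sorted keywords from a Maildir basename,
# undef if the name has no ":2," info suffix
sub kw_from_basename {
	my ($bn) = @_;
	$bn =~ /:2,([A-Za-z]*)\z/ or return;
	[ sort(map { $c2kw{$_} // () } split(//, $1)) ];
}

# hypothetical stand-in for kw_changed: $stored is the sorted keyword
# list already known for the message, or undef if it is unknown
sub kw_differs {
	my ($stored, $new_kw_sorted) = @_;
	return unless defined $stored; # unknown message
	join("\0", @$new_kw_sorted) eq join("\0", @$stored) ? 0 : 1;
}

my $kw = kw_from_basename('1614779280.M1P2.host:2,RS');
print join(',', @$kw), "\n"; # answered,seen
print kw_differs([qw(answered seen)], $kw), "\n"; # 0
print kw_differs(['seen'], $kw), "\n"; # 1
```

Joining on "\0" keeps the comparison to a single string equality,
which is only safe because both lists are sorted beforehand.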
 MANIFEST                        |  1 +
 lib/PublicInbox/ExtSearchIdx.pm |  1 +
 lib/PublicInbox/LEI.pm          |  2 +-
 lib/PublicInbox/LeiQuery.pm     |  5 +++-
 lib/PublicInbox/LeiSearch.pm    | 47 +++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiStore.pm     | 27 +++++++++----------
 lib/PublicInbox/LeiToMail.pm    | 33 ++++++++++++++++++-----
 lib/PublicInbox/LeiXSearch.pm   |  2 +-
 t/lei-q-kw.t                    | 33 +++++++++++++++++++++++
 t/lei.t                         |  3 ++-
 t/lei_store.t                   | 10 ++++++-
 11 files changed, 137 insertions(+), 27 deletions(-)
 create mode 100644 t/lei-q-kw.t

diff --git a/MANIFEST b/MANIFEST
index 5044e21c..8c9c86a0 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -375,6 +375,7 @@ t/lei-import-nntp.t
 t/lei-import.t
 t/lei-mirror.t
 t/lei-p2q.t
+t/lei-q-kw.t
 t/lei-q-remote-import.t
 t/lei-q-thread.t
 t/lei.t
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index d0c9c2f7..a17e7579 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -1128,5 +1128,6 @@ no warnings 'once';
 *atfork_child = \&PublicInbox::V2Writable::atfork_child;
 *idx_shard = \&PublicInbox::V2Writable::idx_shard;
 *reindex_checkpoint = \&PublicInbox::V2Writable::reindex_checkpoint;
+*checkpoint = \&PublicInbox::V2Writable::checkpoint;
 
 1;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 834e399f..1e5b04ca 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -113,7 +113,7 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g augment|a
-	import-remote! lock=s@
+	import-remote! import-augment! lock=s@
 	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index b57d1cc5..c630d628 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -51,7 +51,10 @@ sub lei_q {
 	# we'll allow "--only $LOCATION --local"
 	my $sto = $self->_lei_store(1);
 	my $lse = $sto->search;
-	$sto->write_prepare($self) if $opt->{'import-remote'} //= 1;
+	if (($opt->{'import-remote'} //= 1) |
+			($opt->{'import-augment'} //= 1)) {
+		$sto->write_prepare($self);
+	}
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
 		$lxs->prepare_external($lse);
 	}
diff --git a/lib/PublicInbox/LeiSearch.pm b/lib/PublicInbox/LeiSearch.pm
index 440bacf5..ceb3624b 100644
--- a/lib/PublicInbox/LeiSearch.pm
+++ b/lib/PublicInbox/LeiSearch.pm
@@ -1,11 +1,14 @@
 # Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
+# read-only counterpart for PublicInbox::LeiStore
 package PublicInbox::LeiSearch;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::ExtSearch);
 use PublicInbox::Search qw(xap_terms);
+use PublicInbox::ContentHash qw(content_digest content_hash);
+use PublicInbox::MID qw(mids mids_in);
 
 # get combined docid from over.num:
 # (not generic Xapian, only works with our sharding scheme)
@@ -24,4 +27,48 @@ sub msg_keywords {
 	wantarray ? sort(keys(%$kw)) : $kw;
 }
 
+# when a message has no Message-IDs at all, this is needed for
+# unsent Draft messages, at least
+sub content_key ($) {
+	my ($eml) = @_;
+	my $dig = content_digest($eml);
+	my $chash = $dig->clone->digest;
+	my $mids = mids_in($eml,
+			qw(Message-ID X-Alt-Message-ID Resent-Message-ID));
+	unless (@$mids) {
+		$eml->{-lei_fake_mid} = $mids->[0] =
+				PublicInbox::Import::digest2mid($dig, $eml);
+	}
+	($chash, $mids);
+}
+
+sub _cmp_1st { # git->cat_async callback
+	my ($bref, $oid, $type, $size, $cmp) = @_; # cmp: [chash, found, smsg]
+	return if defined($cmp->[1]->[0]); # $found->[0]
+	if (content_hash(PublicInbox::Eml->new($bref)) eq $cmp->[0]) {
+		push @{$cmp->[1]}, $cmp->[2]->{num};
+	}
+}
+
+# returns true if $eml is indexed by lei/store and keywords don't match
+sub kw_changed {
+	my ($self, $eml, $new_kw_sorted) = @_;
+	my ($chash, $mids) = content_key($eml);
+	my $over = $self->over;
+	my $git = $self->git;
+	my $found = [];
+	for my $mid (@$mids) {
+		my ($id, $prev);
+		while (my $cur = $over->next_by_mid($mid, \$id, \$prev)) {
+			$git->cat_async($cur->{blob}, \&_cmp_1st,
+					[ $chash, $found, $cur ]);
+			last if scalar(@$found);
+		}
+	}
+	$git->cat_async_wait;
+	my $num = $found->[0] // return;
+	my @cur_kw = msg_keywords($self, $num);
+	join("\0", @$new_kw_sorted) eq join("\0", @cur_kw) ? 0 : 1;
+}
+
 1;
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 77601828..92c29100 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -14,8 +14,8 @@ use PublicInbox::ExtSearchIdx;
 use PublicInbox::Import;
 use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::V2Writable;
-use PublicInbox::ContentHash qw(content_hash content_digest);
-use PublicInbox::MID qw(mids mids_in);
+use PublicInbox::ContentHash qw(content_hash);
+use PublicInbox::MID qw(mids);
 use PublicInbox::LeiSearch;
 use PublicInbox::MDA;
 use List::Util qw(max);
@@ -104,25 +104,13 @@ sub eidx_init {
 	$eidx;
 }
 
-# when a message has no Message-IDs at all, this is needed for
-# unsent Draft messages, at least
-sub _fake_mid_for ($$) {
-	my ($eml, $dig) = @_;
-	my $mids = mids_in($eml, qw(X-Alt-Message-ID Resent-Message-ID));
-	$eml->{-lei_fake_mid} =
-		$mids->[0] // PublicInbox::Import::digest2mid($dig, $eml);
-}
-
 sub _docids_for ($$) {
 	my ($self, $eml) = @_;
 	my %docids;
-	my $dig = content_digest($eml);
-	my $chash = $dig->clone->digest;
+	my ($chash, $mids) = PublicInbox::LeiSearch::content_key($eml);
 	my $eidx = eidx_init($self);
 	my $oidx = $eidx->{oidx};
 	my $im = $self->{im};
-	my $mids = mids($eml);
-	$mids->[0] //= _fake_mid_for($eml, $dig);
 	for my $mid (@$mids) {
 		my ($id, $prev);
 		while (my $cur = $oidx->next_by_mid($mid, \$id, \$prev)) {
@@ -183,6 +171,7 @@ sub mbox_keywords {
 	sort(keys %kw);
 }
 
+# TODO: move this to MdirReader, maybe...
 # cf: https://cr.yp.to/proto/maildir.html
 my %c2kw = ('D' => 'draft', F => 'flagged', R => 'answered', S => 'seen');
 sub maildir_keywords {
@@ -230,6 +219,14 @@ sub set_eml_from_maildir {
 	set_eml($self, $eml, $set_kw ? maildir_keywords($f) : ());
 }
 
+sub checkpoint {
+	my ($self, $wait) = @_;
+	if (my $im = $self->{im}) {
+		$wait ? $im->barrier : $im->checkpoint;
+	}
+	$self->{priv_eidx}->checkpoint($wait);
+}
+
 sub done {
 	my ($self) = @_;
 	my $err = '';
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 31b8aba8..3420b06e 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -267,8 +267,8 @@ sub _mbox_write_cb ($$) {
 	}
 }
 
-sub _augment_file { # maildir_each_eml cb
-	my ($f, undef, $eml, $lei, $mod, $shard) = @_;
+sub _augment_or_unlink { # maildir_each_eml cb
+	my ($f, $kw, $eml, $lei, $lse, $mod, $shard, $unlink) = @_;
 	if ($mod) {
 		# can't get dirent.d_ino w/ pure Perl, so we extract the OID
 		# if it looks like one:
@@ -276,8 +276,16 @@ sub _augment_file { # maildir_each_eml cb
 				$1 : sha256_hex($f);
 		my $recno = hex(substr($hex, 0, 8));
 		return if ($recno % $mod) != $shard;
+		if ($lse) {
+			my $x = $lse->kw_changed($eml, $kw);
+			if ($x) {
+				$lei->{sto}->ipc_do('set_eml', $eml, @$kw);
+			} elsif (!defined($x)) {
+				# TODO: xkw
+			}
+		}
 	}
-	_augment($eml, $lei);
+	$unlink ? unlink($f) : _augment($eml, $lei);
 }
 
 # maildir_each_file callback, \&CORE::unlink doesn't work with it
@@ -419,20 +427,31 @@ sub _pre_augment_maildir {
 sub _do_augment_maildir {
 	my ($self, $lei) = @_;
 	my $dst = $lei->{ovv}->{dst};
+	my $lse = $lei->{sto}->search if $lei->{opt}->{'import-augment'};
+	my ($mod, $shard) = @{$self->{shard_info} // []};
 	if ($lei->{opt}->{augment}) {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
-			my ($mod, $shard) = @{$self->{shard_info} // []};
 			PublicInbox::MdirReader::maildir_each_eml($dst,
-						\&_augment_file,
-						$lei, $mod, $shard);
+						\&_augment_or_unlink,
+						$lei, $lse, $mod, $shard);
 			$dedupe->pause_dedupe;
 		}
-	} else { # clobber existing Maildir
+	} elsif ($lse) {
+		PublicInbox::MdirReader::maildir_each_eml($dst,
+					\&_augment_or_unlink,
+					$lei, $lse, $mod, $shard, 1);
+	} else {# clobber existing Maildir
 		PublicInbox::MdirReader::maildir_each_file($dst, \&_unlink);
 	}
 }
 
+sub _post_augment_maildir {
+	my ($self, $lei) = @_;
+	$lei->{opt}->{'import-augment'} or return;
+	my $wait = $lei->{sto}->ipc_do('checkpoint', 1);
+}
+
 sub _augment_imap { # PublicInbox::NetReader::imap_each cb
 	my ($url, $uid, $kw, $eml, $lei) = @_;
 	_augment($eml, $lei);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index dcc48806..45815180 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -204,7 +204,7 @@ sub query_mset { # non-parallel for non-"--threads" users
 
 sub each_remote_eml { # callback for MboxReader->mboxrd
 	my ($eml, $self, $lei, $each_smsg) = @_;
-	$lei->{sto}->ipc_do('add_eml', $eml) if $lei->{sto}; # --import-remote
+	$lei->{sto}->ipc_do('add_eml', $eml) if $lei->{opt}->{'import-remote'};
 	my $smsg = bless {}, 'PublicInbox::Smsg';
 	$smsg->populate($eml);
 	$smsg->parse_references($eml, mids($eml));
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
new file mode 100644
index 00000000..97b2e08f
--- /dev/null
+++ b/t/lei-q-kw.t
@@ -0,0 +1,33 @@
+#!perl -w
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+test_lei(sub {
+lei_ok(qw(import -F eml t/plack-qp.eml));
+my $o = "$ENV{HOME}/dst";
+lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
+my @fn = glob("$o/cur/*:2,");
+scalar(@fn) == 1 or BAIL_OUT "wrote multiple or zero files: ".explain(\@fn);
+rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
+
+lei_ok(qw(q -o), "maildir:$o", qw(m:bogus-noresults@example.com));
+ok(!glob("$o/cur/*"), 'last result cleared after augment-import');
+
+lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
+@fn = glob("$o/cur/*:2,S");
+is(scalar(@fn), 1, "`seen' flag set on Maildir file");
+
+# ensure --no-import-augment works
+my $n = $fn[0];
+$n =~ s/,S\z/,RS/;
+rename($fn[0], $n) or BAIL_OUT "rename $!";
+lei_ok(qw(q --no-import-augment -o), "maildir:$o",
+	qw(m:bogus-noresults@example.com));
+ok(!glob("$o/cur/*"), '--no-import-augment cleared destination');
+lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
+@fn = glob("$o/cur/*:2,S");
+is(scalar(@fn), 1, "`seen' flag (but not `replied') set on Maildir file");
+
+# TODO: other destination types
+});
+done_testing;
diff --git a/t/lei.t b/t/lei.t
index ba179b39..74a775ca 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -138,7 +138,8 @@ SKIP: {
 	lei(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
 	is($? >> 8, 3, 'got curl exit for bogus URL');
 	lei(qw(q --only http://127.0.0.1:99999/bogus/ t:m -o), "$home/junk");
-	is($? >> 8, 3, 'got curl exit for bogus URL with Maildir');
+	is($? >> 8, 3, 'got curl exit for bogus URL with Maildir') or
+		diag $lei_err;
 	is($lei_out, '', 'no output');
 }; # /SKIP
 };
diff --git a/t/lei_store.t b/t/lei_store.t
index e93fe779..1c3f7841 100644
--- a/t/lei_store.t
+++ b/t/lei_store.t
@@ -124,8 +124,16 @@ SKIP: {
 	$ids = $sto->ipc_do('set_eml', $eml, qw(seen answered));
 	is_deeply($ids, [ $no_mid->{num} ], 'docid returned w/o mid w/o ipc');
 	$wait = $sto->ipc_do('done');
-	@kw = $sto->search->msg_keywords($no_mid->{num});
+
+	my $lse = $sto->search;
+	@kw = $lse->msg_keywords($no_mid->{num});
 	is_deeply(\@kw, [qw(answered seen)], 'set changed kw w/o ipc');
+	is($lse->kw_changed($eml, [qw(answered seen)]), 0,
+		'kw_changed false when unchanged');
+	is($lse->kw_changed($eml, [qw(answered seen flagged)]), 1,
+		'kw_changed true when +flagged');
+	is($lse->kw_changed(eml_load('t/plack-qp.eml'), ['seen']), undef,
+		'kw_changed undef on unknown message');
 }
 
 done_testing;

^ permalink raw reply related	[relevance 33%]

* RFH: --import-augment naming [was: lei q: import flags when clobbering/augmenting]
  2021-03-03 13:48 33% ` [PATCH 4/4] lei q: import flags when clobbering/augmenting Maildirs Eric Wong
@ 2021-03-03 22:29 71%   ` Eric Wong
  2021-03-04  2:39 71%     ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-03-03 22:29 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> This will eventually be supported for other mail stores,
> but Maildir is the easiest to test and support, here.
> 
> This lets us avoid a situation where flag changes get
> lost between search results.

> --- a/lib/PublicInbox/LEI.pm
> +++ b/lib/PublicInbox/LEI.pm
> @@ -113,7 +113,7 @@ our %CMD = ( # sorted in order of importance/use:
>  	qw(save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+
>  	sort|s=s reverse|r offset=i remote! local! external! pretty
>  	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g augment|a
> -	import-remote! lock=s@
> +	import-remote! import-augment! lock=s@

--import-augment is the wrong name for this option,
since the import happens even when --augment isn't specified.

How about "--import-before" ?

^ permalink raw reply	[relevance 71%]

* Re: RFH: --import-augment naming [was: lei q: import flags when clobbering/augmenting]
  2021-03-03 22:29 71%   ` RFH: --import-augment naming [was: lei q: import flags when clobbering/augmenting] Eric Wong
@ 2021-03-04  2:39 71%     ` Kyle Meyer
  2021-03-04  3:31 71%       ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-03-04  2:39 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> --import-augment is the wrong name for this option,
> since the import happens even when --augment isn't specified.
>
> How about "--import-before" ?

I don't have a good understanding of the internals, but fwiw that sounds
fine to me.  Given your description of "[stash] keyword changes" and
"import flags", --stash-keywords or --import-keywords came to mind, but
perhaps those aren't quite accurate.

^ permalink raw reply	[relevance 71%]

* Re: RFH: --import-augment naming [was: lei q: import flags when clobbering/augmenting]
  2021-03-04  2:39 71%     ` Kyle Meyer
@ 2021-03-04  3:31 71%       ` Eric Wong
  2021-03-04  4:10 71%         ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-03-04  3:31 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > --import-augment is the wrong name for this option,
> > since the import happens even when --augment isn't specified.
> >
> > How about "--import-before" ?
> 
> I don't have a good understanding of the internals, but fwiw that sounds
> fine to me.  Given your description of "[stash] keyword changes" and
> "import flags", --stash-keywords or --import-keywords came to mind, but
> perhaps those aren't quite accurate.

Oh, I forgot to note it will probably import more than just
keywords (but maybe it can be tweaked(*)).

The thing I want to protect against is somebody forgetting
--augment when using "lei q -o imaps://example.com/INBOX ..."
which would delete mail that hasn't been imported to lei or
backed-up by another tool.

Causing data loss in the above scenario would be a nightmare,
even if it's technically user error.

There are also cases where someone will want to edit a patch in
the search results mailbox before applying it (e.g. adding
Acked-by, fixing whitespace, trivial errors, etc...) and
it might be good to preserve a copy of the edited message.

(*) possible directions:

	--import-$FOO=kw-only
	--import-$FOO=not-in-git
	--import-$FOO  # same as "not-in-git", this should be the default
	--import-$FOO=none / --no-import-$FOO
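
A rough sketch of how the directions above might be parsed with
Getopt::Long, purely as an illustration: "before" stands in for $FOO,
parse_import_before is a hypothetical name, and none of this is how
lei parses its options today.  The ":s" spec makes the value
optional, so a bare flag falls back to the proposed "not-in-git"
default.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical parser; "before" stands in for $FOO
sub parse_import_before {
	my (@argv) = @_;
	my $mode = 'not-in-git'; # proposed default
	GetOptionsFromArray(\@argv,
		# ":s" makes the value optional: a bare flag yields ''
		'import-before:s' => sub {
			$mode = $_[1] eq '' ? 'not-in-git' : $_[1];
		},
		'no-import-before' => sub { $mode = 'none' },
	) or die "bad options\n";
	$mode =~ /\A(?:kw-only|not-in-git|none)\z/ or
		die "bad --import-before mode: $mode\n";
	$mode;
}

print parse_import_before(), "\n"; # not-in-git
print parse_import_before('--import-before=kw-only'), "\n"; # kw-only
print parse_import_before('--no-import-before'), "\n"; # none
```

One caveat with an optional value: a following non-option word would
be taken as the value, so the "=" form is the safer one to document.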

^ permalink raw reply	[relevance 71%]

* Re: RFH: --import-augment naming [was: lei q: import flags when clobbering/augmenting]
  2021-03-04  3:31 71%       ` Eric Wong
@ 2021-03-04  4:10 71%         ` Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-03-04  4:10 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Oh, I forgot to note it will probably import more than just
> keywords (but maybe it can be tweaked(*)).
>
> The thing I want to protect against is somebody forgetting
> --augment when using "lei q -o imaps://example.com/INBOX ..."
> which would delete mail that hasn't been imported to lei or
> backed-up by another tool.
>
> Causing data loss in the above scenario would be a nightmare,
> even if it's technically user error.

Oy, indeed.

> There are also cases where someone will want to edit a patch in
> the search results mailbox before applying it (e.g. adding
> Acked-by, fixing whitespace, trivial errors, etc...) and
> it might be good to preserve a copy of the edited message.
>
> (*) possible directions:
>
> 	--import-$FOO=kw-only
> 	--import-$FOO=not-in-git
> 	--import-$FOO  # same as "not-in-git", this should be the default
> 	--import-$FOO=none / --no-import-$FOO

Makes sense.  Your suggested "before" seems like a good choice for $FOO.

^ permalink raw reply	[relevance 71%]

* [PATCH 0/6] lei q --import-augment => --import-before; mbox + IMAP
@ 2021-03-04  9:03 71% Eric Wong
  2021-03-04  9:03 43% ` [PATCH 1/6] lei q: support --import-augment for IMAP Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-03-04  9:03 UTC (permalink / raw)
  To: meta

mbox support was the trickiest, and necessitated
PATCH 2/6 and 3/6 in addition to
https://public-inbox.org/meta/20210304012039.26900-1-e@80x24.org/
("ds: import croak properly")

6/6 completes the renaming.

Eric Wong (6):
  lei q: support --import-augment for IMAP
  lei: dclose: do not EPOLL_CTL_DEL w/o event_init
  lei_xsearch: cleanup {pkt_op_p} on exceptions
  lei q: --import-augment for mbox and mbox.gz
  t/lei_to_mail: no need to cat in FIFO test
  lei q: s/import-augment/import-before/g

 lib/PublicInbox/LEI.pm        |   4 +-
 lib/PublicInbox/LeiQuery.pm   |   2 +-
 lib/PublicInbox/LeiToMail.pm  | 115 +++++++++++++++++++++++-----------
 lib/PublicInbox/LeiXSearch.pm |   6 ++
 lib/PublicInbox/NetReader.pm  |   9 ++-
 lib/PublicInbox/NetWriter.pm  |  41 ++++++++++--
 t/lei-q-kw.t                  |  80 +++++++++++++++++++++--
 t/lei_to_mail.t               |   7 ++-
 xt/net_writer-imap.t          |  36 +++++++++--
 9 files changed, 241 insertions(+), 59 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 2/6] lei: dclose: do not EPOLL_CTL_DEL w/o event_init
  2021-03-04  9:03 71% [PATCH 0/6] lei q --import-augment => --import-before; mbox + IMAP Eric Wong
  2021-03-04  9:03 43% ` [PATCH 1/6] lei q: support --import-augment for IMAP Eric Wong
@ 2021-03-04  9:03 71% ` Eric Wong
  2021-03-04  9:03 43% ` [PATCH 4/6] lei q: --import-augment for mbox and mbox.gz Eric Wong
  2021-03-04  9:03 47% ` [PATCH 6/6] lei q: s/import-augment/import-before/g Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-04  9:03 UTC (permalink / raw)
  To: meta

It's possible we'll hit a die() statement which triggers
lei->dclose before we've entered the event loop.
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1e5b04ca..fdd9f8c8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -973,7 +973,7 @@ sub dclose {
 	if (my $sto = delete $self->{sto}) {
 		$sto->ipc_do('done');
 	}
-	$self->close if $self->{sock}; # PublicInbox::DS::close
+	$self->close if $self->{-event_init_done}; # PublicInbox::DS::close
 }
 
 # for long-running results

^ permalink raw reply related	[relevance 71%]

* [PATCH 6/6] lei q: s/import-augment/import-before/g
  2021-03-04  9:03 71% [PATCH 0/6] lei q --import-augment => --import-before; mbox + IMAP Eric Wong
                   ` (2 preceding siblings ...)
  2021-03-04  9:03 43% ` [PATCH 4/6] lei q: --import-augment for mbox and mbox.gz Eric Wong
@ 2021-03-04  9:03 47% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-04  9:03 UTC (permalink / raw)
  To: meta

Since this importing of keywords is active even when --augment
isn't specified, calling it --import-before seems more
appropriate.

In the future, this will likely default to adding unseen emails
to lei/store, not just updating keywords.

Link: https://public-inbox.org/meta/20210303222930.GA18597@dcvr/T/
---
 lib/PublicInbox/LEI.pm       |  2 +-
 lib/PublicInbox/LeiQuery.pm  |  2 +-
 lib/PublicInbox/LeiToMail.pm | 16 ++++++++--------
 t/lei-q-kw.t                 | 10 +++++-----
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index fdd9f8c8..50276a50 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -113,7 +113,7 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g augment|a
-	import-remote! import-augment! lock=s@
+	import-remote! import-before! lock=s@
 	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index c630d628..493a8382 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -52,7 +52,7 @@ sub lei_q {
 	my $sto = $self->_lei_store(1);
 	my $lse = $sto->search;
 	if (($opt->{'import-remote'} //= 1) |
-			($opt->{'import-augment'} //= 1)) {
+			($opt->{'import-before'} //= 1)) {
 		$sto->write_prepare($self);
 	}
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 6290f35e..1e2060fe 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -438,7 +438,7 @@ sub _pre_augment_maildir {
 sub _do_augment_maildir {
 	my ($self, $lei) = @_;
 	my $dst = $lei->{ovv}->{dst};
-	my $lse = $lei->{sto}->search if $lei->{opt}->{'import-augment'};
+	my $lse = $lei->{sto}->search if $lei->{opt}->{'import-before'};
 	my ($mod, $shard) = @{$self->{shard_info} // []};
 	if ($lei->{opt}->{augment}) {
 		my $dedupe = $lei->{dedupe};
@@ -470,7 +470,7 @@ sub _imap_augment_or_delete { # PublicInbox::NetReader::imap_each cb
 sub _do_augment_imap {
 	my ($self, $lei) = @_;
 	my $net = $lei->{net};
-	my $lse = $lei->{sto}->search if $lei->{opt}->{'import-augment'};
+	my $lse = $lei->{sto}->search if $lei->{opt}->{'import-before'};
 	if ($lei->{opt}->{augment}) {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
@@ -511,8 +511,8 @@ sub _pre_augment_mbox {
 		die "seek($dst): $!\n";
 	}
 	if (!$self->{seekable}) {
-		my $ia = $lei->{opt}->{'import-augment'};
-		die "--import-augment specified but $dst is not seekable\n"
+		my $ia = $lei->{opt}->{'import-before'};
+		die "--import-before specified but $dst is not seekable\n"
 			if $ia && !ref($ia);
 		die "--augment specified but $dst is not seekable\n" if
 			$lei->{opt}->{augment};
@@ -533,7 +533,7 @@ sub _do_augment_mbox {
 	my $out = $lei->{1};
 	my ($fmt, $dst) = @{$lei->{ovv}}{qw(fmt dst)};
 	return unless -s $out;
-	unless ($opt->{augment} || $opt->{'import-augment'}) {
+	unless ($opt->{augment} || $opt->{'import-before'}) {
 		truncate($out, 0) or die "truncate($dst): $!";
 		return;
 	}
@@ -544,14 +544,14 @@ sub _do_augment_mbox {
 		$dedupe = $lei->{dedupe};
 		$dedupe->prepare_dedupe if $dedupe;
 	}
-	if ($opt->{'import-augment'}) { # the default
+	if ($opt->{'import-before'}) { # the default
 		my $lse = $lei->{sto}->search;
 		PublicInbox::MboxReader->$fmt($rd, \&_mbox_augment_kw_maybe,
 						$lei, $lse, $opt->{augment});
 		if (!$opt->{augment} and !truncate($out, 0)) {
 			die "truncate($dst): $!";
 		}
-	} else { # --augment --no-import-augment
+	} else { # --augment --no-import-before
 		PublicInbox::MboxReader->$fmt($rd, \&_augment, $lei);
 	}
 	# maybe some systems don't honor O_APPEND, Perl does this:
@@ -576,7 +576,7 @@ sub do_augment { # slow, runs in wq worker
 # fast (spawn compressor or mkdir), runs in same process as pre_augment
 sub post_augment {
 	my ($self, $lei, @args) = @_;
-	my $wait = $lei->{opt}->{'import-augment'} ?
+	my $wait = $lei->{opt}->{'import-before'} ?
 			$lei->{sto}->ipc_do('checkpoint', 1) : 0;
 	# _post_augment_mbox
 	my $m = $self->can("_post_augment_$self->{base_type}") or return;
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index babe9749..9daeb5b1 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -23,13 +23,13 @@ lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
 @fn = glob("$o/cur/*:2,S");
 is(scalar(@fn), 1, "`seen' flag set on Maildir file");
 
-# ensure --no-import-augment works
+# ensure --no-import-before works
 my $n = $fn[0];
 $n =~ s/,S\z/,RS/;
 rename($fn[0], $n) or BAIL_OUT "rename $!";
-lei_ok(qw(q --no-import-augment -o), "maildir:$o",
+lei_ok(qw(q --no-import-before -o), "maildir:$o",
 	qw(m:bogus-noresults@example.com));
-ok(!glob("$o/cur/*"), '--no-import-augment cleared destination');
+ok(!glob("$o/cur/*"), '--no-import-before cleared destination');
 lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
 @fn = glob("$o/cur/*:2,S");
 is(scalar(@fn), 1, "`seen' flag (but not `replied') set on Maildir file");
@@ -39,8 +39,8 @@ SKIP: {
 	mkfifo($o, 0600) or skip("mkfifo not supported: $!", 1);
 	# cat(1) since lei() may not execve for FD_CLOEXEC to work
 	my $cat = popen_rd(['cat', $o]);
-	ok(!lei(qw(q --import-augment bogus -o), "mboxrd:$o"),
-		'--import-augment fails on non-seekable output');
+	ok(!lei(qw(q --import-before bogus -o), "mboxrd:$o"),
+		'--import-before fails on non-seekable output');
 	is(do { local $/; <$cat> }, '', 'no output on FIFO');
 };
 

^ permalink raw reply related	[relevance 47%]

* [PATCH 4/6] lei q: --import-augment for mbox and mbox.gz
  2021-03-04  9:03 71% [PATCH 0/6] lei q --import-augment => --import-before; mbox + IMAP Eric Wong
  2021-03-04  9:03 43% ` [PATCH 1/6] lei q: support --import-augment for IMAP Eric Wong
  2021-03-04  9:03 71% ` [PATCH 2/6] lei: dclose: do not EPOLL_CTL_DEL w/o event_init Eric Wong
@ 2021-03-04  9:03 43% ` Eric Wong
  2021-03-04  9:03 47% ` [PATCH 6/6] lei q: s/import-augment/import-before/g Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-04  9:03 UTC (permalink / raw)
  To: meta

These are the trickiest output formats we support due to the
possibility of filesystem FIFOs and pipes for <gzip|xz|bzip2>.

This completes another phase of keyword sync support.
---
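A standalone sketch of the seekability probe that _pre_augment_mbox
depends on, for anyone wanting to try it outside of lei.  It follows
the same seek() + ESPIPE check as the diff below; is_seekable is a
hypothetical name, and the pipe and anonymous tempfile are only
stand-ins for a FIFO destination and a regular mbox file.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use Errno qw(ESPIPE);

# hypothetical helper: true for regular files, false for pipes/FIFOs,
# dies on any seek failure other than ESPIPE
sub is_seekable {
	my ($fh) = @_;
	return 1 if seek($fh, 0, SEEK_SET);
	return 0 if $! == ESPIPE;
	die "seek: $!\n";
}

pipe(my ($r, $w)) or die "pipe: $!";
open(my $tmp, '+>', undef) or die "tmpfile: $!"; # anonymous tempfile

printf "pipe seekable: %d\n", is_seekable($w);
printf "tmpfile seekable: %d\n", is_seekable($tmp);
```

Existing contents can only be re-read for keyword import or --augment
when the probe succeeds, which is why the patch dies early for
non-seekable destinations.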
 lib/PublicInbox/LeiToMail.pm | 65 ++++++++++++++++++++++---------
 t/lei-q-kw.t                 | 74 +++++++++++++++++++++++++++++++++++-
 2 files changed, 119 insertions(+), 20 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index b3228a59..6290f35e 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -246,6 +246,13 @@ sub _augment { # MboxReader eml_cb
 	$lei->{dedupe}->is_dup($eml);
 }
 
+sub _mbox_augment_kw_maybe {
+	my ($eml, $lei, $lse, $augment) = @_;
+	my @kw = PublicInbox::LeiStore::mbox_keywords($eml);
+	update_kw_maybe($lei, $lse, $eml, \@kw);
+	_augment($eml, $lei) if $augment;
+}
+
 sub _mbox_write_cb ($$) {
 	my ($self, $lei) = @_;
 	my $ovv = $lei->{ovv};
@@ -391,7 +398,7 @@ sub new {
 				"$dst exists and is not a directory\n";
 		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
 	} elsif (substr($fmt, 0, 4) eq 'mbox') {
-		require PublicInbox::MboxReader if $lei->{opt}->{augment};
+		require PublicInbox::MboxReader;
 		(-d $dst || (-e _ && !-w _)) and die
 			"$dst exists and is not a writable file\n";
 		$self->can("eml2$fmt") or die "bad mbox format: $fmt\n";
@@ -485,8 +492,8 @@ sub _do_augment_imap {
 sub _pre_augment_mbox {
 	my ($self, $lei) = @_;
 	my $dst = $lei->{ovv}->{dst};
+	my $out = $lei->{1};
 	if ($dst ne '/dev/stdout') {
-		my $out;
 		if (-p $dst) {
 			open $out, '>', $dst or die "open($dst): $!";
 		} elsif (-f _ || !-e _) {
@@ -495,36 +502,56 @@ sub _pre_augment_mbox {
 					PublicInbox::MboxLock->defaults;
 			$self->{mbl} = PublicInbox::MboxLock->acq($dst, 1, $m);
 			$out = $self->{mbl}->{fh};
-			if (!$lei->{opt}->{augment} and !truncate($out, 0)) {
-				die "truncate($dst): $!";
-			}
 		}
 		$lei->{old_1} = $lei->{1}; # keep for spawning MUA
-		$lei->{1} = $out;
 	}
 	# Perl does SEEK_END even with O_APPEND :<
-	$self->{seekable} = seek($lei->{1}, 0, SEEK_SET);
+	$self->{seekable} = seek($out, 0, SEEK_SET);
 	if (!$self->{seekable} && $! != ESPIPE && $dst ne '/dev/stdout') {
 		die "seek($dst): $!\n";
 	}
+	if (!$self->{seekable}) {
+		my $ia = $lei->{opt}->{'import-augment'};
+		die "--import-augment specified but $dst is not seekable\n"
+			if $ia && !ref($ia);
+		die "--augment specified but $dst is not seekable\n" if
+			$lei->{opt}->{augment};
+	}
 	state $zsfx_allow = join('|', keys %zsfx2cmd);
-	($self->{zsfx}) = ($dst =~ /\.($zsfx_allow)\z/) or return;
-	pipe(my ($r, $w)) or die "pipe: $!";
-	$lei->{zpipe} = [ $r, $w ];
+	if (($self->{zsfx}) = ($dst =~ /\.($zsfx_allow)\z/)) {
+		pipe(my ($r, $w)) or die "pipe: $!";
+		$lei->{zpipe} = [ $r, $w ];
+	}
+	$lei->{1} = $out;
+	undef;
 }
 
 sub _do_augment_mbox {
 	my ($self, $lei) = @_;
-	return if !$lei->{opt}->{augment};
-	my $dedupe = $lei->{dedupe};
-	my $dst = $lei->{ovv}->{dst};
-	die "cannot augment $dst, not seekable\n" if !$self->{seekable};
+	return unless $self->{seekable};
+	my $opt = $lei->{opt};
 	my $out = $lei->{1};
-	if (-s $out && $dedupe && $dedupe->prepare_dedupe) {
-		my $zsfx = $self->{zsfx};
-		my $rd = $zsfx ? decompress_src($out, $zsfx, $lei) :
-				dup_src($out);
-		my $fmt = $lei->{ovv}->{fmt};
+	my ($fmt, $dst) = @{$lei->{ovv}}{qw(fmt dst)};
+	return unless -s $out;
+	unless ($opt->{augment} || $opt->{'import-augment'}) {
+		truncate($out, 0) or die "truncate($dst): $!";
+		return;
+	}
+	my $zsfx = $self->{zsfx};
+	my $rd = $zsfx ? decompress_src($out, $zsfx, $lei) : dup_src($out);
+	my $dedupe;
+	if ($opt->{augment}) {
+		$dedupe = $lei->{dedupe};
+		$dedupe->prepare_dedupe if $dedupe;
+	}
+	if ($opt->{'import-augment'}) { # the default
+		my $lse = $lei->{sto}->search;
+		PublicInbox::MboxReader->$fmt($rd, \&_mbox_augment_kw_maybe,
+						$lei, $lse, $opt->{augment});
+		if (!$opt->{augment} and !truncate($out, 0)) {
+			die "truncate($dst): $!";
+		}
+	} else { # --augment --no-import-augment
 		PublicInbox::MboxReader->$fmt($rd, \&_augment, $lei);
 	}
 	# maybe some systems don't honor O_APPEND, Perl does this:
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index 97b2e08f..babe9749 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -2,6 +2,12 @@
 # Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
+use POSIX qw(mkfifo);
+use Fcntl qw(SEEK_SET O_RDONLY O_NONBLOCK);
+use IO::Uncompress::Gunzip qw(gunzip);
+use IO::Compress::Gzip qw(gzip);
+use PublicInbox::MboxReader;
+use PublicInbox::Spawn qw(popen_rd);
 test_lei(sub {
 lei_ok(qw(import -F eml t/plack-qp.eml));
 my $o = "$ENV{HOME}/dst";
@@ -28,6 +34,72 @@ lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
 @fn = glob("$o/cur/*:2,S");
 is(scalar(@fn), 1, "`seen' flag (but not `replied') set on Maildir file");
 
-# TODO: other destination types
+SKIP: {
+	$o = "$ENV{HOME}/fifo";
+	mkfifo($o, 0600) or skip("mkfifo not supported: $!", 1);
+	# cat(1) since lei() may not execve for FD_CLOEXEC to work
+	my $cat = popen_rd(['cat', $o]);
+	ok(!lei(qw(q --import-augment bogus -o), "mboxrd:$o"),
+		'--import-augment fails on non-seekable output');
+	is(do { local $/; <$cat> }, '', 'no output on FIFO');
+};
+
+lei_ok qw(import -F eml t/utf8.eml), \'for augment test';
+my $read_file = sub {
+	if ($_[0] =~ /\.gz\z/) {
+		gunzip($_[0] => \(my $buf = ''), MultiStream => 1) or
+			BAIL_OUT 'gunzip';
+		$buf;
+	} else {
+		open my $fh, '+<', $_[0] or BAIL_OUT $!;
+		do { local $/; <$fh> };
+	}
+};
+
+my $write_file = sub {
+	if ($_[0] =~ /\.gz\z/) {
+		gzip(\($_[1]), $_[0]) or BAIL_OUT 'gzip';
+	} else {
+		open my $fh, '>', $_[0] or BAIL_OUT $!;
+		print $fh $_[1] or BAIL_OUT $!;
+		close $fh or BAIL_OUT;
+	}
+};
+
+my $exp = {
+	'<qp@example.com>' => eml_load('t/plack-qp.eml'),
+	'<testmessage@example.com>' => eml_load('t/utf8.eml'),
+};
+$exp->{'<qp@example.com>'}->header_set('Status', 'OR');
+$exp->{'<testmessage@example.com>'}->header_set('Status', 'O');
+for my $sfx ('', '.gz') {
+	$o = "$ENV{HOME}/dst.mboxrd$sfx";
+	lei_ok(qw(q -o), "mboxrd:$o", qw(m:qp@example.com));
+	my $buf = $read_file->($o);
+	$buf =~ s/^Status: [^\n]*\n//sm or BAIL_OUT "no status in $buf";
+	$write_file->($o, $buf);
+	lei_ok(qw(q -o), "mboxrd:$o", qw(rereadandimportkwchange));
+	$buf = $read_file->($o);
+	is($buf, '', 'emptied');
+	lei_ok(qw(q -o), "mboxrd:$o", qw(m:qp@example.com));
+	$buf = $read_file->($o);
+	$buf =~ s/\nStatus: O\n\n/\nStatus: OR\n\n/s or
+		BAIL_OUT "no Status in $buf";
+	$write_file->($o, $buf);
+	lei_ok(qw(q -a -o), "mboxrd:$o", qw(m:testmessage@example.com));
+	$buf = $read_file->($o);
+	open my $fh, '<', \$buf or BAIL_OUT "PerlIO::scalar $!";
+	my %res;
+	PublicInbox::MboxReader->mboxrd($fh, sub {
+		my ($eml) = @_;
+		$res{$eml->header_raw('Message-ID')} = $eml;
+	});
+	is_deeply(\%res, $exp, '--augment worked');
+
+	lei_ok(qw(q -o), "mboxrd:/dev/stdout", qw(m:qp@example.com)) or
+		diag $lei_err;
+	like($lei_out, qr/^Status: OR\n/sm, 'Status set by previous augment');
+}
+
 });
 done_testing;

^ permalink raw reply related	[relevance 43%]

* [PATCH 1/6] lei q: support --import-augment for IMAP
  2021-03-04  9:03 71% [PATCH 0/6] lei q --import-augment => --import-before; mbox + IMAP Eric Wong
@ 2021-03-04  9:03 43% ` Eric Wong
  2021-03-04  9:03 71% ` [PATCH 2/6] lei: dclose: do not EPOLL_CTL_DEL w/o event_init Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-04  9:03 UTC (permalink / raw)
  To: meta

IMAP is similar to Maildir and we can now preserve keyword
updates done on IMAP folders.
---
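
For readers skimming the diff below, here is a minimal standalone
sketch of the flag-to-keyword step that the NetReader hunk touches.
The mapping table and the sub name flags2kw are assumptions made for
illustration; the real table is %IMAPflags2kw in PublicInbox::NetReader
and may differ.

```perl
use strict; use warnings;

# Assumed mapping for illustration only; see PublicInbox::NetReader
# for the real %IMAPflags2kw table.
my %flags2kw = (
	'\\Seen' => 'seen',
	'\\Answered' => 'answered',
	'\\Flagged' => 'flagged',
	'\\Draft' => 'draft',
);

# hypothetical helper: turn an IMAP FLAGS string into sorted keywords
sub flags2kw {
	my ($flags) = @_;
	my @kw;
	for my $f (split(/ /, $flags)) {
		if (my $k = $flags2kw{$f}) {
			push @kw, $k;
		}
		# \Recent and unknown flags are silently skipped here
	}
	[ sort @kw ]; # stable order, as the patch does with sort
}

print join(',', @{flags2kw('\\Seen \\Recent \\Answered')}), "\n";
```

Sorting keeps keyword lists comparable regardless of the order the
server reports flags in, which is what lets the tests compare against
a fixed array.
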
 lib/PublicInbox/LeiToMail.pm | 48 ++++++++++++++++++++++--------------
 lib/PublicInbox/NetReader.pm |  9 +++++--
 lib/PublicInbox/NetWriter.pm | 41 ++++++++++++++++++++++++++----
 xt/net_writer-imap.t         | 36 ++++++++++++++++++++++++---
 4 files changed, 105 insertions(+), 29 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 3420b06e..b3228a59 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -267,6 +267,17 @@ sub _mbox_write_cb ($$) {
 	}
 }
 
+sub update_kw_maybe ($$$$) {
+	my ($lei, $lse, $eml, $kw) = @_;
+	return unless $lse;
+	my $x = $lse->kw_changed($eml, $kw);
+	if ($x) {
+		$lei->{sto}->ipc_do('set_eml', $eml, @$kw);
+	} elsif (!defined($x)) {
+		# TODO: xkw
+	}
+}
+
 sub _augment_or_unlink { # maildir_each_eml cb
 	my ($f, $kw, $eml, $lei, $lse, $mod, $shard, $unlink) = @_;
 	if ($mod) {
@@ -276,14 +287,7 @@ sub _augment_or_unlink { # maildir_each_eml cb
 				$1 : sha256_hex($f);
 		my $recno = hex(substr($hex, 0, 8));
 		return if ($recno % $mod) != $shard;
-		if ($lse) {
-			my $x = $lse->kw_changed($eml, $kw);
-			if ($x) {
-				$lei->{sto}->ipc_do('set_eml', $eml, @$kw);
-			} elsif (!defined($x)) {
-				# TODO: xkw
-			}
-		}
+		update_kw_maybe($lei, $lse, $eml, $kw);
 	}
 	$unlink ? unlink($f) : _augment($eml, $lei);
 }
@@ -446,26 +450,32 @@ sub _do_augment_maildir {
 	}
 }
 
-sub _post_augment_maildir {
-	my ($self, $lei) = @_;
-	$lei->{opt}->{'import-augment'} or return;
-	my $wait = $lei->{sto}->ipc_do('checkpoint', 1);
-}
-
-sub _augment_imap { # PublicInbox::NetReader::imap_each cb
-	my ($url, $uid, $kw, $eml, $lei) = @_;
-	_augment($eml, $lei);
+sub _imap_augment_or_delete { # PublicInbox::NetReader::imap_each cb
+	my ($url, $uid, $kw, $eml, $lei, $lse, $delete_mic) = @_;
+	update_kw_maybe($lei, $lse, $eml, $kw);
+	if ($delete_mic) {
+		$lei->{net}->imap_delete_1($url, $uid, $delete_mic);
+	} else {
+		_augment($eml, $lei);
+	}
 }
 
 sub _do_augment_imap {
 	my ($self, $lei) = @_;
 	my $net = $lei->{net};
+	my $lse = $lei->{sto}->search if $lei->{opt}->{'import-augment'};
 	if ($lei->{opt}->{augment}) {
 		my $dedupe = $lei->{dedupe};
 		if ($dedupe && $dedupe->prepare_dedupe) {
-			$net->imap_each($self->{uri}, \&_augment_imap, $lei);
+			$net->imap_each($self->{uri}, \&_imap_augment_or_delete,
+					$lei, $lse);
 			$dedupe->pause_dedupe;
 		}
+	} elsif ($lse) {
+		my $delete_mic;
+		$net->imap_each($self->{uri}, \&_imap_augment_or_delete,
+					$lei, $lse, \$delete_mic);
+		$delete_mic->expunge if $delete_mic;
 	} elsif (!$self->{-wq_worker_nr}) { # undef or 0
 		# clobber existing IMAP folder
 		$net->imap_delete_all($self->{uri});
@@ -539,6 +549,8 @@ sub do_augment { # slow, runs in wq worker
 # fast (spawn compressor or mkdir), runs in same process as pre_augment
 sub post_augment {
 	my ($self, $lei, @args) = @_;
+	my $wait = $lei->{opt}->{'import-augment'} ?
+			$lei->{sto}->ipc_do('checkpoint', 1) : 0;
 	# _post_augment_mbox
 	my $m = $self->can("_post_augment_$self->{base_type}") or return;
 	$m->($self, $lei, @args);
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 96d3b2ed..f5f71005 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -346,9 +346,14 @@ sub _imap_do_msg ($$$$$) {
 	$$raw =~ s/\r\n/\n/sg;
 	my $kw = [];
 	for my $f (split(/ /, $flags)) {
-		my $k = $IMAPflags2kw{$f} // next; # TODO: X-Label?
-		push @$kw, $k;
+		if (my $k = $IMAPflags2kw{$f}) {
+			push @$kw, $k;
+		} elsif ($f eq "\\Recent") { # not in JMAP
+		} elsif ($self->{verbose}) {
+			warn "# unknown IMAP flag $f <$uri;uid=$uid>\n";
+		}
 	}
+	@$kw = sort @$kw; # for all UI/UX purposes
 	my ($eml_cb, @args) = @{$self->{eml_each}};
 	$eml_cb->($uri, $uid, $kw, PublicInbox::Eml->new($raw), @args);
 }
diff --git a/lib/PublicInbox/NetWriter.pm b/lib/PublicInbox/NetWriter.pm
index e26e9815..49ac02a6 100644
--- a/lib/PublicInbox/NetWriter.pm
+++ b/lib/PublicInbox/NetWriter.pm
@@ -13,27 +13,58 @@ my %IMAPkw2flags;
 @IMAPkw2flags{values %PublicInbox::NetReader::IMAPflags2kw} =
 				keys %PublicInbox::NetReader::IMAPflags2kw;
 
+sub kw2flags ($) { join(' ', map { $IMAPkw2flags{$_} } @{$_[0]}) }
+
 sub imap_append {
 	my ($mic, $folder, $bref, $smsg, $eml) = @_;
 	$bref //= \($eml->as_string);
 	$smsg //= bless {}, 'PublicInbox::Smsg';
 	bless($smsg, 'PublicInbox::Smsg') if ref($smsg) eq 'HASH';
 	$smsg->{ts} //= msg_timestamp($eml // PublicInbox::Eml->new($$bref));
-	my @f = map { $IMAPkw2flags{$_} } @{$smsg->{kw}};
-	$mic->append_string($folder, $$bref, "@f", $smsg->internaldate) or
+	my $f = kw2flags($smsg->{kw});
+	$mic->append_string($folder, $$bref, $f, $smsg->internaldate) or
 		die "APPEND $folder: $@";
 }
 
+sub mic_for_folder {
+	my ($self, $uri) = @_;
+	if (!ref($uri)) {
+		my $u = PublicInbox::URIimap->new($uri);
+		$_[1] = $uri = $u;
+	}
+	my $mic = $self->mic_get($uri) or die "E: not connected: $@";
+	$mic->select($uri->mailbox) or return;
+	$mic;
+}
+
 sub imap_delete_all {
 	my ($self, $url) = @_;
-	my $uri = PublicInbox::URIimap->new($url);
+	my $mic = mic_for_folder($self, my $uri = $url) or return;
 	my $sec = $self->can('uri_section')->($uri);
 	local $0 = $uri->mailbox." $sec";
-	my $mic = $self->mic_get($uri) or die "E: not connected: $@";
-	$mic->select($uri->mailbox) or return; # non-existent
 	if ($mic->delete_message('1:*')) {
 		$mic->expunge;
 	}
 }
 
+sub imap_delete_1 {
+	my ($self, $url, $uid, $delete_mic) = @_;
+	$$delete_mic //= mic_for_folder($self, my $uri = $url) or return;
+	$$delete_mic->delete_message($uid);
+}
+
+sub imap_set_kw {
+	my ($self, $url, $uid, $kw) = @_;
+	my $mic = mic_for_folder($self, my $uri = $url) or return;
+	$mic->set_flag(kw2flags($kw), $uid);
+	$mic; # caller must ->expunge
+}
+
+sub imap_unset_kw {
+	my ($self, $url, $uid, $kw) = @_;
+	my $mic = mic_for_folder($self, my $uri = $url) or return;
+	$mic->unset_flag(kw2flags($kw), $uid);
+	$mic; # caller must ->expunge
+}
+
 1;
diff --git a/xt/net_writer-imap.t b/xt/net_writer-imap.t
index da435926..c24fa993 100644
--- a/xt/net_writer-imap.t
+++ b/xt/net_writer-imap.t
@@ -91,7 +91,7 @@ my $smsg = bless { kw => [ 'seen' ] }, 'PublicInbox::Smsg';
 $imap_append->($mic, $folder, undef, $smsg, eml_load('t/plack-qp.eml'));
 $nwr->{quiet} = 1;
 my $imap_slurp_all = sub {
-	my ($u, $uid, $kw, $eml, $res) = @_;
+	my ($url, $uid, $kw, $eml, $res) = @_;
 	push @$res, [ $kw, $eml ];
 };
 $nwr->imap_each($folder_uri, $imap_slurp_all, my $res = []);
@@ -138,10 +138,38 @@ test_lei(sub {
 	$nwr->imap_each($folder_uri, $imap_slurp_all, my $empty = []);
 	is(scalar(@$empty), 0, 'no results w/o augment');
 
-	lei_ok qw(convert -F eml t/msg_iter-order.eml -o), $$folder_uri;
+	my $f = 't/utf8.eml'; # <testmessage@example.com>
+	$exp = eml_load($f);
+	lei_ok qw(convert -F eml -o), $$folder_uri, $f;
+	my (@uid, @res);
+	$nwr->imap_each($folder_uri, sub {
+		my ($u, $uid, $kw, $eml) = @_;
+		push @uid, $uid;
+		push @res, [ $kw, $eml ];
+	});
+	is_deeply(\@res, [ [ [], $exp ] ], 'converted to IMAP destination');
+	is(scalar(@uid), 1, 'got one UID back');
+	lei_ok qw(q -o /dev/stdout m:testmessage@example.com --no-external);
+	is_deeply(json_utf8->decode($lei_out), [undef],
+		'no results before import');
+
+	lei_ok qw(import -F eml), $f, \'import local copy w/o keywords';
+
+	$nwr->imap_set_kw($folder_uri, $uid[0], [ 'seen' ])->expunge
+		or BAIL_OUT "expunge $@";
+	@res = ();
+	$nwr->imap_each($folder_uri, $imap_slurp_all, \@res);
+	is_deeply(\@res, [ [ ['seen'], $exp ] ], 'seen flag set') or
+		diag explain(\@res);
+
+	lei_ok qw(q s:thisbetternotgiveanyresult -o), $folder_uri->as_string,
+		\'clobber folder but import flag';
 	$nwr->imap_each($folder_uri, $imap_slurp_all, $empty = []);
-	is_deeply($empty, [ [ [], eml_load('t/msg_iter-order.eml') ] ],
-		'converted to IMAP destination');
+	is_deeply($empty, [], 'clobbered folder');
+	lei_ok qw(q -o /dev/stdout m:testmessage@example.com --no-external);
+	$res = json_utf8->decode($lei_out)->[0];
+	is_deeply([@$res{qw(m kw)}], ['<testmessage@example.com>', ['seen']],
+		'kw set');
 });
 
 undef $cleanup; # remove temporary folder

^ permalink raw reply related	[relevance 43%]

* angle brackets in "m:" and "refs:" in "lei q" JSON
@ 2021-03-04 18:43 71% Eric Wong
  2021-03-06 18:26 71% ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-03-04 18:43 UTC (permalink / raw)
  To: meta

I'm thinking these shouldn't include angle brackets:

  "m": "<20210228122528.18552-2-e@80x24.org>",
  "refs": ["<20210228122528.18552-1-e@80x24.org>"],

Using angle brackets on the command-line requires quoting to
disambiguate against redirects, so it's a pain.  Leaving the
brackets in still works because of how Xapian's query parser
works, not because of anything we do on our end.

Since the actual headers are "Message-ID" and "References", (and
not "m" or "refs"), I think it's clear that we don't have to
match the raw mail contents exactly.  We RFC 2047 decode
"f|t|c|s" fields anyways instead of showing the raw values,
so more precedent for leaving out <>.
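
A minimal sketch of the proposal, assuming a simple regexp for pulling
Message-IDs out of a header value.  The sub name mids_unbracketed and
the regexp are hypothetical; public-inbox has its own extraction
pattern which may behave differently on odd input.

```perl
use strict; use warnings;

# hypothetical helper: return Message-IDs from a header value
# with the surrounding angle brackets removed
sub mids_unbracketed {
	my ($hdr) = @_;
	# each capture is the text between one pair of <>
	my @mids = ($hdr =~ /<([^<>\s]+)>/g);
	\@mids;
}

my $refs = mids_unbracketed('<20210228122528.18552-1-e@80x24.org>');
print "$_\n" for @$refs;
```

The unbracketed form can be pasted after "m:" on the command line
without any shell quoting.
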

^ permalink raw reply	[relevance 71%]

* [PATCH] lei q: fix --import-before default and FIFO output
@ 2021-03-05  1:38 51% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-05  1:38 UTC (permalink / raw)
  To: meta

commit 6c551bffd75afb41d9b5e4774068abe7e06ed0e7
("lei q: --import-augment for mbox and mbox.gz") added a check
in _pre_augment_mbox for the option being a ref() to distinguish
between default values and user-supplied values (which are
non-ref SCALARs from Getopt::Long).

However, LeiQuery failed to use a SCALAR ref as the default
value, making the check in _pre_augment_mbox useless.  We
now update LeiQuery to use \1 instead of 1 as the default
value so "lei q -f mboxrd ..." to stdout works once again.

Unfortunately, testing with redirects pointed to regular
files didn't trigger the code paths being updated.  Testing
with a FIFO revealed further bugs in the FIFO handling code
which are also fixed in this commit.

We'll also update the $lei->out error message to be
less-specific about "stdout" and use the term "output", instead,
since LeiToMail replaces stdout for all mbox outputs.
---
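
The default-versus-user distinction described above can be sketched in
isolation.  This is only an illustration under stated assumptions: the
sub name import_before_source is hypothetical, and the plain hash
stands in for the options Getopt::Long would normally fill in.

```perl
use strict; use warnings;

# hypothetical helper: report where the 'import-before' value came from
sub import_before_source {
	my ($opt) = @_;
	# a SCALAR ref marks "nobody set this"; it is still a true value
	$opt->{'import-before'} //= \1;
	ref($opt->{'import-before'}) ? 'default' : 'user';
}

print import_before_source({}), "\n";                       # default
print import_before_source({ 'import-before' => 1 }), "\n"; # user
```

Because a reference is always true, code that only cares about the
boolean can keep testing the value directly, while code that must
reject an explicit request (such as a FIFO destination) can check
ref() first.
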
 lib/PublicInbox/LEI.pm       |  2 +-
 lib/PublicInbox/LeiQuery.pm  |  2 +-
 lib/PublicInbox/LeiToMail.pm |  8 ++++++--
 t/lei-q-kw.t                 | 25 +++++++++++++++++++------
 4 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 50276a50..50c0a885 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -431,7 +431,7 @@ sub out ($;@) {
 	my $self = shift;
 	return if print { $self->{1} // return } @_; # likely
 	return note_sigpipe($self, 1) if $! == EPIPE;
-	my $err = "error writing to stdout: $!";
+	my $err = "error writing to output: $!";
 	delete $self->{1};
 	fail($self, $err);
 }
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 493a8382..623b92cd 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -52,7 +52,7 @@ sub lei_q {
 	my $sto = $self->_lei_store(1);
 	my $lse = $sto->search;
 	if (($opt->{'import-remote'} //= 1) |
-			($opt->{'import-before'} //= 1)) {
+			(($opt->{'import-before'} //= \1) ? 1 : 0)) {
 		$sto->write_prepare($self);
 	}
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 1e2060fe..13b4f672 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -223,8 +223,6 @@ sub _post_augment_mbox { # open a compressor process
 	$dup->{$_} = $lei->{$_} for qw(2 sock);
 	tie *$pp, 'PublicInbox::ProcessPipe', $pid, $w, \&reap_compress, $dup;
 	$lei->{1} = $pp;
-	die 'BUG: unexpected {ovv}->{lock_path}' if $lei->{ovv}->{lock_path};
-	$lei->{ovv}->ovv_out_lk_init;
 }
 
 sub decompress_src ($$$) {
@@ -495,6 +493,7 @@ sub _pre_augment_mbox {
 	my $out = $lei->{1};
 	if ($dst ne '/dev/stdout') {
 		if (-p $dst) {
+			$out = undef;
 			open $out, '>', $dst or die "open($dst): $!";
 		} elsif (-f _ || !-e _) {
 			require PublicInbox::MboxLock;
@@ -521,6 +520,11 @@ sub _pre_augment_mbox {
 	if (($self->{zsfx}) = ($dst =~ /\.($zsfx_allow)\z/)) {
 		pipe(my ($r, $w)) or die "pipe: $!";
 		$lei->{zpipe} = [ $r, $w ];
+		$lei->{ovv}->{lock_path} and
+			die 'BUG: unexpected {ovv}->{lock_path}';
+		$lei->{ovv}->ovv_out_lk_init;
+	} elsif (!$self->{seekable} && !$lei->{ovv}->{lock_path}) {
+		$lei->{ovv}->ovv_out_lk_init;
 	}
 	$lei->{1} = $out;
 	undef;
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index 9daeb5b1..917a2c53 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -7,7 +7,15 @@ use Fcntl qw(SEEK_SET O_RDONLY O_NONBLOCK);
 use IO::Uncompress::Gunzip qw(gunzip);
 use IO::Compress::Gzip qw(gzip);
 use PublicInbox::MboxReader;
+use PublicInbox::LeiToMail;
 use PublicInbox::Spawn qw(popen_rd);
+my $exp = {
+	'<qp@example.com>' => eml_load('t/plack-qp.eml'),
+	'<testmessage@example.com>' => eml_load('t/utf8.eml'),
+};
+$exp->{'<qp@example.com>'}->header_set('Status', 'OR');
+$exp->{'<testmessage@example.com>'}->header_set('Status', 'O');
+
 test_lei(sub {
 lei_ok(qw(import -F eml t/plack-qp.eml));
 my $o = "$ENV{HOME}/dst";
@@ -42,6 +50,17 @@ SKIP: {
 	ok(!lei(qw(q --import-before bogus -o), "mboxrd:$o"),
 		'--import-before fails on non-seekable output');
 	is(do { local $/; <$cat> }, '', 'no output on FIFO');
+	close $cat;
+	$cat = popen_rd(['cat', $o]);
+	lei_ok(qw(q m:qp@example.com -o), "mboxrd:$o");
+	my $buf = do { local $/; <$cat> };
+	open my $fh, '<', \$buf or BAIL_OUT $!;
+	PublicInbox::MboxReader->mboxrd($fh, sub {
+		my ($eml) = @_;
+		$eml->header_set('Status', 'OR');
+		is_deeply($eml, $exp->{'<qp@example.com>'},
+			'FIFO output works as expected');
+	});
 };
 
 lei_ok qw(import -F eml t/utf8.eml), \'for augment test';
@@ -66,12 +85,6 @@ my $write_file = sub {
 	}
 };
 
-my $exp = {
-	'<qp@example.com>' => eml_load('t/plack-qp.eml'),
-	'<testmessage@example.com>' => eml_load('t/utf8.eml'),
-};
-$exp->{'<qp@example.com>'}->header_set('Status', 'OR');
-$exp->{'<testmessage@example.com>'}->header_set('Status', 'O');
 for my $sfx ('', '.gz') {
 	$o = "$ENV{HOME}/dst.mboxrd$sfx";
 	lei_ok(qw(q -o), "mboxrd:$o", qw(m:qp@example.com));

^ permalink raw reply related	[relevance 51%]

* "lei q" vs mairix notes...
@ 2021-03-05  2:22 63% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-05  2:22 UTC (permalink / raw)
  To: meta

I'm not sure if this should be in the lei-q(1) manpage or
another manpage, probably another.  There ought to be a similar
doc for notmuch and any other existing mail things I'm not
familiar with.

This is intended to be a neutral document to help and set
expectations for mairix users should they attempt to use lei.
It is NOT intended as an advocacy document.

mairix and "lei q" share some similarities around common search
prefixes ("f:", "s:", "nq:") but there are several differences
users familiar with mairix should be aware of.

- lei (Xapian) uses ".." for date and size ranges, mairix uses "-".
  This is due to how the Xapian query parser works.

- lei uses git(1) for date and time parsing; mairix has its own
  syntax documented in mairix(1).

- lei does not support MH, yet

- lei currently requires mail to be imported into git ("lei import");
  mairix indexes mail in IMAP, Maildir, MH, and mbox directly.
  lei may attempt to index mail outside of git if there's interest:
  https://public-inbox.org/meta/20210303035359.GA14438@dcvr/

- mairix can use symlinks and/or hardlinks to speed up writing
  results when using Maildirs; lei must always extract messages
  from git, which will always be slower.

- mairix has different rules around substring matches, negation,
  combining, etc. than Xapian <https://xapian.org/docs/queryparser.html>

- lei doesn't yet support config file entries for output
  (but will support saved searches)

- --raw-output and --excerpt-output in mairix aren't yet
  supported, but the default JSON output in "lei q" may be
  similar

- lei indexes positional data by default (and currently lacks a
  configuration knob in the CLI), so indices use significantly
  more space.

- lei is still in its infancy and far from complete

Again, this is intended to be a neutral document and not
advocacy.  Help appreciated with corrections and addendums.
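
To illustrate the first point in the list (".." versus "-" for
ranges), here is a rough sketch that rewrites a mairix-style numeric
date range into the form the Xapian query parser expects.  The sub
name mairix_date_to_lei is hypothetical, and it assumes only the
plain d:YYYYMMDD-YYYYMMDD form; relative dates and other mairix
syntax are not handled.

```perl
use strict; use warnings;

# hypothetical helper: "d:20210101-20210301" => "d:20210101..20210301"
# returns undef for anything it does not understand
sub mairix_date_to_lei {
	my ($term) = @_;
	# either end of the range may be omitted, but not both
	$term =~ /\Ad:(\d{8})?-(\d{8})?\z/ or return;
	my ($from, $to) = ($1, $2);
	return unless defined($from) || defined($to);
	'd:' . ($from // '') . '..' . ($to // '');
}

print mairix_date_to_lei('d:20210101-20210301'), "\n";
```

Anything beyond this simple form is better rewritten by hand, since
the two tools parse dates differently.
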

^ permalink raw reply	[relevance 63%]

* [PATCH] lei q: one -t shouldn't set `flagged' on external mail
@ 2021-03-05  4:03 66% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-05  4:03 UTC (permalink / raw)
  To: meta

We only want to set `flagged' if a user requests it via
two '-t' switches.

Fixes: 232f8e376fe2856c ("lei q: -tt marks direct hits as flagged")
---
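
A standalone sketch of the intended behavior, assuming the switch is
declared to Getopt::Long as a counter ('threads|t+') with bundling
enabled so that "-tt" counts as two.  The sub name direct_hit_kw is
hypothetical and the option spec is an assumption, not copied from
lei.

```perl
use strict; use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# bundling lets "-tt" be parsed as "-t -t"
Getopt::Long::Configure(qw(bundling));

# hypothetical helper: keywords to force onto direct hits, if any
sub direct_hit_kw {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt, 'threads|t+') or
		die "bad arguments\n";
	# one -t only asks for threads; two or more also flag direct hits
	($opt{threads} // 0) > 1 ? [ 'flagged' ] : undef;
}

print defined(direct_hit_kw('-t')) ? "flagged\n" : "not flagged\n";
print defined(direct_hit_kw('-tt')) ? "flagged\n" : "not flagged\n";
```

Returning undef rather than an empty array mirrors the patch below,
where the "kw" field is left out entirely when nothing was requested.
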
 lib/PublicInbox/LeiXSearch.pm | 6 +++---
 t/lei-q-thread.t              | 8 ++++++--
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 3270b420..f2c8c02e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -148,7 +148,7 @@ sub query_thread_mset { # for --threads
 	my $mset;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
 	my $can_kw = !!$ibxish->can('msg_keywords');
-	my $fl = $lei->{opt}->{threads} > 1;
+	my $fl = $lei->{opt}->{threads} > 1 ? [ 'flagged' ] : undef;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		mset_progress($lei, $desc, $mset->size,
@@ -165,8 +165,8 @@ sub query_thread_mset { # for --threads
 				if ($mitem) {
 					if ($can_kw) {
 						mitem_kw($smsg, $mitem, $fl);
-					} else {
-						$smsg->{kw} = [ 'flagged' ];
+					} elsif ($fl) {
+						$smsg->{kw} = $fl;
 					}
 				}
 				$each_smsg->($smsg, $mitem);
diff --git a/t/lei-q-thread.t b/t/lei-q-thread.t
index 0ddf47a6..28c639f5 100644
--- a/t/lei-q-thread.t
+++ b/t/lei-q-thread.t
@@ -41,8 +41,12 @@ test_lei(sub {
 		'flagged set in direct hit');
 	'TODO' or is_deeply($m{'<testmessage@example.com>'}->{kw}, ['draft'],
 		'flagged set in direct hit');
-	lei_ok qw(q -t -t m:testmessage@example.com --only), "$ro_home/t2";
+	lei_ok qw(q -tt m:testmessage@example.com --only), "$ro_home/t2";
 	$res = json_utf8->decode($lei_out);
-	is_deeply($res->[0]->{kw}, [ 'flagged' ], 'flagged set on external');
+	is_deeply($res->[0]->{kw}, [ 'flagged' ],
+		'flagged set on external with -tt');
+	lei_ok qw(q -t m:testmessage@example.com --only), "$ro_home/t2";
+	$res = json_utf8->decode($lei_out);
+	ok(!exists($res->[0]->{kw}), 'flagged not set on external with 1 -t');
 });
 done_testing;

^ permalink raw reply related	[relevance 66%]

* release timelines (-extindex, JMAP, lei)
@ 2021-03-05 22:20 64% Eric Wong
  2021-03-06 18:31 71% ` Kyle Meyer
  2021-03-08 21:33 71% ` Konstantin Ryabitsev
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-03-05 22:20 UTC (permalink / raw)
  To: meta

So I think the -extindex stuff might be stable and suitable for
general consumption.  The HTML WWW UI around -extindex has some
rough edges but nothing that would take too much effort to fix.

But I'm deeply worried about unleashing a new on-disk format
that's insufficient and being stuck supporting it forever
(as I am with v1 inboxes)...

* JMAP is going to be more effort, but I think our current
  on-disk data model is OK or at least extensible enough for it.
  I might delay JMAP until 1.8.

  JMAP will be significantly less effort than inventing something
  new (and one-off) with GraphQL or REST.  And it would be less
  effort for client authors, as well; since client code can be
  used for non-public-inbox servers, too.

  I'm excited that it allows vendor extensions, which means our
  git-show|solver /$INBOX_URL/$OBJECT_ID/s/ blob-reconstruction
  endpoint can be exposed via JMAP.

* lei saved searches can probably be done quickly for 1.7,
  but without full keywords support for externals...

* Dealing with per-message keywords and externals in lei
  is going to be tough and delayed, I think:
  https://public-inbox.org/meta/20210224204950.GA2076@dcvr/
  ("lei: per-message keywords and externals")
  It may involve spinning up a Python daemon to use a
  custom PostingSource since Search::Xapian doesn't support it;
  and I don't expect anybody outside of FreeBSD to use SWIG Xapian.pm.

* lei has a bunch of rough edges and I'm not comfortable declaring
  it as supported, especially when there's a risk of data loss to
  users.

* .mailmap + Xapian synonym support would be nice in lei and
  HTTP/IMAP/JMAP endpoints

In any case, I need to take a few days away to clear my head.

^ permalink raw reply	[relevance 64%]

* Re: angle brackets in "m:" and "refs:" in "lei q" JSON
  2021-03-04 18:43 71% angle brackets in "m:" and "refs:" in "lei q" JSON Eric Wong
@ 2021-03-06 18:26 71% ` Kyle Meyer
  2021-03-08  8:08 53%   ` [PATCH] lei q: remove angle brackets around Message-IDs Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-03-06 18:26 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> I'm thinking these shouldn't include angle brackets:
>
>   "m": "<20210228122528.18552-2-e@80x24.org>",
>   "refs": ["<20210228122528.18552-1-e@80x24.org>"],
>
> Using angle brackets on the command-line requires quoting to
> disambiguate against redirects, so it's a pain.  Leaving the
> brackets in still works because of how Xapian's query parser
> works, not because of anything we do on our end.

I think it'd be nice to drop the brackets from a noise perspective too.

Also, does m: work with brackets?  Trying it out with a recent message
ID:

  $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
    '20210304203352.pd5mcg5pw4u2epzl@pengutronix.de'
  {"blob":"87304c8a8cae8ce400443b56309427aeee601505",...}

  $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
    m:'20210304203352.pd5mcg5pw4u2epzl@pengutronix.de'
  {"blob":"87304c8a8cae8ce400443b56309427aeee601505",...}

  $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
    '<20210304203352.pd5mcg5pw4u2epzl@pengutronix.de>'
  {"blob":"87304c8a8cae8ce400443b56309427aeee601505",...}

  $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
    m:'<20210304203352.pd5mcg5pw4u2epzl@pengutronix.de>'
  # no results

> Since the actual headers are "Message-ID" and "References", (and
> not "m" or "refs"), I think it's clear that we don't have to
> match the raw mail contents exactly.  We RFC 2047 decode
> "f|t|c|s" fields anyways instead of showing the raw values,
> so more precedent for leaving out <>.

Fwiw I don't think leaving out the brackets would be a source of
confusion.

^ permalink raw reply	[relevance 71%]

* Re: release timelines (-extindex, JMAP, lei)
  2021-03-05 22:20 64% release timelines (-extindex, JMAP, lei) Eric Wong
@ 2021-03-06 18:31 71% ` Kyle Meyer
  2021-03-08  2:54 71%   ` Eric Wong
  2021-03-08 21:33 71% ` Konstantin Ryabitsev
  1 sibling, 1 reply; 200+ results
From: Kyle Meyer @ 2021-03-06 18:31 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> But I'm deeply worried about unleashing a new on-disk format
> that's insufficient and being stuck supporting it forever
> (as I am with v1 inboxes)...
[...]
> * lei has a bunch of rough edges and I'm not comfortable declaring
>   it as supported, especially when there's a risk of data loss to
>   users.

What are your thoughts on marking all things lei with a big
experimental / subject-to-change / may-eat-your-data warning?

^ permalink raw reply	[relevance 71%]

* Re: release timelines (-extindex, JMAP, lei)
  2021-03-06 18:31 71% ` Kyle Meyer
@ 2021-03-08  2:54 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-08  2:54 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > But I'm deeply worried about unleashing a new on-disk format
> > that's insufficient and being stuck supporting it forever
> > (as I am with v1 inboxes)...
> [...]
> > * lei has a bunch of rough edges and I'm not comfortable declaring
> >   it as supported, especially when there's a risk of data loss to
> >   users.
> 
> What are your thoughts on marking all things lei with a big
> experimental / subject-to-change / may-eat-your-data warning?

Yes, definitely experimental and may-eat-your-data.
I hope the subject-to-change bit can be minimized, though I'll
drop the '<>' in "lei q" JSON before release.

I'm also considering diverging from mairix in making --augment
the default behavior in "lei q".  That would make things
significantly safer, and --no-augment would be required to
clobber existing outputs.

^ permalink raw reply	[relevance 71%]

* [PATCH] lei q: remove angle brackets around Message-IDs
  2021-03-06 18:26 71% ` Kyle Meyer
@ 2021-03-08  8:08 53%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-08  8:08 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > I'm thinking these shouldn't include angle brackets:
> >
> >   "m": "<20210228122528.18552-2-e@80x24.org>",
> >   "refs": ["<20210228122528.18552-1-e@80x24.org>"],
> >
> > Using angle brackets on the command-line requires quoting to
> > disambiguate against redirects, so it's a pain.  Leaving the
> > brackets in still works because of how Xapian's query parser
> > works, not because of anything we do on our end.
> 
> I think it'd be nice to drop the brackets from a noise perspective too.

Yes, we don't include them in name email address pairs, either.

> Also, does m: work with brackets?  Trying it out with a recent message
> ID:
> 
>   $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
>     '20210304203352.pd5mcg5pw4u2epzl@pengutronix.de'
>   {"blob":"87304c8a8cae8ce400443b56309427aeee601505",...}
> 
>   $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
>     m:'20210304203352.pd5mcg5pw4u2epzl@pengutronix.de'
>   {"blob":"87304c8a8cae8ce400443b56309427aeee601505",...}
> 
>   $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
>     '<20210304203352.pd5mcg5pw4u2epzl@pengutronix.de>'
>   {"blob":"87304c8a8cae8ce400443b56309427aeee601505",...}
> 
>   $ lei q -q -I https://public-inbox.org/meta/ -f ldjson \
>     m:'<20210304203352.pd5mcg5pw4u2epzl@pengutronix.de>'
>   # no results

Odd, I'm not sure about that one...  It's probably something
the Xapian query parser is doing internally and nothing on
our end...

> > Since the actual headers are "Message-ID" and "References", (and
> > not "m" or "refs"), I think it's clear that we don't have to
> > match the raw mail contents exactly.  We RFC 2047 decode
> > "f|t|c|s" fields anyways instead of showing the raw values,
> > so more precedent for leaving out <>.
> 
> Fwiw I don't think leaving out the brackets would be a source of
> confusion.

Agreed.  And I'm now wondering if we should start indexing
"References:" to be a searchable header with the "refs:" prefix
(but also wary about increasing disk space usage as a result...)

In any case, this denoises the output a bit:
------------8<----------
Subject: [PATCH] lei q: remove angle brackets around Message-IDs

They're unnecessary visual noise, and angle brackets don't
always work as intended when going through Xapian's query
parser.

Since we already use "m:" and "refs:" instead of the actual
header names, it should be obvious we're at liberty to
abbreviate such things.

Link: https://public-inbox.org/meta/20210304184348.GA19350@dcvr/
---
 lib/PublicInbox/LeiOverview.pm | 5 ++---
 t/lei-externals.t              | 2 +-
 t/lei-q-thread.t               | 8 ++++----
 xt/net_writer-imap.t           | 2 +-
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 4db1d8c8..01556273 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -141,17 +141,16 @@ sub _unbless_smsg {
 	$smsg->{dt} = _iso8601(delete $smsg->{ds}); # JMAP UTCDate
 	$smsg->{pct} = get_pct($mitem) if $mitem;
 	if (my $r = delete $smsg->{references}) {
-		$smsg->{refs} = [ map { "<$_>" } ($r =~ m/$MID_EXTRACT/go) ];
+		$smsg->{refs} = [ map { $_ } ($r =~ m/$MID_EXTRACT/go) ];
 	}
 	if (my $m = delete($smsg->{mid})) {
-		$smsg->{'m'} = "<$m>";
+		$smsg->{'m'} = $m;
 	}
 	for my $f (qw(from to cc)) {
 		my $v = delete $smsg->{$f} or next;
 		$smsg->{substr($f, 0, 1)} = pairs($v);
 	}
 	$smsg->{'s'} = delete $smsg->{subject};
-	# can we be bothered to parse From/To/Cc into arrays?
 	scalar { %$smsg }; # unbless
 }
 
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 29667640..2a92d101 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -25,7 +25,7 @@ SKIP: {
 	lei_ok(@cmd, \"query $url");
 	is($lei_err, '', "no errors on $url");
 	my $res = json_utf8->decode($lei_out);
-	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url") or
+	is($res->[0]->{'m'}, $mid, "got expected mid from $url") or
 		skip 'further remote tests', 1;
 	lei_ok(@cmd, 'd:..20101002', \'no results, no error');
 	is($lei_err, '', 'no output on 404, matching local FS behavior');
diff --git a/t/lei-q-thread.t b/t/lei-q-thread.t
index 28c639f5..e24fb2cb 100644
--- a/t/lei-q-thread.t
+++ b/t/lei-q-thread.t
@@ -27,9 +27,9 @@ test_lei(sub {
 	is(scalar(@$res), 3, 'got 2 results');
 	pop @$res;
 	my %m = map { $_->{'m'} => $_ } @$res;
-	is_deeply($m{'<testmessage@example.com>'}->{kw}, ['seen'],
+	is_deeply($m{'testmessage@example.com'}->{kw}, ['seen'],
 		'flag set in direct hit');
-	'TODO' or is_deeply($m{'<a-reply@miss>'}->{kw}, ['draft'],
+	'TODO' or is_deeply($m{'a-reply@miss'}->{kw}, ['draft'],
 		'flag set in thread hit');
 
 	lei_ok qw(q -t -t m:testmessage@example.com);
@@ -37,9 +37,9 @@ test_lei(sub {
 	is(scalar(@$res), 3, 'got 2 results with -t -t');
 	pop @$res;
 	%m = map { $_->{'m'} => $_ } @$res;
-	is_deeply($m{'<testmessage@example.com>'}->{kw}, ['flagged', 'seen'],
+	is_deeply($m{'testmessage@example.com'}->{kw}, ['flagged', 'seen'],
 		'flagged set in direct hit');
-	'TODO' or is_deeply($m{'<testmessage@example.com>'}->{kw}, ['draft'],
+	'TODO' or is_deeply($m{'testmessage@example.com'}->{kw}, ['draft'],
 		'flagged set in direct hit');
 	lei_ok qw(q -tt m:testmessage@example.com --only), "$ro_home/t2";
 	$res = json_utf8->decode($lei_out);
diff --git a/xt/net_writer-imap.t b/xt/net_writer-imap.t
index c24fa993..3631d932 100644
--- a/xt/net_writer-imap.t
+++ b/xt/net_writer-imap.t
@@ -168,7 +168,7 @@ test_lei(sub {
 	is_deeply($empty, [], 'clobbered folder');
 	lei_ok qw(q -o /dev/stdout m:testmessage@example.com --no-external);
 	$res = json_utf8->decode($lei_out)->[0];
-	is_deeply([@$res{qw(m kw)}], ['<testmessage@example.com>', ['seen']],
+	is_deeply([@$res{qw(m kw)}], ['testmessage@example.com', ['seen']],
 		'kw set');
 });
 

^ permalink raw reply related	[relevance 53%]

* Re: release timelines (-extindex, JMAP, lei)
  2021-03-05 22:20 64% release timelines (-extindex, JMAP, lei) Eric Wong
  2021-03-06 18:31 71% ` Kyle Meyer
@ 2021-03-08 21:33 71% ` Konstantin Ryabitsev
  2021-03-08 22:16 71%   ` Eric Wong
  1 sibling, 1 reply; 200+ results
From: Konstantin Ryabitsev @ 2021-03-08 21:33 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Fri, Mar 05, 2021 at 10:20:19PM +0000, Eric Wong wrote:
> So I think the -extindex stuff might be stable and suitable for
> general consumption.  The HTML WWW UI around -extindex has some
> rough edges but nothing that would take too much effort to fix.

\o/

> But I'm deeply worried about unleashing a new on-disk format
> that's insufficient and being stuck supporting it forever
> (as I am with v1 inboxes)...
> 
> * JMAP is going to be more effort, but I think our current
>   on-disk data model is OK or at least extensible enough for it.
>   I might delay JMAP until 1.8.
> 
>   JMAP will be significantly less effort than inventing something
>   new (and one-off) with GraphQL or REST.  And it would be less
>   effort for client authors, as well; since client code can be
>   used for non-public-inbox servers, too.

I don't really have any specific opinion on this matter. I don't really know
of any other provider outside of Fastmail that uses JMAP, but it *is* a
published IETF standard around RFC2822 messages, so it makes sense to use it
for this purpose.

> * lei saved searches can probably be done quickly for 1.7,
>   but without full keywords support for externals...

I was wondering if lei should be part of the same suite as public-inbox proper
as opposed to a standalone (or interdependent) client tool. Unless I'm
mistaken, someone running a public-inbox origin or mirror server wouldn't
necessarily be an active lei user. Decoupling them from each other would allow
different release cadences, no?

> In any case, I need to take a few days away to clear my head.

Please take care!

Thanks for your effort on this project.

Best,
-K

^ permalink raw reply	[relevance 71%]

* Re: release timelines (-extindex, JMAP, lei)
  2021-03-08 21:33 71% ` Konstantin Ryabitsev
@ 2021-03-08 22:16 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-08 22:16 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Fri, Mar 05, 2021 at 10:20:19PM +0000, Eric Wong wrote:
> > * lei saved searches can probably be done quickly for 1.7,
> >   but without full keywords support for externals...
> 
> I was wondering if lei should be part of the same suite as public-inbox proper
> as opposed to a standalone (or interdependent) client tool. Unless I'm
> mistaken, someone running a public-inbox origin or mirror server wouldn't
> necessarily be an active lei user. Decoupling them from each other would allow
> different release cadences, no?

I've considered it, yes, but there's a lot of shared code and
writing release notes/planning/packaging for one project is
challenging enough.  As one might imagine, the non-coding
aspects of the project are the least interesting to me so I try
to avoid it as much as possible.

And I'm planning on sharing more code based on the work I've
done so far around IPC, too.

> > In any case, I need to take a few days away to clear my head.
> 
> Please take care!
> 
> Thanks for your effort on this project.

You're welcome.  Should be back to coding in a day or so,
still dealing with leaks of the non-memory variety :x

^ permalink raw reply	[relevance 71%]

* [PATCH 4/5] lei import: skip trashed Maildir messages
    2021-03-10 13:23 67% ` [PATCH 3/5] lei import: simplify Maildir handling Eric Wong
@ 2021-03-10 13:23 68% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

This matches IMAP behavior in NetReader in skipping \\Deleted
messages.  Since lei may be used for personal, non-public mail,
Draft messages are NOT skipped by "lei import".
---
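A rough sketch of the flag handling, for readers unfamiliar with
Maildir basenames.  It is not the MdirReader code: basename_flags and
keywords_for are hypothetical names, and the flag-to-keyword table is
an assumption based on the usual Maildir info letters:

```perl
use strict;
use warnings;

# Assumed Maildir flag => keyword mapping; "T" (trashed) has no keyword.
my %c2kw = (D => 'draft', F => 'flagged', P => 'forwarded',
	R => 'answered', S => 'seen');

# Hypothetical helper: returns the flag letters of a Maildir basename,
# '' when there is no info suffix at all, or undef for other suffixes.
sub basename_flags {
	my ($bn) = @_;
	return $1 if $bn =~ /:2,([A-Za-z]*)\z/;
	index($bn, ':') < 0 ? '' : undef;
}

# Returns a sorted arrayref of keywords, or undef if the message
# should be skipped (trashed or unparseable basename).
sub keywords_for {
	my ($bn) = @_;
	my $fl = basename_flags($bn) // return;
	return if index($fl, 'T') >= 0; # trashed, like IMAP \Deleted
	[ sort(map { $c2kw{$_} // () } split(//, $fl)) ];
}

for my $bn ('x:2,RS', 'u:2,ST') {
	my $kw = keywords_for($bn);
	print "$bn => ", ($kw ? "@$kw" : 'skipped'), "\n";
}
```

Note that "D" (draft) still maps to a keyword and is not skipped,
matching the commit message above.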
 lib/PublicInbox/MdirReader.pm | 1 +
 t/lei-import-maildir.t        | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index 44724af1..06806e80 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -57,6 +57,7 @@ sub maildir_each_eml ($$;@) {
 	opendir my $dh, $pfx or return;
 	while (defined(my $bn = readdir($dh))) {
 		my $fl = maildir_basename_flags($bn) // next;
+		next if index($fl, 'T') >= 0;
 		my $f = $pfx.$bn;
 		my $eml = eml_from_path($f) or next;
 		my @kw = sort(map { $c2kw{$_} // () } split(//, $fl));
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index a3796491..bd89677a 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -29,5 +29,12 @@ test_lei(sub {
 	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
 	is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
 	is($res->[1], undef, 'only got one result');
+
+	symlink(abs_path('t/utf8.eml'), "$md/cur/u:2,ST") or
+		BAIL_OUT "symlink $md $!";
+	lei_ok('import', "maildir:$md", \'import Maildir w/ trashed message');
+	lei_ok(qw(q -d none m:testmessage@example.com));
+	$res = json_utf8->decode($lei_out);
+	is_deeply($res, [ undef ], 'trashed message not imported');
 });
 done_testing;

^ permalink raw reply related	[relevance 68%]

* [PATCH 3/5] lei import: simplify Maildir handling
  @ 2021-03-10 13:23 67% ` Eric Wong
  2021-03-10 13:23 68% ` [PATCH 4/5] lei import: skip trashed Maildir messages Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

Having one-off Maildir functionality in LeiStore doesn't seem
worth the maintenance burden, especially given an upcoming
change to skip trashed messages.

I expect this will hurt performance slightly with extra IPC
overhead for the socket copy, but "lei import" may eventually
become rare or at least not hit messages redundantly.
---
 lib/PublicInbox/LeiImport.pm | 8 ++++----
 lib/PublicInbox/LeiStore.pm  | 6 ------
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 23cecd53..815788b3 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -147,9 +147,9 @@ error reading $input: $!
 	$lei->child_error(1 << 8, "$input: $@") if $@;
 }
 
-sub _import_maildir { # maildir_each_file cb
-	my ($f, $sto, $set_kw) = @_;
-	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
+sub _import_maildir { # maildir_each_eml cb
+	my ($f, $kw, $eml, $sto, $set_kw) = @_;
+	$sto->ipc_do('set_eml', $eml, $set_kw ? @$kw : ());
 }
 
 sub _import_net { # imap_each, nntp_each cb
@@ -181,7 +181,7 @@ sub import_path_url {
 		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
 $input appears to a be a maildir, not $ifmt
 EOM
-		PublicInbox::MdirReader::maildir_each_file($input,
+		PublicInbox::MdirReader::maildir_each_eml($input,
 					\&_import_maildir,
 					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 92c29100..6ace2ad1 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -213,12 +213,6 @@ sub set_eml {
 	add_eml($self, $eml, @kw) // set_eml_keywords($self, $eml, @kw);
 }
 
-sub set_eml_from_maildir {
-	my ($self, $f, $set_kw) = @_;
-	my $eml = eml_from_path($f) or return;
-	set_eml($self, $eml, $set_kw ? maildir_keywords($f) : ());
-}
-
 sub checkpoint {
 	my ($self, $wait) = @_;
 	if (my $im = $self->{im}) {

^ permalink raw reply related	[relevance 67%]

* final "null" in "lei q" JSON output
@ 2021-03-11 22:43 71% Eric Wong
  2021-03-12  4:22 71% ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-03-11 22:43 UTC (permalink / raw)
  To: meta

The standard JSON (not line-delimited) output is an array with
a "null" element as the last element.

I don't really like it, but it seems necessary if we're
doing parallel writes to stdout and JSON doesn't allow
trailing commas.

So I don't think that part is considered part of the stable
output.
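
To illustrate the constraint, here is a minimal sketch.  It is not
lei's actual writer: emit_elements and json_with_sentinel are
hypothetical names, and the "parallel" writers are simulated
sequentially:

```perl
use strict;
use warnings;
use JSON::PP;

# Each "writer" appends complete elements followed by a comma, so no
# writer needs to know whether it is emitting the last element.
sub emit_elements {
	my ($out, @items) = @_;
	my $json = JSON::PP->new->canonical;
	$$out .= $json->encode($_) . ",\n" for @items;
}

# The coordinator opens the array before the writers run and closes
# it with a literal null, which absorbs the final trailing comma.
sub json_with_sentinel {
	my (@batches) = @_;
	my $out = "[\n";
	emit_elements(\$out, @$_) for @batches;
	$out . "null]\n";
}

my $doc = json_with_sentinel([ { 'm' => 'a@example.com' } ],
				[ { 'm' => 'b@example.com' } ]);
my $res = JSON::PP->new->decode($doc);
pop @$res; # drop the sentinel before looking at results
print scalar(@$res), " results\n";
```

A consumer only has to pop the last element, which is what the
tests do today.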

Maybe it could be tweaked to show stats, or something else,
I don't know.

Thoughts?

^ permalink raw reply	[relevance 71%]

* Re: final "null" in "lei q" JSON output
  2021-03-11 22:43 71% final "null" in "lei q" JSON output Eric Wong
@ 2021-03-12  4:22 71% ` Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-03-12  4:22 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> The standard JSON (not line-delimited) output is an array with
> a "null" element as the last element.
>
> I don't really like it, but it seems necessary if we're
> doing parallel writes to stdout and JSON doesn't allow
> trailing commas.

I dunno, it doesn't really bother me, and it seems like a fine solution
to the problem.

> So I don't think that part is considered part of the stable
> output.

I think that's fair enough.

> Maybe it could be tweaked to show stats, or something else,
> I don't know.

Nothing too useful comes to mind, but if someone does think of
something, substituting it for the trailing null makes sense to me.
Until then, my vote would be to leave the helpful creature be :)

^ permalink raw reply	[relevance 71%]

* [PATCH 1/3] lei: add help + completion for --no-external
  2021-03-12 10:39 71% [PATCH 0/3] lei CLI option updates Eric Wong
@ 2021-03-12 10:39 71% ` Eric Wong
  2021-03-12 10:39 51% ` [PATCH 2/3] lei: rearrange OPT_DESC and drop some TBD switches Eric Wong
  2021-03-12 10:39 62% ` [PATCH 3/3] lei q: mbox*: disable changing parallelism, add --rsyncable Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-12 10:39 UTC (permalink / raw)
  To: meta

I just needed it.
---
 lib/PublicInbox/LEI.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 50c0a885..ddc27361 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -228,6 +228,7 @@ my %OPTDESC = (
 'globoff|g' => "do not match locations using '*?' wildcards ".
 		"and\xa0'[]'\x{a0}ranges",
 'verbose|v+' => 'be more verbose',
+'external!' => 'do not use externals',
 'solve!' => 'do not attempt to reconstruct blobs from emails',
 'torsocks=s' => ['VAL|auto|no|yes',
 		'whether or not to wrap git and curl commands with torsocks'],

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/3] lei CLI option updates
@ 2021-03-12 10:39 71% Eric Wong
  2021-03-12 10:39 71% ` [PATCH 1/3] lei: add help + completion for --no-external Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2021-03-12 10:39 UTC (permalink / raw)
  To: meta

Mentally, I still use "--no-externals" (plural),
but I'm not sure if that's necessary to support with
completion, now...

Eric Wong (3):
  lei: add help + completion for --no-external
  lei: rearrange OPT_DESC and drop some TBD switches
  lei q: mbox*: disable changing parallelism, add --rsyncable

 lib/PublicInbox/LEI.pm       | 41 +++++++++++++++---------------------
 lib/PublicInbox/LeiHelp.pm   |  2 +-
 lib/PublicInbox/LeiToMail.pm |  6 +++---
 3 files changed, 21 insertions(+), 28 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 3/3] lei q: mbox*: disable changing parallelism, add --rsyncable
  2021-03-12 10:39 71% [PATCH 0/3] lei CLI option updates Eric Wong
  2021-03-12 10:39 71% ` [PATCH 1/3] lei: add help + completion for --no-external Eric Wong
  2021-03-12 10:39 51% ` [PATCH 2/3] lei: rearrange OPT_DESC and drop some TBD switches Eric Wong
@ 2021-03-12 10:39 62% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-12 10:39 UTC (permalink / raw)
  To: meta

Unfortunately, being mairix-compatible with --threads means we
can't change thread-count of gzip, bzip2, or xz when writing to
compressed mbox with a --threads= parameter.  It's probably not
worth changing, anyways, so another switch or additional value
for --jobs= won't be added.

While we're in the area, add --rsyncable support since
most installations of gzip support it nowadays.

Fixes: 5beb4a5f6585acd ("lei: replace --thread with --threads")
---
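For illustration only, a simplified sketch of mapping an mbox suffix
to a compressor command line.  The table and compress_argv are
hypothetical and much smaller than %zsfx2cmd; the only switch assumed
here is --rsyncable, and no thread-count switch is ever passed:

```perl
use strict;
use warnings;

# Hypothetical table: suffix => [ command, { lei option => switch } ]
# where a '' switch means "same name as the lei option".
my %zsfx = (
	gz => [ 'gzip', { rsyncable => '' } ],
	bz2 => [ 'bzip2', {} ],
	xz => [ 'xz', {} ],
);

# Build an argv (arrayref) which compresses stdin to stdout, or
# return undef for an unknown suffix.  $opt holds boolean options.
sub compress_argv {
	my ($sfx, $opt) = @_;
	my $spec = $zsfx{$sfx} or return;
	my ($exe, $cmd_opt) = @$spec;
	my @cmd = ($exe, '-c');
	for my $bool (sort keys %$cmd_opt) {
		next unless $opt->{$bool};
		push @cmd, '--'.($cmd_opt->{$bool} || $bool);
	}
	\@cmd;
}

print join(' ', @{compress_argv('gz', { rsyncable => 1 })}), "\n";
print join(' ', @{compress_argv('xz', { rsyncable => 1 })}), "\n";
```

Options the compressor does not list are silently ignored, so
--rsyncable with an .xz output is harmless in this sketch.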
 lib/PublicInbox/LEI.pm       | 2 +-
 lib/PublicInbox/LeiToMail.pm | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9bf60ad4..59a3338c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -113,7 +113,7 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g augment|a
-	import-remote! import-before! lock=s@
+	import-remote! import-before! lock=s@ rsyncable
 	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 13b4f672..13764d79 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -170,9 +170,9 @@ sub reap_compress { # dwaitpid callback
 # { foo => '' } means "--foo" is passed to the command-line,
 # otherwise { foo => '--bar' } passes "--bar"
 our %zsfx2cmd = (
-	gz => [ qw(GZIP pigz gzip), { rsyncable => '', threads => '-p' } ],
+	gz => [ qw(GZIP pigz gzip), { rsyncable => '' } ],
 	bz2 => [ 'bzip2', {} ],
-	xz => [ 'xz', { threads => '-T' } ],
+	xz => [ 'xz', {} ],
 	# XXX does anybody care for these?  I prefer zstd on entire FSes,
 	# so it's probably not necessary on a per-file basis
 	# zst => [ 'zstd', { -default => [ qw(-q) ], # it's noisy by default
@@ -203,7 +203,7 @@ sub zsfx2cmd ($$$) {
 		my $switch = $cmd_opt->{rsyncable} // next;
 		push @cmd, '--'.($switch || $bool);
 	}
-	for my $key (qw(threads)) { # support compression level?
+	for my $key (qw(rsyncable)) { # support compression level?
 		my $switch = $cmd_opt->{$key} // next;
 		my $val = $lei->{opt}->{$key} // next;
 		push @cmd, $switch, $val;

^ permalink raw reply related	[relevance 62%]

* [PATCH 2/3] lei: rearrange OPT_DESC and drop some TBD switches
  2021-03-12 10:39 71% [PATCH 0/3] lei CLI option updates Eric Wong
  2021-03-12 10:39 71% ` [PATCH 1/3] lei: add help + completion for --no-external Eric Wong
@ 2021-03-12 10:39 51% ` Eric Wong
  2021-03-12 10:39 62% ` [PATCH 3/3] lei q: mbox*: disable changing parallelism, add --rsyncable Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-12 10:39 UTC (permalink / raw)
  To: meta

It'll be easier for us to have the option-spec in front of the
command instead of the other way around.  The option-spec in
front makes it easier to sort and keep track of potentially
confusing/ambiguous use of command-line switches between
different commands.

We'll also update some of the proposed switches while we're
at it.
---
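A minimal sketch of the lookup order with the option-spec first.
The table contents are made up for the example and opt_desc is a
hypothetical helper, not a function in LEI.pm:

```perl
use strict;
use warnings;

# Hypothetical excerpt: "SPEC\tCOMMAND" keys override plain "SPEC" keys.
my %OPTDESC = (
	'format|f=s' => 'generic output format',
	"format|f=s\tq" => 'output format for search results',
	"jobs|j=i\tadd-external" => 'parallelism when indexing',
	'threads|t+' => 'return all messages in matching threads',
);

# Prefer the command-specific description, else the shared one.
sub opt_desc {
	my ($cmd, $sw) = @_;
	$OPTDESC{"$sw\t$cmd"} // $OPTDESC{$sw};
}

# Sorting the keys now groups every use of a switch together.
print join("\n", map { join(' @ ', split(/\t/, $_)) } sort keys %OPTDESC), "\n";
print opt_desc('q', 'format|f=s'), "\n";
```

With the spec in front, a plain sort puts "format|f=s" and its
per-command variants next to each other, which is what makes
conflicting uses easier to spot.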
 lib/PublicInbox/LEI.pm     | 38 +++++++++++++++-----------------------
 lib/PublicInbox/LeiHelp.pm |  2 +-
 2 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ddc27361..9bf60ad4 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -133,7 +133,7 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(prune quiet|q C=s@) ],
 
 'ls-query' => [ '[FILTER...]', 'list saved search queries',
-		qw(name-only format|f=s z C=s@) ],
+		qw(name-only format|f=s C=s@) ],
 'rm-query' => [ 'QUERY_NAME', 'remove a saved search', qw(C=s@) ],
 'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search', qw(C=s@) ],
 
@@ -240,8 +240,7 @@ my %OPTDESC = (
 
 'dedupe|d=s' => ['STRATEGY|content|oid|mid|none',
 		'deduplication strategy'],
-'show	threads|t' => 'display entire thread a message belongs to',
-'q	threads|t+' =>
+'threads|t+' =>
 	'return all messages in the same threads as the actual match(es)',
 
 'want|w=s@' => [ 'PREFIX|dfpost|dfn', # common ones in help...
@@ -260,17 +259,11 @@ my %OPTDESC = (
 'mua=s' => [ 'CMD',
 	"MUA to run on --output Maildir or mbox (e.g.\xa0`mutt\xa0-f\xa0%f')" ],
 
-'show	format|f=s' => [ 'OUT|plain|raw|html|mboxrd|mboxcl2|mboxcl',
-			'message/object output format' ],
-'mark	format|f=s' => $stdin_formats,
-'forget	format|f=s' => $stdin_formats,
-
-'add-external	inbox-version=i' => [ 'NUM|1|2',
+'inbox-version=i' => [ 'NUM|1|2',
 		'force a public-inbox version with --mirror'],
-'add-external	mirror=s' => [ 'URL', 'mirror a public-inbox'],
+'mirror=s' => [ 'URL', 'mirror a public-inbox'],
 
 # public-inbox-index options
-'add-external	jobs|j=i' => 'set parallelism when indexing after --mirror',
 'fsync!' => 'speed up indexing after --mirror, risk index corruption',
 'compact' => 'run compact index after mirroring',
 'indexlevel|L=s' => [ 'LEVEL|full|medium|basic',
@@ -284,23 +277,22 @@ my %OPTDESC = (
 'skip-docdata' =>
 	'drop compatibility w/ public-inbox <1.6 to save ~1.5% space',
 
-'q	format|f=s' => [
+'format|f=s	q' => [
 	'OUT|maildir|mboxrd|mboxcl2|mboxcl|mboxo|html|json|jsonl|concatjson',
 		'specify output format, default depends on --output'],
-'q	exclude=s@' => [ 'LOCATION',
+'exclude=s@	q' => [ 'LOCATION',
 		'exclude specified external(s) from search' ],
-'q	include|I=s@' => [ 'LOCATION',
+'include|I=s@	q' => [ 'LOCATION',
 		'include specified external(s) in search' ],
-'q	only=s@' => [ 'LOCATION',
+'only=s@	q' => [ 'LOCATION',
 		'only use specified external(s) for search' ],
-
-'q	jobs=s'	=> [ '[SEARCH_JOBS][,WRITER_JOBS]',
+'jobs=s	q' => [ '[SEARCH_JOBS][,WRITER_JOBS]',
 		'control number of search and writer jobs' ],
+'jobs|j=i	add-external' => 'set parallelism when indexing after --mirror',
 
-'import format|f=s' => $stdin_formats,
-
-'ls-query	format|f=s' => $ls_format,
-'ls-external	format|f=s' => $ls_format,
+'in-format|F=s' => $stdin_formats,
+'format|f=s	ls-query' => $ls_format,
+'format|f=s	ls-external' => $ls_format,
 
 'limit|n=i@' => ['NUM', 'limit on number of matches (default: 10000)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
@@ -770,7 +762,7 @@ sub lei__complete {
 				my $x = length > 1 ? "--$_" : "-$_";
 				$x eq $cur ? () : $x;
 			} grep(!/_/, split(/\|/, $_, -1)) # help|h
-		} grep { $OPTDESC{"$cmd\t$_"} || $OPTDESC{$_} } @spec);
+		} grep { $OPTDESC{"$_\t$cmd"} || $OPTDESC{$_} } @spec);
 	} elsif ($cmd eq 'config' && !@argv && !$CONFIG_KEYS{$cur}) {
 		puts $self, grep(/$re/, keys %CONFIG_KEYS);
 	}
@@ -785,7 +777,7 @@ sub lei__complete {
 			# (TODO: completion for external paths)
 			shift(@v) if uc($v[0]) eq $v[0];
 			@v;
-		} grep(/\A(?:$cmd\t|)(?:[\w-]+\|)*$opt\b/, keys %OPTDESC);
+		} grep(/\A(?:[\w-]+\|)*$opt\b.*?(?:\t$cmd)?\z/, keys %OPTDESC);
 	}
 	$cmd =~ tr/-/_/;
 	if (my $sub = $self->can("_complete_$cmd")) {
diff --git a/lib/PublicInbox/LeiHelp.pm b/lib/PublicInbox/LeiHelp.pm
index a654e1c2..be31c2a8 100644
--- a/lib/PublicInbox/LeiHelp.pm
+++ b/lib/PublicInbox/LeiHelp.pm
@@ -20,7 +20,7 @@ sub call {
 	my @opt_desc;
 	my $lpad = 2;
 	for my $sw (grep { !ref } @info) { # ("prio=s", "z", $GLP_PASS)
-		my $desc = $OPTDESC->{"$cmd\t$sw"} // $OPTDESC->{$sw} // next;
+		my $desc = $OPTDESC->{"$sw\t$cmd"} // $OPTDESC->{$sw} // next;
 		my $arg_vals = '';
 		($arg_vals, $desc) = @$desc if ref($desc) eq 'ARRAY';
 

^ permalink raw reply related	[relevance 51%]

* [PATCH] lei q: do not import unnecessarily from externals
@ 2021-03-14 11:12 40% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-14 11:12 UTC (permalink / raw)
  To: meta

We only want to auto import messages that are exclusively in
remote externals.  Messages in local externals are not
auto-imported to save space and reduce wear on storage devices.
---
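The decision can be sketched roughly as below.  This is a toy model,
not the LeiStore code: the store and the local externals are plain
hashes, and chash() hashes the whole raw message, whereas the real
content_hash only covers selected headers and the body:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Stand-in for content_hash(); hashing the entire raw message is an
# assumption made to keep the sketch short.
sub chash { sha256_hex($_[0]) }

# $locals: arrayref of { content hash => docid } indexes, one per
# local external.  Returns the hash of the message if it was added to
# $store, or undef if a local external already has it.
sub add_eml_maybe {
	my ($store, $locals, $raw) = @_;
	my $h = chash($raw);
	for my $idx (@$locals) {
		return if exists $idx->{$h}; # already on local storage
	}
	$store->{$h} //= $raw;
	$h;
}

my %store;
my $known = "Message-ID: <a\@example.com>\n\nhello\n";
my %idx = (chash($known) => 1);
add_eml_maybe(\%store, [ \%idx ], $known);
add_eml_maybe(\%store, [ \%idx ], "Message-ID: <b\@example.com>\n\nnew\n");
print scalar(keys %store), " message(s) imported\n";
```

Only the message missing from every local external ends up in the
store, which is the space-saving behavior described above.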
 lib/PublicInbox/LeiSearch.pm  | 37 ++++++++++++++++---------
 lib/PublicInbox/LeiStore.pm   | 52 +++++++++++++++++++++++++++++++----
 lib/PublicInbox/LeiToMail.pm  |  2 +-
 lib/PublicInbox/LeiXSearch.pm | 10 ++++++-
 t/lei-q-remote-import.t       | 45 +++++++++++++++++++++++++++++-
 5 files changed, 124 insertions(+), 22 deletions(-)

diff --git a/lib/PublicInbox/LeiSearch.pm b/lib/PublicInbox/LeiSearch.pm
index ceb3624b..2e3f10fd 100644
--- a/lib/PublicInbox/LeiSearch.pm
+++ b/lib/PublicInbox/LeiSearch.pm
@@ -44,29 +44,40 @@ sub content_key ($) {
 
 sub _cmp_1st { # git->cat_async callback
 	my ($bref, $oid, $type, $size, $cmp) = @_; # cmp: [chash, found, smsg]
-	return if defined($cmp->[1]->[0]); # $found->[0]
 	if (content_hash(PublicInbox::Eml->new($bref)) eq $cmp->[0]) {
-		push @{$cmp->[1]}, $cmp->[2]->{num};
+		$cmp->[1]->{$oid} = $cmp->[2]->{num};
 	}
 }
 
-# returns true if $eml is indexed by lei/store and keywords don't match
-sub kw_changed {
-	my ($self, $eml, $new_kw_sorted) = @_;
+sub xids_for { # returns { OID => docid } mapping for $eml matches
+	my ($self, $eml, $min) = @_;
 	my ($chash, $mids) = content_key($eml);
-	my $over = $self->over;
+	my @overs = ($self->over // $self->overs_all);
 	my $git = $self->git;
-	my $found = [];
+	my $found = {};
 	for my $mid (@$mids) {
-		my ($id, $prev);
-		while (my $cur = $over->next_by_mid($mid, \$id, \$prev)) {
-			$git->cat_async($cur->{blob}, \&_cmp_1st,
-					[ $chash, $found, $cur ]);
-			last if scalar(@$found);
+		for my $o (@overs) {
+			my ($id, $prev);
+			while (my $cur = $o->next_by_mid($mid, \$id, \$prev)) {
+				next if $found->{$cur->{blob}};
+				$git->cat_async($cur->{blob}, \&_cmp_1st,
+						[ $chash, $found, $cur ]);
+				if ($min && scalar(keys %$found) >= $min) {
+					$git->cat_async_wait;
+					return $found;
+				}
+			}
 		}
 	}
 	$git->cat_async_wait;
-	my $num = $found->[0] // return;
+	scalar(keys %$found) ? $found : undef;
+}
+
+# returns true if $eml is indexed by lei/store and keywords don't match
+sub kw_changed {
+	my ($self, $eml, $new_kw_sorted) = @_;
+	my $found = xids_for($self, $eml, 1) // return;
+	my ($num) = values %$found;
 	my @cur_kw = msg_keywords($self, $num);
 	join("\0", @$new_kw_sorted) eq join("\0", @cur_kw) ? 0 : 1;
 }
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 6ace2ad1..aaee5874 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -213,6 +213,24 @@ sub set_eml {
 	add_eml($self, $eml, @kw) // set_eml_keywords($self, $eml, @kw);
 }
 
+sub add_eml_maybe {
+	my ($self, $eml) = @_;
+	my $lxs = $self->{lxs_all_local} // die 'BUG: no {lxs_all_local}';
+	return if $lxs->xids_for($eml, 1);
+	add_eml($self, $eml);
+}
+
+# set or update keywords for external message, called via ipc_do
+sub set_xkw {
+	my ($self, $eml, $kw) = @_;
+	my $lxs = $self->{lxs_all_local} // die 'BUG: no {lxs_all_local}';
+	if ($lxs->xids_for($eml, 1)) { # is it in a local external?
+		# TODO: index keywords only
+	} else {
+		set_eml($self, $eml, @$kw);
+	}
+}
+
 sub checkpoint {
 	my ($self, $wait) = @_;
 	if (my $im = $self->{im}) {
@@ -237,18 +255,40 @@ sub done {
 
 sub ipc_atfork_child {
 	my ($self) = @_;
-	my $lei = delete $self->{lei};
+	my $lei = $self->{lei};
 	$lei->lei_atfork_child(1) if $lei;
 	$self->SUPER::ipc_atfork_child;
 }
 
+sub refresh_local_externals {
+	my ($self) = @_;
+	my $cfg = $self->{lei}->_lei_cfg or return;
+	my $cur_cfg = $self->{cur_cfg} // -1;
+	my $lxs = $self->{lxs_all_local};
+	if ($cfg != $cur_cfg || !$lxs) {
+		$lxs = PublicInbox::LeiXSearch->new;
+		my @loc = $self->{lei}->externals_each;
+		for my $loc (@loc) { # locals only
+			$lxs->prepare_external($loc) if -d $loc;
+		}
+		$self->{lxs_all_local} = $lxs;
+		$self->{cur_cfg} = $cfg;
+	}
+	($lxs->{git_tmp} //= $lxs->git_tmp)->{git_dir};
+}
+
 sub write_prepare {
 	my ($self, $lei) = @_;
-	$self->ipc_lock_init;
-	# Mail we import into lei are private, so headers filtered out
-	# by -mda for public mail are not appropriate
-	local @PublicInbox::MDA::BAD_HEADERS = ();
-	$self->ipc_worker_spawn('lei_store', $lei->oldset, { lei => $lei });
+	unless ($self->{-ipc_req}) {
+		require PublicInbox::LeiXSearch;
+		$self->ipc_lock_init;
+		# Mail we import into lei are private, so headers filtered out
+		# by -mda for public mail are not appropriate
+		local @PublicInbox::MDA::BAD_HEADERS = ();
+		$self->ipc_worker_spawn('lei_store', $lei->oldset,
+					{ lei => $lei });
+	}
+	$lei->{all_ext_git_dir} = $self->ipc_do('refresh_local_externals');
 	$lei->{sto} = $self;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 13764d79..587804bb 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -279,7 +279,7 @@ sub update_kw_maybe ($$$$) {
 	if ($x) {
 		$lei->{sto}->ipc_do('set_eml', $eml, @$kw);
 	} elsif (!defined($x)) {
-		# TODO: xkw
+		$lei->{sto}->ipc_do('set_xkw', $eml, $kw);
 	}
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index f2c8c02e..22c8026c 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -97,6 +97,11 @@ sub recent {
 
 sub over {}
 
+sub overs_all { # for xids_for
+	my ($self) = @_;
+	grep(defined, map { $_->over } locals($self))
+}
+
 sub _mset_more ($$) {
 	my ($mset, $mo) = @_;
 	my $size = $mset->size;
@@ -204,7 +209,9 @@ sub query_mset { # non-parallel for non-"--threads" users
 
 sub each_remote_eml { # callback for MboxReader->mboxrd
 	my ($eml, $self, $lei, $each_smsg) = @_;
-	$lei->{sto}->ipc_do('add_eml', $eml) if $lei->{opt}->{'import-remote'};
+	if (my $sto = $self->{import_sto}) {
+		$sto->ipc_do('add_eml_maybe', $eml);
+	}
 	my $smsg = bless {}, 'PublicInbox::Smsg';
 	$smsg->populate($eml);
 	$smsg->parse_references($eml, mids($eml));
@@ -249,6 +256,7 @@ sub query_remote_mboxrd {
 	my $curl = PublicInbox::LeiCurl->new($lei, $self->{curl}) or return;
 	push @$curl, '-s', '-d', '';
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
+	$self->{import_sto} = $lei->{sto} if $lei->{opt}->{'import-remote'};
 	for my $uri (@$uris) {
 		$lei->{-current_url} = $uri->as_string;
 		$lei->{-nr_remote_eml} = 0;
diff --git a/t/lei-q-remote-import.t b/t/lei-q-remote-import.t
index 4088b6ad..8b82579c 100644
--- a/t/lei-q-remote-import.t
+++ b/t/lei-q-remote-import.t
@@ -5,6 +5,7 @@ use strict; use v5.10.1; use PublicInbox::TestCommon;
 require_git 2.6;
 require_mods(qw(json DBD::SQLite Search::Xapian));
 use PublicInbox::MboxReader;
+use PublicInbox::InboxWritable;
 my ($ro_home, $cfg_path) = setup_public_inboxes;
 my $sock = tcp_server;
 my ($tmpdir, $for_destroy) = tmpdir;
@@ -36,7 +37,8 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	is_deeply($slurp_emls->($o), $exp1, 'got results after remote search');
 	unlink $o or BAIL_OUT $!;
 	lei_ok(@cmd);
-	ok(-f $o && -s _, 'output exists after import but is not empty');
+	ok(-f $o && -s _, 'output exists after import but is not empty') or
+		diag $lei_err;
 	is_deeply($slurp_emls->($o), $exp1, 'got results w/o remote search');
 	unlink $o or BAIL_OUT $!;
 
@@ -58,5 +60,46 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	unlink "$o.lock" or BAIL_OUT $!;
 	lei_ok(@cmd, '--lock=dotlock,timeout=0.000001',
 		\'succeeds after lock removal');
+
+	# XXX memoize this external creation
+	my $inboxdir = "$ENV{HOME}/tmp_git";
+	my $ibx = PublicInbox::InboxWritable->new({
+		name => 'tmp',
+		-primary_address => 'lei@example.com',
+		inboxdir => $inboxdir,
+		indexlevel => 'medium',
+	}, { nproc => 1 });
+	my $im = $ibx->importer(0);
+	$im->add(eml_load('t/utf8.eml')) or BAIL_OUT '->add';
+	$im->done;
+
+	run_script(['-index', $inboxdir], undef) or BAIL_OUT '-init';
+	lei_ok(qw(add-external -q), $inboxdir);
+	lei_ok(qw(q -o), "mboxrd:$o", '--only', $url,
+		'm:testmessage@example.com');
+	ok(-s $o, 'got result from remote external');
+	my $exp = eml_load('t/utf8.eml');
+	is_deeply($slurp_emls->($o), [$exp], 'got expected result');
+	lei_ok(qw(q --no-external -o), "mboxrd:/dev/stdout",
+			'm:testmessage@example.com');
+	is($lei_out, '', 'message not imported when in local external');
+
+	open $fh, '>', $o or BAIL_OUT;
+	print $fh <<'EOF' or BAIL_OUT;
+From a@z Mon Sep 17 00:00:00 2001
+From: nobody@localhost
+Date: Sat, 13 Mar 2021 18:23:01 +0600
+Message-ID: <never-before-seen@example.com>
+Status: RO
+
+whatever
+EOF
+	close $fh or BAIL_OUT;
+	lei_ok(qw(q -o), "mboxrd:$o", 'm:testmessage@example.com');
+	is_deeply($slurp_emls->($o), [$exp],
+		'got expected result after clobber') or diag $lei_err;
+	lei_ok(qw(q -o mboxrd:/dev/stdout m:never-before-seen@example.com));
+	like($lei_out, qr/seen\@example\.com>\nStatus: OR\n\nwhatever/sm,
+		'--import-before imported totally unseen message');
 });
 done_testing;

^ permalink raw reply related	[relevance 40%]

* [PATCH] lei: reuse LeiStore object on config changes
@ 2021-03-15  9:32 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-15  9:32 UTC (permalink / raw)
  To: meta

Unless leistore.dir changes, the same LeiStore object
should remain reusable and accessible to any clients.

This seems to fix problems with t/lei-q-remote-import.t
occasionally getting stuck.
---
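A standalone sketch of the caching idea, for illustration.
cached_cfg and its $load callback are hypothetical; only the stat
fingerprint and the canonpath comparison mirror the diff below:

```perl
use strict;
use warnings;
use File::Spec;

my %PATH2CFG; # config path => cached config hashref

# $load is a hypothetical callback returning a fresh hashref of
# settings parsed from $f.
sub cached_cfg {
	my ($f, $load) = @_;
	my @st = stat($f);
	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # ctime, size
	my ($sto, $sto_dir);
	if (my $old = $PATH2CFG{$f}) {
		return $old if $cur_st eq $old->{-st}; # file unchanged
		($sto, $sto_dir) = @$old{qw(-lei_store leistore.dir)};
	}
	my $cfg = $load->($f);
	$cfg->{-st} = $cur_st;
	# carry the store object over if it still points at the same dir
	if ($sto && File::Spec->canonpath($sto_dir // '') eq
			File::Spec->canonpath($cfg->{'leistore.dir'} // '')) {
		$cfg->{-lei_store} = $sto;
	}
	$PATH2CFG{$f} = $cfg;
}
```

The config hashref is replaced whenever the file changes, but the
store object survives as long as both the old and new leistore.dir
values canonicalize to the same path.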
 lib/PublicInbox/LEI.pm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 59a3338c..31d5b838 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -606,8 +606,10 @@ sub _lei_cfg ($;$) {
 	my $f = _config_path($self);
 	my @st = stat($f);
 	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # 10:ctime, 7:size
+	my ($sto, $sto_dir);
 	if (my $cfg = $PATH2CFG{$f}) { # reuse existing object in common case
 		return ($self->{cfg} = $cfg) if $cur_st eq $cfg->{-st};
+		($sto, $sto_dir) = @$cfg{qw(-lei_store leistore.dir)};
 	}
 	if (!@st) {
 		unless ($creat) {
@@ -625,6 +627,10 @@ sub _lei_cfg ($;$) {
 	bless $cfg, 'PublicInbox::Config';
 	$cfg->{-st} = $cur_st;
 	$cfg->{'-f'} = $f;
+	if ($sto && File::Spec->canonpath($sto_dir) eq
+			File::Spec->canonpath($cfg->{'leistore.dir'})) {
+		$cfg->{-lei_store} = $sto;
+	}
 	$self->{cfg} = $PATH2CFG{$f} = $cfg;
 }
 

^ permalink raw reply related	[relevance 71%]

* [PATCH 1/2] lei: disallow "\n" in local externals paths
  @ 2021-03-19 12:35 53% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-19 12:35 UTC (permalink / raw)
  To: meta

git 2.11 and earlier could not handle git directories with
newlines in them, nor does libgit2 support them.
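
The check itself is a plain substring search.  A standalone sketch
follows; path_err() is a hypothetical helper name, while the
message text is the one used in the diff below.

```perl
use strict;
use warnings;

# returns an error string if $path may not be used, undef otherwise
sub path_err {
	my ($path) = @_;
	# index() finds a literal "\n" anywhere in the string
	return "`\\n' not allowed in `$path'" if index($path, "\n") >= 0;
	undef;
}

for my $p ("/srv/inbox", "/srv/new\nline") {
	my $err = path_err($p);
	print defined($err) ? "rejected\n" : "ok\n";
}
```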

Followup-to: d87dd0e679587043 ("config: reject `\n' in `inboxdir'")
---
 lib/PublicInbox/LeiExternal.pm |  5 +++++
 lib/PublicInbox/LeiXSearch.pm  |  2 ++
 t/lei-externals.t              |  8 ++++++--
 t/lei-mirror.t                 |  5 +++++
 t/lei.t                        | 12 ++++++++++++
 5 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 47791d4e..b5dd85e1 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -170,9 +170,14 @@ sub lei_add_external {
 		$self->fail(<<""); # TODO: did you mean "update-external?"
 --mirror destination `$location' already exists
 
+	} elsif (-d $location) {
+		index($location, "\n") >= 0 and
+			return $self->fail("`\\n' not allowed in `$location'");
 	}
 	if ($location !~ m!\Ahttps?://! && !-d $location) {
 		$mirror // return $self->fail("$location not a directory");
+		index($location, "\n") >= 0 and
+			return $self->fail("`\\n' not allowed in `$location'");
 		$mirror = ext_canonicalize($mirror);
 		require PublicInbox::LeiMirror;
 		PublicInbox::LeiMirror->start($self, $mirror => $location);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 22c8026c..d95a218e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -502,8 +502,10 @@ sub prepare_external {
 		return add_uri($self, URI->new($loc));
 	} elsif (-f "$loc/ei.lock") {
 		require PublicInbox::ExtSearch;
+		die "`\\n' not allowed in `$loc'\n" if index($loc, "\n") >= 0;
 		$loc = PublicInbox::ExtSearch->new($loc);
 	} elsif (-f "$loc/inbox.lock" || -d "$loc/public-inbox") {
+		die "`\\n' not allowed in `$loc'\n" if index($loc, "\n") >= 0;
 		require PublicInbox::Inbox; # v2, v1
 		$loc = bless { inboxdir => $loc }, 'PublicInbox::Inbox';
 	} else {
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 2a92d101..1695ff0b 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -78,8 +78,12 @@ test_lei(sub {
 	ok(!-e $config_file && !-e $store_dir,
 		'nothing created by ls-external');
 
-	ok(!lei('add-external', "$home/nonexistent",
-		"fails on non-existent dir"));
+	ok(!lei('add-external', "$home/nonexistent"),
+		"fails on non-existent dir");
+	like($lei_err, qr/not a directory/, 'noted non-existence');
+	mkdir "$home/new\nline" or BAIL_OUT "mkdir: $!";
+	ok(!lei('add-external', "$home/new\nline"), "fails on newline");
+	like($lei_err, qr/`\\n' not allowed/, 'newline noted in error');
 	lei_ok('ls-external', \'ls-external works after add failure');
 	is($lei_out.$lei_err, '', 'ls-external still has no output');
 	my $cfg = PublicInbox::Config->new($cfg_path);
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index 1d113e3e..9769f31b 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -29,8 +29,13 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	ok(!lei('add-external', $t2, '--mirror', "$http/t2/"),
 		'--mirror fails if reused') or diag "$lei_err.$lei_out = $?";
 
+	ok(!lei('add-external', "$home/t2\nnewline", '--mirror', "$http/t2/"),
+		'--mirror fails on newline');
+	like($lei_err, qr/`\\n' not allowed/, 'newline noted in error');
+
 	lei_ok('ls-external');
 	like($lei_out, qr!\Q$t2\E!, 'still in ls-externals');
+	unlike($lei_out, qr!\Qnewline\E!, 'newline entry not added');
 
 	ok(!lei('add-external', "$t2-fail", '-Lmedium'), '--mirror v2');
 	ok(!-d "$t2-fail", 'destination not created on failure');
diff --git a/t/lei.t b/t/lei.t
index 74a775ca..2bf4b862 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -133,6 +133,18 @@ my $test_fail = sub {
 	is($? >> 8, 1, 'chdir at end fails to /dev/null');
 	lei('-C', '/dev/null', 'q', 'whatever');
 	is($? >> 8, 1, 'chdir at beginning fails to /dev/null');
+
+	for my $lk (qw(ei inbox)) {
+		my $d = "$home/newline\n$lk";
+		mkdir $d;
+		open my $fh, '>', "$d/$lk.lock" or BAIL_OUT "open $d/$lk.lock";
+		for my $fl (qw(-I --only)) {
+			ok(!lei('q', $fl, $d, 'whatever'),
+				"newline $lk.lock fails with q $fl");
+			like($lei_err, qr/`\\n' not allowed/,
+				"error noted with q $fl");
+		}
+	}
 SKIP: {
 	skip 'no curl', 3 unless which('curl');
 	lei(qw(q --only http://127.0.0.1:99999/bogus/ t:m));

^ permalink raw reply related	[relevance 53%]

* [PATCH] t/lei-externals: add diagnostic for warning
@ 2021-03-19 12:41 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-19 12:41 UTC (permalink / raw)
  To: meta

Not sure where it's coming from, but I saw it fail once
(and we should be doing more "or diag ..." anyway to improve
diagnostics).
---
 t/lei-externals.t | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/lei-externals.t b/t/lei-externals.t
index 2a92d101..915f25ad 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -229,7 +229,8 @@ SKIP: {
 		is(scalar(@s), 2, "2 results in mbox$sfx");
 
 		lei_ok('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
-		is(grep(!/^#/, $lei_err), 0, "no errors on no results ($sfx)");
+		is(grep(!/^#/, $lei_err), 0, "no errors on no results ($sfx)")
+			or diag $lei_err;
 
 		my @s2 = grep(/^Subject:/, $cat->());
 		is_deeply(\@s2, \@s,

^ permalink raw reply related	[relevance 71%]

* [PATCH] lei q: -I/--include overrides --no-(external|local|remote)
@ 2021-03-19 22:38 59% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-19 22:38 UTC (permalink / raw)
  To: meta

Assume that anybody using -I/--include for external locations
will want to override --no-$FOO if they're explicitly including
a location.

With some effort, we could make it order-dependent (e.g.
"-I $LOCATION --no-$FOO" and "--no-$FOO -I $LOCATION"
behave differently).  However that's not straightforward
when using Getopt::Long to parse command-line options into
a hashref.

I'm also not sure if order-dependent switches are a desirable
UI/UX quality.
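
To illustrate why ordering is lost, here is a standalone sketch
using Getopt::Long with a hashref destination.  The option specs
are an assumption modelled on the switches named above, not
copied from lei; parse_q() and opt_str() are hypothetical helpers.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# parse a "lei q"-like argument list into a hashref
sub parse_q {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt,
		'include|I=s@', 'local!', 'external!', 'remote!')
		or die "bad options\n";
	\%opt;
}

# flatten the hashref so two parses can be compared as strings
sub opt_str {
	my ($opt) = @_;
	join(' ', map {
		my $v = $opt->{$_};
		"$_=" . (ref($v) ? join(',', @$v) : $v);
	} sort keys %$opt);
}

my $first = parse_q(qw(-I /path/to/inbox --no-local));
my $second = parse_q(qw(--no-local -I /path/to/inbox));
print opt_str($first), "\n", opt_str($second), "\n";
```

Both parses produce the same hash, so any order-dependent
behavior would need per-option callbacks or manual @ARGV scanning.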
---
 lib/PublicInbox/LeiQuery.pm |  7 +++++--
 t/lei-externals.t           | 15 +++++++++++++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 623b92cd..532668ae 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -64,9 +64,12 @@ sub lei_q {
 			$lxs->prepare_external($_) for @loc;
 		}
 	} else {
+		my (@ilocals, @iremotes);
 		for my $loc (@{$opt->{include} // []}) {
 			my @loc = $self->get_externals($loc) or return;
 			$lxs->prepare_external($_) for @loc;
+			@ilocals = @{$lxs->{locals} // []};
+			@iremotes = @{$lxs->{remotes} // []};
 		}
 		# --external is enabled by default, but allow --no-external
 		if ($opt->{external} //= 1) {
@@ -78,9 +81,9 @@ sub lei_q {
 			my $ne = $self->externals_each(\&prep_ext, $lxs, \%x);
 			$opt->{remote} //= !($lxs->locals - $opt->{'local'});
 			if ($opt->{'local'}) {
-				delete($lxs->{remotes}) if !$opt->{remote};
+				$lxs->{remotes} = \@iremotes if !$opt->{remote};
 			} else {
-				delete($lxs->{locals});
+				$lxs->{locals} = \@ilocals;
 			}
 		}
 	}
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 1695ff0b..1d2a9a16 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -127,8 +127,10 @@ test_lei(sub {
 	lei_ok qw(_complete lei forget-external), \'complete for externals';
 	my %comp = map { $_ => 1 } split(/\s+/, $lei_out);
 	ok($comp{'https://example.com/ibx/'}, 'forget external completion');
+	my @dirs;
 	$cfg->each_inbox(sub {
 		my ($ibx) = @_;
+		push @dirs, $ibx->{inboxdir};
 		ok($comp{$ibx->{inboxdir}}, "local $ibx->{name} completion");
 	});
 	for my $u (qw(h http https https: https:/ https:// https://e
@@ -157,7 +159,8 @@ test_lei(sub {
 	lei_ok('ls-external');
 	unlike($lei_out, qr!https://example\.com/ibx/!s,
 		'removed canonical URL');
-SKIP: {
+
+	# do some queries
 	ok(!lei(qw(q s:prefix -o maildir:/dev/null)), 'bad maildir');
 	like($lei_err, qr!/dev/null exists and is not a directory!,
 		'error shown');
@@ -249,6 +252,15 @@ SKIP: {
 	is($? >> 8, 1, 'proper exit code');
 	like($lei_err, qr/no local or remote.+? to search/, 'no inbox');
 
+	for my $no (['--no-local'], ['--no-external'],
+			[qw(--no-local --no-external)]) {
+		lei_ok(qw(q mid:testmessage@example.com), @$no,
+			'-I', $dirs[0], \"-I and @$no combine");
+		$res = json_utf8->decode($lei_out);
+		is($res->[0]->{'m'}, 'testmessage@example.com',
+			"-I \$DIR got results regardless of @$no");
+	}
+
 	{
 		opendir my $dh, '.' or BAIL_OUT "opendir(.) $!";
 		my $od = PublicInbox::OnDestroy->new($$, sub {
@@ -278,6 +290,5 @@ SKIP: {
 		$url = $e{$k} if $url eq '1';
 		$test_external_remote->($url, $k);
 	}
-	}; # /SKIP
 }); # test_lei
 done_testing;

^ permalink raw reply related	[relevance 59%]

* [PATCH 0/5] lei: preserve keywords across queries
@ 2021-03-20 10:04 67% Eric Wong
  2021-03-20 10:04 33% ` [PATCH 1/5] lei: All Local Externals: bare git dir for alternates Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-03-20 10:04 UTC (permalink / raw)
  To: meta

2/5 is the major milestone to me; it took me a long while to
figure out and arrive at.

PATCH 1/5 made things click, and
<https://public-inbox.org/meta/20210320023800.1809-1-e@80x24.org/>
("lei_store: initialize IPC lock properly") was also needed :x

There'll still need to be some followup to support import,
and also to deal with public-inbox-edit/purge in externals.

Inotify + EVFILT_VNODE support should make things sweeter, too.

Eric Wong (5):
  lei: All Local Externals: bare git dir for alternates
  lei q: support vmd for external-only messages
  lei q: put keywords on one line in --pretty output
  lei_to_mail: match mutt order of status headers
  lei: tie ALE lifetime to config file

 MANIFEST                       |   1 +
 lib/PublicInbox/LEI.pm         |  15 +++++
 lib/PublicInbox/LeiALE.pm      | 111 +++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiOverview.pm |  12 +++-
 lib/PublicInbox/LeiQuery.pm    |   1 +
 lib/PublicInbox/LeiSearch.pm   |  37 +++++++----
 lib/PublicInbox/LeiStore.pm    |  82 ++++++++++++++----------
 lib/PublicInbox/LeiToMail.pm   |  38 ++++++-----
 lib/PublicInbox/LeiXSearch.pm  |  44 +++----------
 lib/PublicInbox/Lock.pm        |   2 +-
 lib/PublicInbox/Over.pm        |  22 ++++++-
 lib/PublicInbox/OverIdx.pm     |  10 ---
 lib/PublicInbox/SearchIdx.pm   |   3 +
 t/eml.t                        |   2 +
 t/lei-convert.t                |   3 +-
 t/lei-externals.t              |   3 +-
 t/lei-q-kw.t                   |  59 ++++++++++++++++--
 t/lei-q-remote-import.t        |   4 +-
 t/lei-q-thread.t               |   7 ++-
 t/lei_to_mail.t                |   2 +-
 t/lei_xsearch.t                |  22 ++++++-
 21 files changed, 355 insertions(+), 125 deletions(-)
 create mode 100644 lib/PublicInbox/LeiALE.pm


^ permalink raw reply	[relevance 67%]

* [PATCH 3/5] lei q: put keywords on one line in --pretty output
  2021-03-20 10:04 67% [PATCH 0/5] lei: preserve keywords across queries Eric Wong
  2021-03-20 10:04 33% ` [PATCH 1/5] lei: All Local Externals: bare git dir for alternates Eric Wong
  2021-03-20 10:04 25% ` [PATCH 2/5] lei q: support vmd for external-only messages Eric Wong
@ 2021-03-20 10:04 64% ` Eric Wong
  2021-03-20 10:04 59% ` [PATCH 5/5] lei: tie ALE lifetime to config file Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-20 10:04 UTC (permalink / raw)
  To: meta

Don't waste precious terminal space when there are only a small
number of possible keywords supported/reserved for JMAP.  In the
future, we may implement more sophisticated wrapping for labels,
but we'll cross that bridge when we come to it.
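
The formatting step in isolation, as a standalone sketch.  It
uses JSON::PP from core Perl so it runs anywhere; kw_line() is a
hypothetical helper name.

```perl
use strict;
use warnings;
use JSON::PP;

my $json = JSON::PP->new;

# encode a short list of keywords as a single-line JSON array
sub kw_line {
	my ($kw) = @_; # arrayref of keyword strings
	my $v = $json->encode($kw); # compact, no spaces
	$v =~ s/","/", "/g; # add a space after each separating comma
	$v;
}

print '  "kw": ', kw_line([qw(answered flagged)]), ",\n";
```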
---
 lib/PublicInbox/LeiOverview.pm | 5 ++++-
 t/lei-q-kw.t                   | 7 +++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 48237f8a..521bca50 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -176,7 +176,10 @@ sub _json_pretty {
 					$pair =~ s/(null|"),"/$1, "/g;
 					$pair;
 				} @$v) . ']';
-			} else { # references
+			} elsif ($k eq 'kw') { # keywords are short, one-line
+				$v = $json->encode($v);
+				$v =~ s/","/", "/g;
+			} else { # refs, labels, ...
 				$v = '[' . join($sep, map {
 					substr($json->encode([$_]), 1, -1);
 				} @$v) . ']';
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index e7e14221..de2c775a 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -144,7 +144,7 @@ lei_ok(qw(q -o), "mboxrd:$o", "m:$m", @inc);
 # emulate MUA marking mboxrd message as unread
 open my $fh, '<', $o or BAIL_OUT;
 my $s = do { local $/; <$fh> };
-$s =~ s/^Status: OR\n/Status: O\nX-Status: A\n/sm or
+$s =~ s/^Status: OR\n/Status: O\nX-Status: AF\n/sm or
 	fail "failed to clear R flag in $s";
 open $fh, '>', $o or BAIL_OUT;
 print $fh $s or BAIL_OUT;
@@ -156,7 +156,10 @@ lei_ok(qw(q -o), "mboxrd:$o", "m:$m", @inc);
 open $fh, '<', $o or BAIL_OUT;
 $s = do { local $/; <$fh> };
 like($s, qr/^Status: O\n/ms, 'seen keyword gone in mbox');
-like($s, qr/^X-Status: A\n/ms, 'answered flag set');
+like($s, qr/^X-Status: AF\n/ms, 'answered + flagged set');
 
+lei_ok(qw(q --pretty), "m:$m", @inc);
+like($lei_out, qr/^  "kw": \["answered", "flagged"\],\n/sm,
+	'--pretty JSON output shows kw: on one line');
 }); # test_lei
 done_testing;

^ permalink raw reply related	[relevance 64%]

* [PATCH 1/5] lei: All Local Externals: bare git dir for alternates
  2021-03-20 10:04 67% [PATCH 0/5] lei: preserve keywords across queries Eric Wong
@ 2021-03-20 10:04 33% ` Eric Wong
  2021-03-20 10:04 25% ` [PATCH 2/5] lei q: support vmd for external-only messages Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-20 10:04 UTC (permalink / raw)
  To: meta

This will be used for keyword (and label) storage for externals.
We'll be using this to ensure we don't redundantly auto-import
messages into lei/store if they're already in a local external
(they can still be imported explicitly via "lei import").
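
The alternates file has to be replaced atomically since readers
may open it at any time.  A standalone sketch of that
write-then-rename step follows; write_alternates() is a
hypothetical name and takes plain git directory paths rather
than inbox objects.

```perl
use strict;
use warnings;
use File::Spec;

# Rewrite $f so it lists "$git_dir/objects" for each given git dir.
# Returns 1 if the file was replaced, 0 if it was already current.
sub write_alternates {
	my ($f, @git_dirs) = @_;
	my $new = join('', map {
		File::Spec->canonpath($_)."/objects\n"
	} @git_dirs);
	my $old = '';
	if (open my $fh, '<', $f) {
		local $/; # slurp
		$old = <$fh> // '';
	}
	return 0 if $old eq $new;

	# write a temporary file in the same directory, then rename()
	# over the target so readers see either the old or new content
	my $tmp = "$f.$$.tmp";
	open my $fh, '>', $tmp or die "open($tmp): $!";
	print $fh $new or die "print($tmp): $!";
	close $fh or die "close($tmp): $!";
	rename($tmp, $f) or die "rename($tmp, $f): $!";
	1;
}
```

The temporary file must live on the same filesystem as the
target for rename() to be atomic, hence the "$f.$$.tmp" name.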
---
 MANIFEST                       |  1 +
 lib/PublicInbox/LEI.pm         | 16 ++++++
 lib/PublicInbox/LeiALE.pm      | 98 ++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiExternal.pm |  6 +++
 lib/PublicInbox/LeiOverview.pm |  3 +-
 lib/PublicInbox/LeiQuery.pm    |  5 ++
 lib/PublicInbox/LeiStore.pm    |  5 +-
 lib/PublicInbox/LeiToMail.pm   | 10 ++--
 lib/PublicInbox/LeiXSearch.pm  | 27 +---------
 lib/PublicInbox/Lock.pm        |  2 +-
 t/lei-externals.t              |  3 +-
 t/lei_xsearch.t                | 22 +++++++-
 12 files changed, 158 insertions(+), 40 deletions(-)
 create mode 100644 lib/PublicInbox/LeiALE.pm

diff --git a/MANIFEST b/MANIFEST
index 775de5cd..b6b4a3ab 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -179,6 +179,7 @@ lib/PublicInbox/InputPipe.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
+lib/PublicInbox/LeiALE.pm
 lib/PublicInbox/LeiAuth.pm
 lib/PublicInbox/LeiConvert.pm
 lib/PublicInbox/LeiCurl.pm
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index d20ba744..0da26a32 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -97,6 +97,22 @@ sub _config_path ($) {
 		.'/lei/config');
 }
 
+sub cache_dir ($) {
+	my ($self) = @_;
+	rel2abs($self, ($self->{env}->{XDG_CACHE_HOME} //
+		($self->{env}->{HOME} // '/nonexistent').'/.cache')
+		.'/lei');
+}
+
+sub ale {
+	my ($self) = @_;
+	$self->{ale} //= do {
+		require PublicInbox::LeiALE;
+		PublicInbox::LeiALE->new(cache_dir($self).
+					'/all_locals_ever.git');
+	};
+}
+
 sub index_opt {
 	# TODO: drop underscore variants everywhere, they're undocumented
 	qw(fsync|sync! jobs|j=i indexlevel|L=s compact
diff --git a/lib/PublicInbox/LeiALE.pm b/lib/PublicInbox/LeiALE.pm
new file mode 100644
index 00000000..bdb50a1a
--- /dev/null
+++ b/lib/PublicInbox/LeiALE.pm
@@ -0,0 +1,98 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# All Locals Ever: track lei/store + externals ever used as
+# long as they're on an accessible FS.  Includes "lei q" --include
+# and --only targets that haven't been through "lei add-external".
+# Typically: ~/.cache/lei/all_locals_ever.git
+package PublicInbox::LeiALE;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::LeiSearch PublicInbox::Lock);
+use PublicInbox::Git;
+use PublicInbox::Import;
+use Fcntl qw(SEEK_SET);
+
+sub new {
+	my ($cls, $d) = @_;
+	PublicInbox::Import::init_bare($d, 'ale');
+	bless {
+		git => PublicInbox::Git->new($d),
+		lock_path => "$d/lei_ale.state", # dual-duty lock + state
+		ibxish => [], # Inbox and ExtSearch (and LeiSearch) objects
+	}, $cls;
+}
+
+sub over {} # undef for xoids_for
+
+sub overs_all { # for xoids_for (called only in lei workers?)
+	my ($self) = @_;
+	my $pid = $$;
+	if (($self->{owner_pid} // $pid) != $pid) {
+		delete($_->{over}) for @{$self->{ibxish}};
+	}
+	$self->{owner_pid} = $pid;
+	grep(defined, map { $_->over } @{$self->{ibxish}});
+}
+
+sub refresh_externals {
+	my ($self, $lxs) = @_;
+	$self->git->cleanup;
+	my $lk = $self->lock_for_scope;
+	my $cur_lxs = ref($lxs)->new;
+	my $orig = do {
+		local $/;
+		readline($self->{lockfh}) //
+				die "readline($self->{lock_path}): $!";
+	};
+	my $new = '';
+	my $old = '';
+	my $gone = 0;
+	my %seen_ibxish; # $dir => any-defined value
+	for my $dir (split(/\n/, $orig)) {
+		if (-d $dir && -r _ && $cur_lxs->prepare_external($dir)) {
+			$seen_ibxish{$dir} //= length($old .= "$dir\n");
+		} else {
+			++$gone;
+		}
+	}
+	my @ibxish = $cur_lxs->locals;
+	for my $x ($lxs->locals) {
+		my $d = File::Spec->canonpath($x->{inboxdir} // $x->{topdir});
+		$seen_ibxish{$d} //= do {
+			$new .= "$d\n";
+			push @ibxish, $x;
+		};
+	}
+	if ($new ne '' || $gone) {
+		$self->{lockfh}->autoflush(1);
+		if ($gone) {
+			seek($self->{lockfh}, 0, SEEK_SET) or die "seek: $!";
+			truncate($self->{lockfh}, 0) or die "truncate: $!";
+		} else {
+			$old = '';
+		}
+		print { $self->{lockfh} } $old, $new or die "print: $!";
+	}
+	$new = $old = '';
+	my $f = $self->git->{git_dir}.'/objects/info/alternates';
+	if (open my $fh, '<', $f) {
+		local $/;
+		$old = <$fh> // die "readline($f): $!";
+	}
+	for my $x (@ibxish) {
+		$new .= File::Spec->canonpath($x->git->{git_dir})."/objects\n";
+	}
+	$self->{ibxish} = \@ibxish;
+	return if $old eq $new;
+
+	# this needs to be atomic since child processes may start
+	# git-cat-file at any time
+	my $tmp = "$f.$$.tmp";
+	open my $fh, '>', $tmp or die "open($tmp): $!";
+	print $fh $new or die "print($tmp): $!";
+	close $fh or die "close($tmp): $!";
+	rename($tmp, $f) or die "rename($tmp, $f): $!";
+}
+
+1;
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index b5dd85e1..aa09be9e 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -139,6 +139,12 @@ sub add_external_finish {
 	my $key = "external.$location.boost";
 	my $cur_boost = $cfg->{$key};
 	return if defined($cur_boost) && $cur_boost == $new_boost; # idempotent
+	if (-d $location) {
+		require PublicInbox::LeiXSearch;
+		my $lxs = PublicInbox::LeiXSearch->new;
+		$lxs->prepare_external($location);
+		$self->ale->refresh_externals($lxs);
+	}
 	$self->lei_config($key, $new_boost);
 }
 
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index f6348162..1036f465 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -209,11 +209,10 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$wcb->(undef, $smsg, $eml);
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
-		my $git_dir = $ibxish->git->{git_dir};
 		sub {
 			my ($smsg, $mitem) = @_;
 			$smsg->{pct} = get_pct($mitem) if $mitem;
-			$l2m->wq_io_do('write_mail', [], $git_dir, $smsg);
+			$l2m->wq_io_do('write_mail', [], $smsg);
 		}
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 532668ae..007e35fc 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -57,6 +57,10 @@ sub lei_q {
 	}
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
 		$lxs->prepare_external($lse);
+	} else {
+		my $tmp = PublicInbox::LeiXSearch->new;
+		$tmp->prepare_external($lse);
+		$self->ale->refresh_externals($tmp);
 	}
 	if (@only) {
 		for my $loc (@only) {
@@ -90,6 +94,7 @@ sub lei_q {
 	unless ($lxs->locals || $lxs->remotes) {
 		return $self->fail('no local or remote inboxes to search');
 	}
+	$self->ale->refresh_externals($lxs);
 	my ($xj, $mj) = split(/,/, $opt->{jobs} // '');
 	if (defined($xj) && $xj ne '' && $xj !~ /\A[1-9][0-9]*\z/) {
 		return $self->fail("`$xj' search jobs must be >= 1");
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 26f975c3..c1abc288 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -251,10 +251,11 @@ sub refresh_local_externals {
 		for my $loc (@loc) { # locals only
 			$lxs->prepare_external($loc) if -d $loc;
 		}
+		$self->{lei}->ale->refresh_externals($lxs);
+		$lxs->{git} = $self->{lei}->ale->git;
 		$self->{lxs_all_local} = $lxs;
 		$self->{cur_cfg} = $cfg;
 	}
-	($lxs->{git_tmp} //= $lxs->git_tmp)->{git_dir};
 }
 
 sub write_prepare {
@@ -268,7 +269,7 @@ sub write_prepare {
 		$self->ipc_worker_spawn('lei_store', $lei->oldset,
 					{ lei => $lei });
 	}
-	$lei->{all_ext_git_dir} = $self->ipc_do('refresh_local_externals');
+	my $wait = $self->ipc_do('refresh_local_externals');
 	$lei->{sto} = $self;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 6f386b10..7e821646 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -11,7 +11,6 @@ use PublicInbox::Lock;
 use PublicInbox::ProcessPipe;
 use PublicInbox::Spawn qw(which spawn popen_rd);
 use PublicInbox::LeiDedupe;
-use PublicInbox::Git;
 use PublicInbox::GitAsyncCat;
 use PublicInbox::PktOp qw(pkt_do);
 use Symbol qw(gensym);
@@ -642,18 +641,15 @@ sub poke_dst {
 }
 
 sub write_mail { # via ->wq_io_do
-	my ($self, $git_dir, $smsg) = @_;
-	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
-	git_async_cat($git, $smsg->{blob}, \&git_to_mail,
+	my ($self, $smsg) = @_;
+	git_async_cat($self->{lei}->{ale}->git, $smsg->{blob}, \&git_to_mail,
 				[$self->{wcb}, $smsg]);
 }
 
 sub wq_atexit_child {
 	my ($self) = @_;
 	delete $self->{wcb};
-	for my $git (delete @$self{grep(/\A$$\0/, keys %$self)}) {
-		$git->async_wait_all;
-	}
+	$self->{lei}->{ale}->git->async_wait_all;
 	$SIG{__WARN__} = 'DEFAULT';
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index d95a218e..1266b3b3 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -297,27 +297,7 @@ sub query_remote_mboxrd {
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
-# called by LeiOverview::each_smsg_cb
-sub git { $_[0]->{git_tmp} // die 'BUG: caller did not set {git_tmp}' }
-
-sub git_tmp ($) {
-	my ($self) = @_;
-	my (%seen, @dirs);
-	my $tmp = File::Temp->newdir("lei_xsearch_git.$$-XXXX", TMPDIR => 1);
-	for my $ibxish (locals($self)) {
-		my $d = File::Spec->canonpath($ibxish->git->{git_dir});
-		$seen{$d} //= push @dirs, "$d/objects\n"
-	}
-	my $git_dir = $tmp->dirname;
-	PublicInbox::Import::init_bare($git_dir);
-	my $f = "$git_dir/objects/info/alternates";
-	open my $alt, '>', $f or die "open($f): $!";
-	print $alt @dirs or die "print $f: $!";
-	close $alt or die "close $f: $!";
-	my $git = PublicInbox::Git->new($git_dir);
-	$git->{-tmp} = $tmp;
-	$git;
-}
+sub git { $_[0]->{git} // die 'BUG: git uninitialized' }
 
 sub xsearch_done_wait { # dwaitpid callback
 	my ($arg, $pid) = @_;
@@ -460,11 +440,6 @@ sub do_query {
 		# 1031: F_SETPIPE_SZ
 		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
 	}
-	if (!$lei->{opt}->{threads} && locals($self)) { # for query_mset
-		# lei->{git_tmp} is set for wq_wait_old so we don't
-		# delete until all lei2mail + lei_xsearch workers are reaped
-		$lei->{git_tmp} = $self->{git_tmp} = git_tmp($self);
-	}
 	$self->wq_workers_start('lei_xsearch', undef,
 				$lei->oldset, { lei => $lei });
 	my $op = delete $lei->{pkt_op_c};
diff --git a/lib/PublicInbox/Lock.pm b/lib/PublicInbox/Lock.pm
index 76c3ffb2..0ee2a8bd 100644
--- a/lib/PublicInbox/Lock.pm
+++ b/lib/PublicInbox/Lock.pm
@@ -16,7 +16,7 @@ sub lock_acquire {
 	my $lock_path = $self->{lock_path};
 	croak 'already locked '.($lock_path // '(undef)') if $self->{lockfh};
 	return unless defined($lock_path);
-	sysopen(my $lockfh, $lock_path, O_WRONLY|O_CREAT) or
+	sysopen(my $lockfh, $lock_path, O_RDWR|O_CREAT) or
 		croak "failed to open $lock_path: $!\n";
 	flock($lockfh, LOCK_EX) or croak "lock $lock_path failed: $!\n";
 	$self->{lockfh} = $lockfh;
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 1d2a9a16..2045691f 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -236,7 +236,8 @@ test_lei(sub {
 		is(scalar(@s), 2, "2 results in mbox$sfx");
 
 		lei_ok('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
-		is(grep(!/^#/, $lei_err), 0, "no errors on no results ($sfx)");
+		is(grep(!/^#/, $lei_err), 0, "no errors on no results ($sfx)")
+			or diag $lei_err;
 
 		my @s2 = grep(/^Subject:/, $cat->());
 		is_deeply(\@s2, \@s,
diff --git a/t/lei_xsearch.t b/t/lei_xsearch.t
index f626c790..68211d18 100644
--- a/t/lei_xsearch.t
+++ b/t/lei_xsearch.t
@@ -10,6 +10,7 @@ require_mods(qw(DBD::SQLite Search::Xapian));
 require PublicInbox::ExtSearchIdx;
 require_git 2.6;
 require_ok 'PublicInbox::LeiXSearch';
+require_ok 'PublicInbox::LeiALE';
 my ($home, $for_destroy) = tmpdir();
 my @ibx;
 for my $V (1..2) {
@@ -75,7 +76,8 @@ is($lxs->over, undef, '->over fails');
 	my $v2ibx = create_inbox 'v2full', version => 2, sub {
 		$_[0]->add(eml_load('t/plack-qp.eml'));
 	};
-	my $v1ibx = create_inbox 'v1medium', indexlevel => 'medium', sub {
+	my $v1ibx = create_inbox 'v1medium', indexlevel => 'medium',
+				tmpdir => "$home/v1tmp", sub {
 		$_[0]->add(eml_load('t/utf8.eml'));
 	};
 	$lxs->prepare_external($v1ibx);
@@ -85,6 +87,24 @@ is($lxs->over, undef, '->over fails');
 	}
 	my $mset = $lxs->mset('m:testmessage@example.com');
 	is($mset->size, 1, 'got m: match on medium+full XSearch mix');
+	my $mitem = ($mset->items)[0];
+	my $smsg = $lxs->smsg_for($mitem) or BAIL_OUT 'smsg_for broken';
+
+	my $ale = PublicInbox::LeiALE->new("$home/ale");
+	$ale->refresh_externals($lxs);
+	my $exp = [ $smsg->{blob}, 'blob', -s 't/utf8.eml' ];
+	is_deeply([ $ale->git->check($smsg->{blob}) ], $exp, 'ale->git->check');
+
+	$lxs = PublicInbox::LeiXSearch->new;
+	$lxs->prepare_external($v2ibx);
+	$ale->refresh_externals($lxs);
+	is_deeply([ $ale->git->check($smsg->{blob}) ], $exp,
+			'ale->git->check remembered inactive external');
+
+	rename("$home/v1tmp", "$home/v1moved") or BAIL_OUT "rename: $!";
+	$ale->refresh_externals($lxs);
+	is($ale->git->check($smsg->{blob}), undef,
+			'missing after directory gone');
 }
 
 done_testing;

^ permalink raw reply related	[relevance 33%]

* [PATCH 2/5] lei q: support vmd for external-only messages
  2021-03-20 10:04 67% [PATCH 0/5] lei: preserve keywords across queries Eric Wong
  2021-03-20 10:04 33% ` [PATCH 1/5] lei: All Local Externals: bare git dir for alternates Eric Wong
@ 2021-03-20 10:04 25% ` Eric Wong
  2021-03-20 10:04 64% ` [PATCH 3/5] lei q: put keywords on one line in --pretty output Eric Wong
  2021-03-20 10:04 59% ` [PATCH 5/5] lei: tie ALE lifetime to config file Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-20 10:04 UTC (permalink / raw)
  To: meta

"lei q" now preserves changes per-message keywords across
invocations when it's --output (Maildir or mbox) is reused
(with or without --augment).
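
As a rough illustration for mbox outputs: keywords come back from
the Status/X-Status letters an MUA leaves behind, and a change is
detected by comparing sorted keyword lists.  The sketch below is
standalone; mbox_kw() is a hypothetical name, the flag table only
covers the three letters exercised by t/lei-q-kw.t in this
series, and kw_changed() mirrors the join("\0") comparison used
in LeiSearch.

```perl
use strict;
use warnings;

# only the letters seen in the tests; real code likely knows more
my %flag2kw = (R => 'seen', A => 'answered', F => 'flagged');

# extract a sorted keyword list from the raw headers of one message
sub mbox_kw {
	my ($hdr) = @_;
	my %kw;
	for my $flags ($hdr =~ /^(?:X-)?Status:[ \t]*([A-Z]*)[ \t]*$/mg) {
		for my $c (split(//, $flags)) {
			my $k = $flag2kw{$c} // next; # skip unknown letters
			$kw{$k} = 1;
		}
	}
	[ sort keys %kw ];
}

# true if two sorted keyword lists differ
sub kw_changed {
	my ($old, $new) = @_;
	join("\0", @$old) ne join("\0", @$new);
}

my $before = mbox_kw("Status: OR\n");
my $after = mbox_kw("Status: O\nX-Status: AF\n");
print "keywords changed\n" if kw_changed($before, $after);
```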

In the future, these changes will be monitored via inotify,
EVFILT_VNODE or IMAP IDLE, too.

Unfortunately, this currently prevents "lei import" from ever
importing a message that's in an external.  That will be fixed
in a future change.
---
 lib/PublicInbox/LeiOverview.pm |  4 ++
 lib/PublicInbox/LeiSearch.pm   | 37 ++++++++++-----
 lib/PublicInbox/LeiStore.pm    | 83 ++++++++++++++++++++--------------
 lib/PublicInbox/LeiToMail.pm   | 16 ++++++-
 lib/PublicInbox/LeiXSearch.pm  | 17 +++----
 lib/PublicInbox/Over.pm        | 22 ++++++++-
 lib/PublicInbox/OverIdx.pm     | 10 ----
 lib/PublicInbox/SearchIdx.pm   |  3 ++
 t/eml.t                        |  2 +
 t/lei-convert.t                |  3 +-
 t/lei-q-kw.t                   | 48 +++++++++++++++++++-
 t/lei-q-thread.t               |  7 +--
 12 files changed, 177 insertions(+), 75 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 1036f465..48237f8a 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -216,9 +216,11 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		}
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
+		my $lse = $lei->{sto}->search;
 		sub { # DIY prettiness :P
 			my ($smsg, $mitem) = @_;
 			return if $dedupe->is_smsg_dup($smsg);
+			$lse->xsmsg_vmd($smsg);
 			$smsg = _unbless_smsg($smsg, $mitem);
 			$buf .= "{\n";
 			$buf .= join(",\n", map {
@@ -238,9 +240,11 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		}
 	} elsif ($json) {
 		my $ORS = $self->{fmt} eq 'json' ? ",\n" : "\n"; # JSONL
+		my $lse = $lei->{sto}->search;
 		sub {
 			my ($smsg, $mitem) = @_;
 			return if $dedupe->is_smsg_dup($smsg);
+			$lse->xsmsg_vmd($smsg);
 			$buf .= $json->encode(_unbless_smsg(@_)) . $ORS;
 			return if length($buf) < 65536;
 			my $lk = $self->lock_for_scope;
diff --git a/lib/PublicInbox/LeiSearch.pm b/lib/PublicInbox/LeiSearch.pm
index 2e3f10fd..360a37e5 100644
--- a/lib/PublicInbox/LeiSearch.pm
+++ b/lib/PublicInbox/LeiSearch.pm
@@ -27,6 +27,20 @@ sub msg_keywords {
 	wantarray ? sort(keys(%$kw)) : $kw;
 }
 
+sub xsmsg_vmd {
+	my ($self, $smsg) = @_;
+	return if $smsg->{kw};
+	my $xdb = $self->xdb; # set {nshard};
+	my %kw;
+	$kw{flagged} = 1 if delete($smsg->{lei_q_tt_flagged});
+	my @num = $self->over->blob_exists($smsg->{blob});
+	for my $num (@num) { # there should only be one...
+		my $kw = xap_terms('K', $xdb, num2docid($self, $num));
+		%kw = (%kw, %$kw);
+	}
+	$smsg->{kw} = [ sort keys %kw ] if scalar(keys(%kw));
+}
+
 # when a message has no Message-IDs at all, this is needed for
 # unsent Draft messages, at least
 sub content_key ($) {
@@ -43,41 +57,42 @@ sub content_key ($) {
 }
 
 sub _cmp_1st { # git->cat_async callback
-	my ($bref, $oid, $type, $size, $cmp) = @_; # cmp: [chash, found, smsg]
-	if (content_hash(PublicInbox::Eml->new($bref)) eq $cmp->[0]) {
+	my ($bref, $oid, $type, $size, $cmp) = @_; # cmp: [chash, xoids, smsg]
+	if ($bref && content_hash(PublicInbox::Eml->new($bref)) eq $cmp->[0]) {
 		$cmp->[1]->{$oid} = $cmp->[2]->{num};
 	}
 }
 
-sub xids_for { # returns { OID => docid } mapping for $eml matches
+sub xoids_for { # returns { OID => docid } mapping for $eml matches
 	my ($self, $eml, $min) = @_;
 	my ($chash, $mids) = content_key($eml);
 	my @overs = ($self->over // $self->overs_all);
 	my $git = $self->git;
-	my $found = {};
+	my $xoids = {};
 	for my $mid (@$mids) {
 		for my $o (@overs) {
 			my ($id, $prev);
 			while (my $cur = $o->next_by_mid($mid, \$id, \$prev)) {
-				next if $found->{$cur->{blob}};
+				next if $cur->{bytes} == 0 ||
+					$xoids->{$cur->{blob}};
 				$git->cat_async($cur->{blob}, \&_cmp_1st,
-						[ $chash, $found, $cur ]);
-				if ($min && scalar(keys %$found) >= $min) {
+						[ $chash, $xoids, $cur ]);
+				if ($min && scalar(keys %$xoids) >= $min) {
 					$git->cat_async_wait;
-					return $found;
+					return $xoids;
 				}
 			}
 		}
 	}
 	$git->cat_async_wait;
-	scalar(keys %$found) ? $found : undef;
+	scalar(keys %$xoids) ? $xoids : undef;
 }
 
 # returns true if $eml is indexed by lei/store and keywords don't match
 sub kw_changed {
 	my ($self, $eml, $new_kw_sorted) = @_;
-	my $found = xids_for($self, $eml, 1) // return;
-	my ($num) = values %$found;
+	my $xoids = xoids_for($self, $eml, 1) // return;
+	my ($num) = values %$xoids;
 	my @cur_kw = msg_keywords($self, $num);
 	join("\0", @$new_kw_sorted) eq join("\0", @cur_kw) ? 0 : 1;
 }
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index c1abc288..c66d3dc2 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -114,6 +114,7 @@ sub _docids_for ($$) {
 	for my $mid (@$mids) {
 		my ($id, $prev);
 		while (my $cur = $oidx->next_by_mid($mid, \$id, \$prev)) {
+			next if $cur->{bytes} == 0; # external-only message
 			my $oid = $cur->{blob};
 			my $docid = $cur->{num};
 			my $bref = $im ? $im->cat_blob($oid) : undef;
@@ -163,7 +164,7 @@ sub add_eml {
 	my ($self, $eml, $vmd) = @_;
 	my $im = $self->importer; # may create new epoch
 	my $eidx = eidx_init($self); # writes ALL.git/objects/info/alternates
-	my $oidx = $eidx->{oidx};
+	my $oidx = $eidx->{oidx}; # PublicInbox::Import::add checks this
 	my $smsg = bless { -oidx => $oidx }, 'PublicInbox::Smsg';
 	$im->add($eml, undef, $smsg) or return; # duplicate returns undef
 
@@ -193,22 +194,54 @@ sub set_eml {
 	add_eml($self, $eml, $vmd) // set_eml_vmd($self, $eml, $vmd);
 }
 
-sub add_eml_maybe {
-	my ($self, $eml) = @_;
-	my $lxs = $self->{lxs_all_local} // die 'BUG: no {lxs_all_local}';
-	return if $lxs->xids_for($eml, 1);
-	add_eml($self, $eml);
-}
-
 # set or update keywords for external message, called via ipc_do
-sub set_xkw {
-	my ($self, $eml, $kw) = @_;
-	my $lxs = $self->{lxs_all_local} // die 'BUG: no {lxs_all_local}';
-	if ($lxs->xids_for($eml, 1)) { # is it in a local external?
-		# TODO: index keywords only
-	} else {
-		set_eml($self, $eml, { kw => $kw });
+sub set_xvmd {
+	my ($self, $xoids, $eml, $vmd) = @_;
+
+	my $eidx = eidx_init($self);
+	my $oidx = $eidx->{oidx};
+
+	# see if we can just update existing docs
+	for my $oid (keys %$xoids) {
+		my @docids = $oidx->blob_exists($oid) or next;
+		scalar(@docids) > 1 and
+			warn "W: $oid indexed as multiple docids: @docids\n";
+		for my $docid (@docids) {
+			my $idx = $eidx->idx_shard($docid);
+			$idx->ipc_do('set_vmd', $docid, $vmd);
+		}
+		delete $xoids->{$oid}; # all done with this oid
 	}
+	return unless scalar(keys(%$xoids));
+
+	# see if it was indexed, but with different OID(s)
+	if (my @docids = _docids_for($self, $eml)) {
+		for my $docid (@docids) {
+			for my $oid (keys %$xoids) {
+				$oidx->add_xref3($docid, -1, $oid, '.');
+			}
+			my $idx = $eidx->idx_shard($docid);
+			$idx->ipc_do('set_vmd', $docid, $vmd);
+		}
+		return;
+	}
+	# totally unseen
+	my $smsg = bless { blob => '' }, 'PublicInbox::Smsg';
+	$smsg->{num} = $oidx->adj_counter('eidx_docid', '+');
+	# save space for an externals-only message
+	my $hdr = $eml->header_obj;
+	$smsg->populate($hdr); # sets lines == 0
+	$smsg->{bytes} = 0;
+	delete @$smsg{qw(From Subject)};
+	$smsg->{to} = $smsg->{cc} = $smsg->{from} = '';
+	$oidx->add_overview($hdr, $smsg); # subject+references for threading
+	$smsg->{subject} = '';
+	for my $oid (keys %$xoids) {
+		$oidx->add_xref3($smsg->{num}, -1, $oid, '.');
+	}
+	my $idx = $eidx->idx_shard($smsg->{num});
+	$idx->index_eml(PublicInbox::Eml->new("\n\n"), $smsg);
+	$idx->ipc_do('add_vmd', $smsg->{num}, $vmd);
 }
 
 sub checkpoint {
@@ -240,28 +273,9 @@ sub ipc_atfork_child {
 	$self->SUPER::ipc_atfork_child;
 }
 
-sub refresh_local_externals {
-	my ($self) = @_;
-	my $cfg = $self->{lei}->_lei_cfg or return;
-	my $cur_cfg = $self->{cur_cfg} // -1;
-	my $lxs = $self->{lxs_all_local};
-	if ($cfg != $cur_cfg || !$lxs) {
-		$lxs = PublicInbox::LeiXSearch->new;
-		my @loc = $self->{lei}->externals_each;
-		for my $loc (@loc) { # locals only
-			$lxs->prepare_external($loc) if -d $loc;
-		}
-		$self->{lei}->ale->refresh_externals($lxs);
-		$lxs->{git} = $self->{lei}->ale->git;
-		$self->{lxs_all_local} = $lxs;
-		$self->{cur_cfg} = $cfg;
-	}
-}
-
 sub write_prepare {
 	my ($self, $lei) = @_;
 	unless ($self->{-ipc_req}) {
-		require PublicInbox::LeiXSearch;
 		$self->ipc_lock_init($lei->store_path . '/ipc.lock');
 		# Mail we import into lei are private, so headers filtered out
 		# by -mda for public mail are not appropriate
@@ -269,7 +283,6 @@ sub write_prepare {
 		$self->ipc_worker_spawn('lei_store', $lei->oldset,
 					{ lei => $lei });
 	}
-	my $wait = $self->ipc_do('refresh_local_externals');
 	$lei->{sto} = $self;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 7e821646..3e6cf00c 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -11,6 +11,7 @@ use PublicInbox::Lock;
 use PublicInbox::ProcessPipe;
 use PublicInbox::Spawn qw(which spawn popen_rd);
 use PublicInbox::LeiDedupe;
+use PublicInbox::Git;
 use PublicInbox::GitAsyncCat;
 use PublicInbox::PktOp qw(pkt_do);
 use Symbol qw(gensym);
@@ -260,10 +261,12 @@ sub _mbox_write_cb ($$) {
 	my $atomic_append = !defined($ovv->{lock_path});
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe;
+	my $lse = $lei->{sto} ? $lei->{sto}->search : undef;
 	sub { # for git_to_mail
 		my ($buf, $smsg, $eml) = @_;
 		$eml //= PublicInbox::Eml->new($buf);
 		return if $dedupe->is_dup($eml, $smsg->{blob});
+		$lse->xsmsg_vmd($smsg) if $lse;
 		$buf = $eml2mbox->($eml, $smsg);
 		return atomic_append($lei, $buf) if $atomic_append;
 		my $lk = $ovv->lock_for_scope;
@@ -275,10 +278,15 @@ sub update_kw_maybe ($$$$) {
 	my ($lei, $lse, $eml, $kw) = @_;
 	return unless $lse;
 	my $x = $lse->kw_changed($eml, $kw);
+	my $vmd = { kw => $kw };
 	if ($x) {
-		$lei->{sto}->ipc_do('set_eml', $eml, { kw => $kw });
+		$lei->{sto}->ipc_do('set_eml', $eml, $vmd);
 	} elsif (!defined($x)) {
-		$lei->{sto}->ipc_do('set_xkw', $eml, $kw);
+		if (my $xoids = $lei->{ale}->xoids_for($eml)) {
+			$lei->{sto}->ipc_do('set_xvmd', $xoids, $eml, $vmd);
+		} else {
+			$lei->{sto}->ipc_do('set_eml', $eml, $vmd);
+		}
 	}
 }
 
@@ -342,10 +350,12 @@ sub _maildir_write_cb ($$) {
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe if $dedupe;
 	my $dst = $lei->{ovv}->{dst};
+	my $lse = $lei->{sto} ? $lei->{sto}->search : undef;
 	sub { # for git_to_mail
 		my ($buf, $smsg, $eml) = @_;
 		$dst // return $lei->fail; # dst may be undef-ed in last run
 		$buf //= \($eml->as_string);
+		$lse->xsmsg_vmd($smsg) if $lse;
 		return _buf2maildir($dst, $buf, $smsg) if !$dedupe;
 		$eml //= PublicInbox::Eml->new($$buf); # copy buf
 		return if $dedupe->is_dup($eml, $smsg->{blob});
@@ -361,6 +371,7 @@ sub _imap_write_cb ($$) {
 	my $imap_append = $lei->{net}->can('imap_append');
 	my $mic = $lei->{net}->mic_get($self->{uri});
 	my $folder = $self->{uri}->mailbox;
+	my $lse = $lei->{sto} ? $lei->{sto}->search : undef;
 	sub { # for git_to_mail
 		my ($bref, $smsg, $eml) = @_;
 		$mic // return $lei->fail; # dst may be undef-ed in last run
@@ -368,6 +379,7 @@ sub _imap_write_cb ($$) {
 			$eml //= PublicInbox::Eml->new($$bref); # copy bref
 			return if $dedupe->is_dup($eml, $smsg->{blob});
 		}
+		$lse->xsmsg_vmd($smsg) if $lse;
 		eval { $imap_append->($mic, $folder, $bref, $smsg, $eml) };
 		if (my $err = $@) {
 			undef $mic;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1266b3b3..57717b87 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -83,6 +83,7 @@ sub smsg_for {
 	my $num = int(($docid - 1) / $nshard) + 1;
 	my $ibx = $self->{shard2ibx}->[$shard];
 	my $smsg = $ibx->over->get_art($num);
+	return if $smsg->{bytes} == 0;
 	mitem_kw($smsg, $mitem) if $ibx->can('msg_keywords');
 	$smsg->{docid} = $docid;
 	$smsg;
@@ -97,11 +98,6 @@ sub recent {
 
 sub over {}
 
-sub overs_all { # for xids_for
-	my ($self) = @_;
-	grep(defined, map { $_->over } locals($self))
-}
-
 sub _mset_more ($$) {
 	my ($mset, $mo) = @_;
 	my $size = $mset->size;
@@ -153,7 +149,7 @@ sub query_thread_mset { # for --threads
 	my $mset;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
 	my $can_kw = !!$ibxish->can('msg_keywords');
-	my $fl = $lei->{opt}->{threads} > 1 ? [ 'flagged' ] : undef;
+	my $fl = $lei->{opt}->{threads} > 1 ? 1 : undef;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		mset_progress($lei, $desc, $mset->size,
@@ -165,13 +161,14 @@ sub query_thread_mset { # for --threads
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
-				wait_startq($lei);
 				my $mitem = delete $n2item{$smsg->{num}};
+				next if $smsg->{bytes} == 0;
+				wait_startq($lei); # wait for keyword updates
 				if ($mitem) {
 					if ($can_kw) {
 						mitem_kw($smsg, $mitem, $fl);
 					} elsif ($fl) {
-						$smsg->{kw} = $fl;
+						$smsg->{lei_q_tt_flagged} = 1;
 					}
 				}
 				$each_smsg->($smsg, $mitem);
@@ -209,8 +206,8 @@ sub query_mset { # non-parallel for non-"--threads" users
 
 sub each_remote_eml { # callback for MboxReader->mboxrd
 	my ($eml, $self, $lei, $each_smsg) = @_;
-	if (my $sto = $self->{import_sto}) {
-		$sto->ipc_do('add_eml_maybe', $eml);
+	if ($self->{import_sto} && !$lei->{ale}->xoids_for($eml, 1)) {
+		$self->{import_sto}->ipc_do('add_eml', $eml);
 	}
 	my $smsg = bless {}, 'PublicInbox::Smsg';
 	$smsg->populate($eml);
diff --git a/lib/PublicInbox/Over.pm b/lib/PublicInbox/Over.pm
index 06ea439d..587e0516 100644
--- a/lib/PublicInbox/Over.pm
+++ b/lib/PublicInbox/Over.pm
@@ -7,7 +7,7 @@
 package PublicInbox::Over;
 use strict;
 use v5.10.1;
-use DBI;
+use DBI qw(:sql_types); # SQL_BLOB
 use DBD::SQLite;
 use PublicInbox::Smsg;
 use Compress::Zlib qw(uncompress);
@@ -349,4 +349,24 @@ sub check_inodes {
 	}
 }
 
+sub blob_exists {
+	my ($self, $oidhex) = @_;
+	if (wantarray) {
+		my $sth = $self->dbh->prepare_cached(<<'', undef, 1);
+SELECT docid FROM xref3 WHERE oidbin = ?
+
+		$sth->bind_param(1, pack('H*', $oidhex), SQL_BLOB);
+		$sth->execute;
+		my $tmp = $sth->fetchall_arrayref;
+		map { $_->[0] } @$tmp;
+	} else {
+		my $sth = $self->dbh->prepare_cached(<<'', undef, 1);
+SELECT COUNT(*) FROM xref3 WHERE oidbin = ?
+
+		$sth->bind_param(1, pack('H*', $oidhex), SQL_BLOB);
+		$sth->execute;
+		$sth->fetchrow_array;
+	}
+}
+
 1;
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 9013ae23..e1cd31b9 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -668,14 +668,4 @@ DELETE FROM eidxq WHERE docid = ?
 
 }
 
-sub blob_exists {
-	my ($self, $oidhex) = @_;
-	my $sth = $self->dbh->prepare_cached(<<'', undef, 1);
-SELECT COUNT(*) FROM xref3 WHERE oidbin = ?
-
-	$sth->bind_param(1, pack('H*', $oidhex), SQL_BLOB);
-	$sth->execute;
-	$sth->fetchrow_array;
-}
-
 1;
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index e2a1a678..3237aadc 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -494,7 +494,10 @@ sub add_eidx_info {
 	begin_txn_lazy($self);
 	my $doc = _get_doc($self, $docid) or return;
 	term_generator($self)->set_document($doc);
+
+	# '.' is special for lei_store
 	$doc->add_boolean_term('O'.$eidx_key) if $eidx_key ne '.';
+
 	index_list_id($self, $doc, $eml);
 	$self->{xdb}->replace_document($docid, $doc);
 }
diff --git a/t/eml.t b/t/eml.t
index ebd45c13..0cf48f22 100644
--- a/t/eml.t
+++ b/t/eml.t
@@ -26,6 +26,8 @@ sub mime_load ($) {
 	is($str, "hi\n", '->new modified body like Email::Simple');
 	is($eml->body, "hi\n", '->body works');
 	is($eml->as_string, "a: b\n\nhi\n", '->as_string');
+	my $empty = PublicInbox::Eml->new("\n\n");
+	is($empty->as_string, "\n\n", 'empty message');
 }
 
 for my $cls (@classes) {
diff --git a/t/lei-convert.t b/t/lei-convert.t
index 186cfb13..e147715d 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -60,7 +60,8 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	PublicInbox::MdirReader::maildir_each_eml("$d/md", sub {
 		push @md, $_[2];
 	});
-	is(scalar(@md), scalar(@mboxrd), 'got expected emails in Maildir');
+	is(scalar(@md), scalar(@mboxrd), 'got expected emails in Maildir') or
+		diag $lei_err;
 	@md = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @md;
 	@mboxrd = sort { ${$a->{bdy}} cmp ${$b->{bdy}} } @mboxrd;
 	my @rd_nostatus = map {
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index 917a2c53..e7e14221 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -112,7 +112,51 @@ for my $sfx ('', '.gz') {
 	lei_ok(qw(q -o), "mboxrd:/dev/stdout", qw(m:qp@example.com)) or
 		diag $lei_err;
 	like($lei_out, qr/^Status: OR\n/sm, 'Status set by previous augment');
-}
+} # /mbox + mbox.gz tests
 
-});
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+
+# import keywords-only for external messages:
+$o = "$ENV{HOME}/kwdir";
+my $m = 'alpine.DEB.2.20.1608131214070.4924@example';
+my @inc = ('-I', "$ro_home/t1");
+lei_ok(qw(q -o), $o, "m:$m", @inc);
+
+# emulate MUA marking a Maildir message as read:
+@fn = glob("$o/cur/*");
+scalar(@fn) == 1 or BAIL_OUT "wrote multiple or zero files: ".explain(\@fn);
+rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
+
+lei_ok(qw(q -o), $o, 'bogus', \'clobber output dir to import keywords');
+@fn = glob("$o/cur/*");
+is_deeply(\@fn, [], 'output dir actually clobbered');
+lei_ok('q', "m:$m", @inc);
+my $res = json_utf8->decode($lei_out);
+is_deeply($res->[0]->{kw}, ['seen'], 'seen flag set for external message')
+	or diag explain($res);
+lei_ok('q', "m:$m", '--no-external');
+is_deeply($res = json_utf8->decode($lei_out), [ undef ],
+	'external message not imported') or diag explain($res);
+
+$o = "$ENV{HOME}/kwmboxrd";
+lei_ok(qw(q -o), "mboxrd:$o", "m:$m", @inc);
+
+# emulate MUA marking mboxrd message as unread
+open my $fh, '<', $o or BAIL_OUT;
+my $s = do { local $/; <$fh> };
+$s =~ s/^Status: OR\n/Status: O\nX-Status: A\n/sm or
+	fail "failed to clear R flag in $s";
+open $fh, '>', $o or BAIL_OUT;
+print $fh $s or BAIL_OUT;
+close $fh or BAIL_OUT;
+
+lei_ok(qw(q -o), "mboxrd:$o", 'm:bogus', @inc,
+	\'clobber mbox to import keywords');
+lei_ok(qw(q -o), "mboxrd:$o", "m:$m", @inc);
+open $fh, '<', $o or BAIL_OUT;
+$s = do { local $/; <$fh> };
+like($s, qr/^Status: O\n/ms, 'seen keyword gone in mbox');
+like($s, qr/^X-Status: A\n/ms, 'answered flag set');
+
+}); # test_lei
 done_testing;
diff --git a/t/lei-q-thread.t b/t/lei-q-thread.t
index e24fb2cb..c999d12b 100644
--- a/t/lei-q-thread.t
+++ b/t/lei-q-thread.t
@@ -43,10 +43,11 @@ test_lei(sub {
 		'flagged set in direct hit');
 	lei_ok qw(q -tt m:testmessage@example.com --only), "$ro_home/t2";
 	$res = json_utf8->decode($lei_out);
-	is_deeply($res->[0]->{kw}, [ 'flagged' ],
-		'flagged set on external with -tt');
+	is_deeply($res->[0]->{kw}, [ qw(flagged seen) ],
+		'flagged set on external with -tt') or diag explain($res);
 	lei_ok qw(q -t m:testmessage@example.com --only), "$ro_home/t2";
 	$res = json_utf8->decode($lei_out);
-	ok(!exists($res->[0]->{kw}), 'flagged not set on external with 1 -t');
+	is_deeply($res->[0]->{kw}, [ 'seen' ],
+		'flagged not set on external with 1 -t') or diag explain($res);
 });
 done_testing;


* [PATCH 5/5] lei: tie ALE lifetime to config file
  2021-03-20 10:04 67% [PATCH 0/5] lei: preserve keywords across queries Eric Wong
                   ` (2 preceding siblings ...)
  2021-03-20 10:04 64% ` [PATCH 3/5] lei q: put keywords on one line in --pretty output Eric Wong
@ 2021-03-20 10:04 59% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-20 10:04 UTC (permalink / raw)
  To: meta

This should make a future change to "lei import" work more
nicely, since we'll be needing ALE to vivify external-only
messages upon explicit "lei import".
---
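Not part of the patch: a minimal sketch of the caching idiom used
here, assuming a plain hash stands in for the parsed config object.
My::ALE and ale_for() are hypothetical names for illustration only.
Storing the helper inside the config hash via //= means it is built
once per config and discarded whenever the config is reloaded.

```perl
use strict;
use v5.10.1;

# Hypothetical stand-in for the expensive-to-build helper object
{
	package My::ALE;
	my $built = 0;
	sub new { my ($cls) = @_; bless { id => ++$built }, $cls }
}

sub ale_for {
	my ($cfg) = @_;
	# //= only builds on first use; the object then lives (and dies)
	# with the config hash it is stored in
	$cfg->{ale} //= My::ALE->new;
}

my $cfg = {};
my $first = ale_for($cfg);
my $again = ale_for($cfg);
say $first == $again ? 'same object reused' : 'rebuilt';

$cfg = {}; # emulate the config file being changed and reloaded
say ale_for($cfg) == $first ? 'stale object' : 'fresh object';
```

The second call reuses the cached object; replacing the config hash
forces a fresh one, which is the lifetime coupling described above.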
 lib/PublicInbox/LEI.pm         |  3 +--
 lib/PublicInbox/LeiALE.pm      | 19 ++++++++++++++++---
 lib/PublicInbox/LeiExternal.pm |  6 ------
 lib/PublicInbox/LeiQuery.pm    |  4 ----
 t/lei_xsearch.t                |  2 +-
 5 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0da26a32..72a0e52c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -108,8 +108,7 @@ sub ale {
 	my ($self) = @_;
 	$self->{ale} //= do {
 		require PublicInbox::LeiALE;
-		PublicInbox::LeiALE->new(cache_dir($self).
-					'/all_locals_ever.git');
+		$self->_lei_cfg(1)->{ale} //= PublicInbox::LeiALE->new($self);
 	};
 }
 
diff --git a/lib/PublicInbox/LeiALE.pm b/lib/PublicInbox/LeiALE.pm
index bdb50a1a..45748435 100644
--- a/lib/PublicInbox/LeiALE.pm
+++ b/lib/PublicInbox/LeiALE.pm
@@ -11,16 +11,29 @@ use v5.10.1;
 use parent qw(PublicInbox::LeiSearch PublicInbox::Lock);
 use PublicInbox::Git;
 use PublicInbox::Import;
+use PublicInbox::LeiXSearch;
 use Fcntl qw(SEEK_SET);
 
-sub new {
-	my ($cls, $d) = @_;
+sub _new {
+	my ($d) = @_;
 	PublicInbox::Import::init_bare($d, 'ale');
 	bless {
 		git => PublicInbox::Git->new($d),
 		lock_path => "$d/lei_ale.state", # dual-duty lock + state
 		ibxish => [], # Inbox and ExtSearch (and LeiSearch) objects
-	}, $cls;
+	}, __PACKAGE__
+}
+
+sub new {
+	my ($self, $lei) = @_;
+	ref($self) or $self = _new($lei->cache_dir . '/all_locals_ever.git');
+	my $lxs = PublicInbox::LeiXSearch->new;
+	$lxs->prepare_external($lei->_lei_store(1)->search);
+	for my $loc ($lei->externals_each) { # locals only
+		$lxs->prepare_external($loc) if -d $loc;
+	}
+	$self->refresh_externals($lxs);
+	$self;
 }
 
 sub over {} # undef for xoids_for
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index aa09be9e..b5dd85e1 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -139,12 +139,6 @@ sub add_external_finish {
 	my $key = "external.$location.boost";
 	my $cur_boost = $cfg->{$key};
 	return if defined($cur_boost) && $cur_boost == $new_boost; # idempotent
-	if (-d $location) {
-		require PublicInbox::LeiXSearch;
-		my $lxs = PublicInbox::LeiXSearch->new;
-		$lxs->prepare_external($location);
-		$self->ale->refresh_externals($lxs);
-	}
 	$self->lei_config($key, $new_boost);
 }
 
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 007e35fc..148e8524 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -57,10 +57,6 @@ sub lei_q {
 	}
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
 		$lxs->prepare_external($lse);
-	} else {
-		my $tmp = PublicInbox::LeiXSearch->new;
-		$tmp->prepare_external($lse);
-		$self->ale->refresh_externals($tmp);
 	}
 	if (@only) {
 		for my $loc (@only) {
diff --git a/t/lei_xsearch.t b/t/lei_xsearch.t
index 68211d18..e56b2820 100644
--- a/t/lei_xsearch.t
+++ b/t/lei_xsearch.t
@@ -90,7 +90,7 @@ is($lxs->over, undef, '->over fails');
 	my $mitem = ($mset->items)[0];
 	my $smsg = $lxs->smsg_for($mitem) or BAIL_OUT 'smsg_for broken';
 
-	my $ale = PublicInbox::LeiALE->new("$home/ale");
+	my $ale = PublicInbox::LeiALE::_new("$home/ale");
 	$ale->refresh_externals($lxs);
 	my $exp = [ $smsg->{blob}, 'blob', -s 't/utf8.eml' ];
 	is_deeply([ $ale->git->check($smsg->{blob}) ], $exp, 'ale->git->check');


* [PATCH] lei q: trim JSON output
@ 2021-03-20 12:40 63% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-20 12:40 UTC (permalink / raw)
  To: meta

Stop showing `docid' since it's not useful with shards.

`bytes' and `lines' are probably noise, but could perhaps be
shown in some "fuller" view.  An empty `kw' array is omitted
from the output, too.
---
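Not part of the patch: a rough standalone sketch of the trimming,
assuming a plain hashref instead of a PublicInbox::Smsg object.
trim_smsg() is a hypothetical name; JSON::PP is used only to make
the result visible.

```perl
use strict;
use v5.10.1;
use JSON::PP;

# Hypothetical helper: copy the hash, dropping noisy fields and an
# empty keyword list.
sub trim_smsg {
	my ($smsg) = @_;
	my %tmp = %$smsg; # shallow copy, leave the caller's hash alone
	delete @tmp{qw(num tid lines bytes)};
	my $kw = delete $tmp{kw};
	# the ternary yields an empty list when there are no keywords,
	# so no `kw' key ends up in the result at all
	+{ %tmp, ($kw && scalar(@$kw) ? (kw => $kw) : ()) };
}

my $json = JSON::PP->new->canonical; # sorted keys for stable output
say $json->encode(trim_smsg({ s => 'hi', bytes => 42, kw => [] }));
say $json->encode(trim_smsg({ s => 'hi', lines => 1, kw => ['seen'] }));
```

The first line printed has no `kw' key, the second keeps it, which
matches the t/lei-import.t change below (undef instead of []).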
 lib/PublicInbox/LeiOverview.pm | 8 ++++++--
 lib/PublicInbox/LeiXSearch.pm  | 3 ++-
 t/lei-import.t                 | 2 +-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 521bca50..1ce2a098 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -136,7 +136,10 @@ sub ovv_end {
 sub _unbless_smsg {
 	my ($smsg, $mitem) = @_;
 
-	delete @$smsg{qw(lines bytes num tid)};
+	# TODO: make configurable
+	# num/tid are nonsensical with multi-inbox search,
+	# lines/bytes are not generally useful
+	delete @$smsg{qw(num tid lines bytes)};
 	$smsg->{rt} = _iso8601(delete $smsg->{ts}); # JMAP receivedAt
 	$smsg->{dt} = _iso8601(delete $smsg->{ds}); # JMAP UTCDate
 	$smsg->{pct} = get_pct($mitem) if $mitem;
@@ -151,7 +154,8 @@ sub _unbless_smsg {
 		$smsg->{substr($f, 0, 1)} = pairs($v);
 	}
 	$smsg->{'s'} = delete $smsg->{subject};
-	scalar { %$smsg }; # unbless
+	my $kw = delete($smsg->{kw});
+	scalar { %$smsg, ($kw && scalar(@$kw) ? (kw => $kw) : ()) }; # unbless
 }
 
 sub ovv_atexit_child {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 57717b87..17171a7f 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -70,6 +70,8 @@ sub mitem_kw ($$;$) {
 	my ($smsg, $mitem, $flagged) = @_;
 	my $kw = xap_terms('K', $mitem->get_document);
 	$kw->{flagged} = 1 if $flagged;
+	# we keep the empty array here to prevent expensive work in
+	# ->xsmsg_vmd, _unbless_smsg will clobber it iff it's empty
 	$smsg->{kw} = [ sort keys %$kw ];
 }
 
@@ -85,7 +87,6 @@ sub smsg_for {
 	my $smsg = $ibx->over->get_art($num);
 	return if $smsg->{bytes} == 0;
 	mitem_kw($smsg, $mitem) if $ibx->can('msg_keywords');
-	$smsg->{docid} = $docid;
 	$smsg;
 }
 
diff --git a/t/lei-import.t b/t/lei-import.t
index edb0cd20..e0b517f4 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -48,7 +48,7 @@ lei_ok([qw(import --no-kw -F eml -)], undef, $opt,
 lei(qw(q m:v@y));
 $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
-is_deeply($res->[0]->{kw}, [], 'no keywords set');
+is($res->[0]->{kw}, undef, 'no keywords set');
 
 # see t/lei_to_mail.t for "import -F mbox*"
 });


* [PATCH 0/3] lei import fix, other fixes
@ 2021-03-21  9:50 71% Eric Wong
  2021-03-21  9:50 36% ` [PATCH 1/3] lei import: vivify external-only messages Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2021-03-21  9:50 UTC (permalink / raw)
  To: meta

Things are finally coming together...

Eric Wong (3):
  lei import: vivify external-only messages
  lei q: fix warning on remote imports
  lei: fix some warnings in tests

 lib/PublicInbox/ContentHash.pm | 15 ++++++++---
 lib/PublicInbox/Import.pm      | 14 ++++++++++-
 lib/PublicInbox/LEI.pm         |  8 +++---
 lib/PublicInbox/LeiDedupe.pm   |  9 ++-----
 lib/PublicInbox/LeiExternal.pm |  2 +-
 lib/PublicInbox/LeiImport.pm   | 22 ++++++++++------
 lib/PublicInbox/LeiP2q.pm      |  2 +-
 lib/PublicInbox/LeiSearch.pm   |  5 +++-
 lib/PublicInbox/LeiStore.pm    | 46 +++++++++++++++++++++++++++++-----
 lib/PublicInbox/LeiXSearch.pm  |  6 ++++-
 lib/PublicInbox/MboxLock.pm    |  8 +++---
 lib/PublicInbox/Over.pm        |  2 +-
 lib/PublicInbox/SearchIdx.pm   | 12 +++++++--
 lib/PublicInbox/TestCommon.pm  |  9 +++++++
 t/lei-q-kw.t                   | 44 ++++++++++++++++++++++++++++++++
 t/lei-q-remote-import.t        |  3 ++-
 16 files changed, 167 insertions(+), 40 deletions(-)


* [PATCH 2/3] lei q: fix warning on remote imports
  2021-03-21  9:50 71% [PATCH 0/3] lei import fix, other fixes Eric Wong
  2021-03-21  9:50 36% ` [PATCH 1/3] lei import: vivify external-only messages Eric Wong
@ 2021-03-21  9:50 57% ` Eric Wong
  2021-03-21  9:50 52% ` [PATCH 3/3] lei: fix some warnings in tests Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-21  9:50 UTC (permalink / raw)
  To: meta

This will let us tie keywords from remote externals
to those which only exist in local externals.
---
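Not part of the patch: a small sketch of how a git blob OID is
derived, which is what the new git_sha() helper computes.  This
variant, git_blob_oid(), is a hypothetical name and takes a raw
byte string rather than an Eml object.

```perl
use strict;
use v5.10.1;
use Digest::SHA ();

sub git_blob_oid {
	my ($n, $buf) = @_; # $n: 1 for SHA-1, 256 for SHA-256
	my $dig = Digest::SHA->new($n);
	# git hashes a "blob <size>\0" header followed by the content
	$dig->add('blob '.length($buf)."\0");
	$dig->add($buf);
	$dig->hexdigest;
}

say git_blob_oid(1, ''); # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
say git_blob_oid(1, "hello\n"); # ce013625030ba8dba906f756967f9e9ca394464a
```

Both values agree with `git hash-object' for the same content, so a
message fetched from a remote gets the same OID it would have
once imported.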
 lib/PublicInbox/ContentHash.pm | 15 ++++++++++++---
 lib/PublicInbox/LeiDedupe.pm   |  9 ++-------
 lib/PublicInbox/LeiXSearch.pm  |  6 +++++-
 t/lei-q-remote-import.t        |  3 ++-
 4 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/ContentHash.pm b/lib/PublicInbox/ContentHash.pm
index 4dbe7b50..112b1ea6 100644
--- a/lib/PublicInbox/ContentHash.pm
+++ b/lib/PublicInbox/ContentHash.pm
@@ -8,9 +8,9 @@
 # See L<public-inbox-v2-format(5)> manpage for more details.
 package PublicInbox::ContentHash;
 use strict;
-use warnings;
-use base qw/Exporter/;
-our @EXPORT_OK = qw/content_hash content_digest/;
+use v5.10.1;
+use parent qw(Exporter);
+our @EXPORT_OK = qw(content_hash content_digest git_sha);
 use PublicInbox::MID qw(mids references);
 use PublicInbox::MsgIter;
 
@@ -94,4 +94,13 @@ sub content_hash ($) {
 	content_digest($_[0])->digest;
 }
 
+sub git_sha ($$) {
+	my ($n, $eml) = @_;
+	my $dig = Digest::SHA->new($n);
+	my $buf = $eml->as_string;
+	$dig->add('blob '.length($buf)."\0");
+	$dig->add($buf);
+	$dig;
+}
+
 1;
diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index 5fec9384..a62b3a7c 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -3,7 +3,7 @@
 package PublicInbox::LeiDedupe;
 use strict;
 use v5.10.1;
-use PublicInbox::ContentHash qw(content_hash);
+use PublicInbox::ContentHash qw(content_hash git_sha);
 use Digest::SHA ();
 
 # n.b. mutt sets most of these headers not sure about Bytes
@@ -18,12 +18,7 @@ sub _regen_oid ($) {
 		push @stash, [ $k, \@v ];
 		$eml->header_set($k); # restore below
 	}
-	my $dig = Digest::SHA->new(1); # XXX SHA256 later
-	my $buf = $eml->as_string;
-	$dig->add('blob '.length($buf)."\0");
-	$dig->add($buf);
-	undef $buf;
-
+	my $dig = git_sha(1, $eml);
 	for my $kv (@stash) { # restore stashed headers
 		my ($k, @v) = @$kv;
 		$eml->header_set($k, @v);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 17171a7f..b6aaf3e1 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -18,6 +18,7 @@ use PublicInbox::MID qw(mids);
 use PublicInbox::Smsg;
 use PublicInbox::Eml;
 use Fcntl qw(SEEK_SET F_SETFL O_APPEND O_RDWR);
+use PublicInbox::ContentHash qw(git_sha);
 
 sub new {
 	my ($class) = @_;
@@ -207,10 +208,13 @@ sub query_mset { # non-parallel for non-"--threads" users
 
 sub each_remote_eml { # callback for MboxReader->mboxrd
 	my ($eml, $self, $lei, $each_smsg) = @_;
-	if ($self->{import_sto} && !$lei->{ale}->xoids_for($eml, 1)) {
+	my $xoids = $lei->{ale}->xoids_for($eml, 1);
+	if ($self->{import_sto} && !$xoids) {
 		$self->{import_sto}->ipc_do('add_eml', $eml);
 	}
 	my $smsg = bless {}, 'PublicInbox::Smsg';
+	$smsg->{blob} = $xoids ? (keys(%$xoids))[0]
+				: git_sha(1, $eml)->hexdigest;
 	$smsg->populate($eml);
 	$smsg->parse_references($eml, mids($eml));
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
diff --git a/t/lei-q-remote-import.t b/t/lei-q-remote-import.t
index 25e461ac..93828a24 100644
--- a/t/lei-q-remote-import.t
+++ b/t/lei-q-remote-import.t
@@ -65,8 +65,9 @@ test_lei({ tmpdir => $tmpdir }, sub {
 		$im->add(eml_load('t/utf8.eml')) or BAIL_OUT '->add';
 	};
 	lei_ok(qw(add-external -q), $ibx->{inboxdir});
-	lei_ok(qw(q -o), "mboxrd:$o", '--only', $url,
+	lei_ok(qw(q -q -o), "mboxrd:$o", '--only', $url,
 		'm:testmessage@example.com');
+	is($lei_err, '', 'no warnings or errors');
 	ok(-s $o, 'got result from remote external');
 	my $exp = eml_load('t/utf8.eml');
 	is_deeply($slurp_emls->($o), [$exp], 'got expected result');


* [PATCH 3/3] lei: fix some warnings in tests
  2021-03-21  9:50 71% [PATCH 0/3] lei import fix, other fixes Eric Wong
  2021-03-21  9:50 36% ` [PATCH 1/3] lei import: vivify external-only messages Eric Wong
  2021-03-21  9:50 57% ` [PATCH 2/3] lei q: fix warning on remote imports Eric Wong
@ 2021-03-21  9:50 52% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-21  9:50 UTC (permalink / raw)
  To: meta

Fix several warnings, then test the contents of $lei_err to
ensure they don't happen again.

We'll also make MboxLock emit nicer warnings without the line
number, since the line number is irrelevant to the user fixing
an mbox lock contention problem.

Finally, setting TEST_LEI_ERR_LOUD=1 now makes tests show any
other stderr output from lei via diag.
---
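Not part of the patch: a short sketch of why the MboxLock messages
switch from croak to die with a trailing newline.  Perl only appends
" at FILE line N." to a die message lacking a final "\n".  The
message text and path here are made up for illustration.

```perl
use strict;
use v5.10.1;

# only the trailing "\n" differs between the two messages
eval { die "flock timeout /tmp/mbox" };
my $with_loc = $@;
eval { die "flock timeout /tmp/mbox\n" };
my $plain = $@;

say 'without newline: ',
	($with_loc =~ / at .+ line \d+\.\n\z/ ? 'location appended'
					: 'no location');
say 'with newline: ',
	($plain eq "flock timeout /tmp/mbox\n" ? 'message unchanged'
					: 'message modified');
```

A user hitting lock contention on an mbox sees only the second form,
without a source location that means nothing to them.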
 lib/PublicInbox/LEI.pm         | 8 ++++----
 lib/PublicInbox/LeiExternal.pm | 2 +-
 lib/PublicInbox/LeiP2q.pm      | 2 +-
 lib/PublicInbox/MboxLock.pm    | 8 ++++----
 lib/PublicInbox/TestCommon.pm  | 9 +++++++++
 5 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 72a0e52c..bf97a680 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -757,7 +757,7 @@ sub lei__complete {
 	my ($proto, undef, @spec) = @$info;
 	my $cur = pop @argv;
 	my $re = defined($cur) ? qr/\A\Q$cur\E/ : qr/./;
-	if (substr($cur // '-', 0, 1) eq '-') { # --switches
+	if (substr(my $_cur = $cur // '-', 0, 1) eq '-') { # --switches
 		# gross special case since the only git-config options
 		# Consider moving to a table if we need more special cases
 		# we use Getopt::Long for are the ones we reject, so these
@@ -781,7 +781,7 @@ sub lei__complete {
 			}
 			map {
 				my $x = length > 1 ? "--$_" : "-$_";
-				$x eq $cur ? () : $x;
+				$x eq $_cur ? () : $x;
 			} grep(!/_/, split(/\|/, $_, -1)) # help|h
 		} grep { $OPTDESC{"$_\t$cmd"} || $OPTDESC{$_} } @spec);
 	} elsif ($cmd eq 'config' && !@argv && !$CONFIG_KEYS{$cur}) {
@@ -796,13 +796,13 @@ sub lei__complete {
 			my @v = ref($v) ? split(/\|/, $v->[0]) : ();
 			# get rid of ALL CAPS placeholder (e.g "OUT")
 			# (TODO: completion for external paths)
-			shift(@v) if uc($v[0]) eq $v[0];
+			shift(@v) if scalar(@v) && uc($v[0]) eq $v[0];
 			@v;
 		} grep(/\A(?:[\w-]+\|)*$opt\b.*?(?:\t$cmd)?\z/, keys %OPTDESC);
 	}
 	$cmd =~ tr/-/_/;
 	if (my $sub = $self->can("_complete_$cmd")) {
-		puts $self, $sub->($self, @argv, $cur);
+		puts $self, $sub->($self, @argv, $cur ? ($cur) : ());
 	}
 	# TODO: URLs, pathnames, OIDs, MIDs, etc...  See optparse() for
 	# proto parsing.
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index b5dd85e1..f4e24c2a 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -222,7 +222,7 @@ sub _complete_url_common ($) {
 	# Maybe there's a better way to go about this in
 	# contrib/completion/lei-completion.bash
 	my $re = '';
-	my $cur = pop @$argv;
+	my $cur = pop(@$argv) // '';
 	if (@$argv) {
 		my @x = @$argv;
 		if ($cur eq ':' && @x) {
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index e7ddc852..c5718603 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -144,7 +144,7 @@ sub do_p2q { # via wq_do
 			my $end = ($pfx =~ s/([0-9\*]+)\z//) ? $1 : '';
 			my $x = delete($lei->{qterms}->{$pfx}) or next;
 			my $star = $end =~ tr/*//d ? '*' : '';
-			my $min_len = ($end // 0) + 0;
+			my $min_len = ($end || 0) + 0;
 
 			# no wildcards for bool_pfx_external
 			$star = '' if $pfx =~ /\A(dfpre|dfpost|mid)\z/;
diff --git a/lib/PublicInbox/MboxLock.pm b/lib/PublicInbox/MboxLock.pm
index 4e2a2d9a..bea0e325 100644
--- a/lib/PublicInbox/MboxLock.pm
+++ b/lib/PublicInbox/MboxLock.pm
@@ -43,13 +43,13 @@ EOF
 		}
 		select(undef, undef, undef, $self->{delay});
 	} while (now < $end);
-	croak "fcntl lock $self->{f}: $!";
+	die "fcntl lock timeout $self->{f}: $!\n";
 }
 
 sub acq_dotlock {
 	my ($self) = @_;
 	my $dot_lock = "$self->{f}.lock";
-	my ($pfx, $base) = ($self->{f} =~ m!(\A.*?/)([^/]+)\z!);
+	my ($pfx, $base) = ($self->{f} =~ m!(\A.*?/)?([^/]+)\z!);
 	$pfx //= '';
 	my $pid = $$;
 	my $end = now + $self->{timeout};
@@ -68,7 +68,7 @@ sub acq_dotlock {
 			croak "open $tmp (for $dot_lock): $!" if !$!{EXIST};
 		}
 	} while (now < $end);
-	croak "dotlock $dot_lock";
+	die "dotlock timeout $dot_lock\n";
 }
 
 sub acq_flock {
@@ -80,7 +80,7 @@ sub acq_flock {
 		return if flock($self->{fh}, $op);
 		select(undef, undef, undef, $self->{delay});
 	} while (now < $end);
-	croak "flock $self->{f}: $!";
+	die "flock timeout $self->{f}: $!\n";
 }
 
 sub acq {
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 0d15514e..e67e94ea 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -457,6 +457,15 @@ sub lei (@) {
 	my $res = run_script(['lei', @$cmd], $env, $xopt // $lei_opt);
 	$err_skip and
 		$lei_err = join('', grep(!/$err_skip/, split(/^/m, $lei_err)));
+	if ($lei_err ne '') {
+		if ($lei_err =~ /Use of uninitialized/ ||
+			$lei_err =~ m!\bArgument .*? isn't numeric in !) {
+			fail "lei_err=$lei_err";
+		} else {
+			state $loud = $ENV{TEST_LEI_ERR_LOUD};
+			diag "lei_err=$lei_err" if $loud;
+		}
+	}
 	$res;
 };
 


* [PATCH 1/3] lei import: vivify external-only messages
  2021-03-21  9:50 71% [PATCH 0/3] lei import fix, other fixes Eric Wong
@ 2021-03-21  9:50 36% ` Eric Wong
  2021-03-21  9:50 57% ` [PATCH 2/3] lei q: fix warning on remote imports Eric Wong
  2021-03-21  9:50 52% ` [PATCH 3/3] lei: fix some warnings in tests Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-21  9:50 UTC (permalink / raw)
  To: meta

Keyword storage for external-only messages was preventing
messages from being explicitly imported.  Teach lei_store
to vivify keyword-only entries into fully-indexed messages
on import.
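
A rough sketch of the decision described above, not the actual
implementation.  classify_blob() and the in-memory $over hash are
hypothetical stand-ins for the over.sqlite3 lookups; only the
"bytes > 0 means already imported" rule is taken from the patch.

```perl
use strict;
use warnings;

# $over: hypothetical in-memory overview, docid => { bytes => N }
# @docids: docids already known for the blob being imported
# Returns 'duplicate', 'new', or an arrayref of docids to vivify.
sub classify_blob {
	my ($over, @docids) = @_;
	my @vivify;
	for my $id (@docids) {
		my $cur = $over->{$id} or next; # row vanished (the patch warns)
		# a non-zero size means the message was fully imported before
		return 'duplicate' if $cur->{bytes} > 0;
		push @vivify, $id; # keyword-only placeholder entry
	}
	@vivify ? \@vivify : 'new';
}
```

Placeholder entries keep their docid, so keywords stored against
that docid are not orphaned once the full message is indexed.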
---
 lib/PublicInbox/Import.pm    | 14 ++++++++++-
 lib/PublicInbox/LeiImport.pm | 22 +++++++++++------
 lib/PublicInbox/LeiSearch.pm |  5 +++-
 lib/PublicInbox/LeiStore.pm  | 46 +++++++++++++++++++++++++++++++-----
 lib/PublicInbox/Over.pm      |  2 +-
 lib/PublicInbox/SearchIdx.pm | 12 ++++++++--
 t/lei-q-kw.t                 | 44 ++++++++++++++++++++++++++++++++++
 7 files changed, 127 insertions(+), 18 deletions(-)

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index b8fa5c21..34738279 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -413,7 +413,19 @@ sub add {
 		$smsg->{blob} = $self->get_mark(":$blob");
 		$smsg->set_bytes($raw_email, $n);
 		if (my $oidx = delete $smsg->{-oidx}) { # used by LeiStore
-			return if $oidx->blob_exists($smsg->{blob});
+			my @docids = $oidx->blob_exists($smsg->{blob});
+			my @vivify_xvmd;
+			for my $id (@docids) {
+				if (my $cur = $oidx->get_art($id)) {
+					# already imported if bytes > 0
+					return if $cur->{bytes} > 0;
+					push @vivify_xvmd, $id;
+				} else {
+					warn "W: $smsg->{blob} ",
+						"#$id gone (bug?)\n";
+				}
+			}
+			$smsg->{-vivify_xvmd} = \@vivify_xvmd;
 		}
 	}
 	my $ref = $self->{ref};
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 137c22fc..ae24a1fa 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -10,9 +10,14 @@ use PublicInbox::Eml;
 use PublicInbox::PktOp qw(pkt_do);
 
 sub _import_eml { # MboxReader callback
-	my ($eml, $sto, $set_kw) = @_;
-	$sto->ipc_do('set_eml', $eml, $set_kw ?
-		{ kw => PublicInbox::MboxReader::mbox_keywords($eml) } : ());
+	my ($eml, $lei, $mbox_keywords) = @_;
+	my $vmd;
+	if ($mbox_keywords) {
+		my $kw = $mbox_keywords->($eml);
+		$vmd = { kw => $kw } if scalar(@$kw);
+	}
+	my $xoids = $lei->{ale}->xoids_for($eml);
+	$lei->{sto}->ipc_do('set_eml', $eml, $vmd, $xoids);
 }
 
 sub import_done_wait { # dwaitpid callback
@@ -41,6 +46,7 @@ sub net_merge_complete { # callback used by LeiAuth
 sub import_start {
 	my ($lei) = @_;
 	my $self = $lei->{imp};
+	$lei->ale;
 	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $net = $lei->{net}) {
 		# $j = $net->net_concurrency($j); TODO
@@ -130,7 +136,8 @@ sub ipc_atfork_child {
 
 sub _import_fh {
 	my ($lei, $fh, $input, $ifmt) = @_;
-	my $set_kw = $lei->{opt}->{kw};
+	my $kw = $lei->{opt}->{kw} ?
+		PublicInbox::MboxReader->can('mbox_keywords') : undef;
 	eval {
 		if ($ifmt eq 'eml') {
 			my $buf = do { local $/; <$fh> } //
@@ -138,11 +145,11 @@ sub _import_fh {
 error reading $input: $!
 
 			my $eml = PublicInbox::Eml->new(\$buf);
-			_import_eml($eml, $lei->{sto}, $set_kw);
+			_import_eml($eml, $lei, $kw);
 		} else { # some mbox (->can already checked in call);
 			my $cb = PublicInbox::MboxReader->can($ifmt) //
 				die "BUG: bad fmt=$ifmt";
-			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
+			$cb->(undef, $fh, \&_import_eml, $lei, $kw);
 		}
 	};
 	$lei->child_error(1 << 8, "$input: $@") if $@;
@@ -193,7 +200,8 @@ EOM
 sub import_stdin {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'in-format'});
+	my $in = delete $self->{0};
+	_import_fh($lei, $in, '<stdin>', $lei->{opt}->{'in-format'});
 }
 
 no warnings 'once'; # the following works even when LeiAuth is lazy-loaded
diff --git a/lib/PublicInbox/LeiSearch.pm b/lib/PublicInbox/LeiSearch.pm
index 360a37e5..bbb00661 100644
--- a/lib/PublicInbox/LeiSearch.pm
+++ b/lib/PublicInbox/LeiSearch.pm
@@ -63,7 +63,10 @@ sub _cmp_1st { # git->cat_async callback
 	}
 }
 
-sub xoids_for { # returns { OID => docid } mapping for $eml matches
+# returns { OID => num } mapping for $eml matches
+# The `num' hash value only makes sense from LeiSearch itself
+# and is nonsense from the PublicInbox::LeiALE subclass
+sub xoids_for {
 	my ($self, $eml, $min) = @_;
 	my ($chash, $mids) = content_key($eml);
 	my @overs = ($self->over // $self->overs_all);
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index c66d3dc2..b390b318 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -161,7 +161,7 @@ sub remove_eml_vmd {
 }
 
 sub add_eml {
-	my ($self, $eml, $vmd) = @_;
+	my ($self, $eml, $vmd, $xoids) = @_;
 	my $im = $self->importer; # may create new epoch
 	my $eidx = eidx_init($self); # writes ALL.git/objects/info/alternates
 	my $oidx = $eidx->{oidx}; # PublicInbox::Import::add checks this
@@ -169,7 +169,40 @@ sub add_eml {
 	$im->add($eml, undef, $smsg) or return; # duplicate returns undef
 
 	local $self->{current_info} = $smsg->{blob};
-	if (my @docids = _docids_for($self, $eml)) {
+	my $vivify_xvmd = delete($smsg->{-vivify_xvmd}) // []; # exact matches
+	if ($xoids) { # fuzzy matches from externals in ale->xoids_for
+		delete $xoids->{$smsg->{blob}}; # added later
+		if (scalar keys %$xoids) {
+			my %docids = map { $_ => 1 } @$vivify_xvmd;
+			for my $oid (keys %$xoids) {
+				my @id = $oidx->blob_exists($oid);
+				@docids{@id} = @id;
+			}
+			@$vivify_xvmd = sort { $a <=> $b } keys(%docids);
+		}
+	}
+	if (@$vivify_xvmd) {
+		$xoids //= {};
+		$xoids->{$smsg->{blob}} = 1;
+		for my $docid (@$vivify_xvmd) {
+			my $cur = $oidx->get_art($docid);
+			my $idx = $eidx->idx_shard($docid);
+			if (!$cur || $cur->{bytes} == 0) { # really vivifying
+				$smsg->{num} = $docid;
+				$oidx->add_overview($eml, $smsg);
+				$smsg->{-merge_vmd} = 1;
+				$idx->index_eml($eml, $smsg);
+			} else { # lse fuzzy hit off ale
+				$idx->ipc_do('add_eidx_info', $docid, '.', $eml);
+			}
+			for my $oid (keys %$xoids) {
+				$oidx->add_xref3($docid, -1, $oid, '.');
+			}
+			$idx->ipc_do('add_vmd', $docid, $vmd) if $vmd;
+		}
+		$vivify_xvmd;
+	} elsif (my @docids = _docids_for($self, $eml)) {
+		# fuzzy match from within lei/store
 		for my $docid (@docids) {
 			my $idx = $eidx->idx_shard($docid);
 			$oidx->add_xref3($docid, -1, $smsg->{blob}, '.');
@@ -178,20 +211,21 @@ sub add_eml {
 			$idx->ipc_do('add_vmd', $docid, $vmd) if $vmd;
 		}
 		\@docids;
-	} else {
+	} else { # totally new message
 		$smsg->{num} = $oidx->adj_counter('eidx_docid', '+');
 		$oidx->add_overview($eml, $smsg);
 		$oidx->add_xref3($smsg->{num}, -1, $smsg->{blob}, '.');
 		my $idx = $eidx->idx_shard($smsg->{num});
 		$idx->index_eml($eml, $smsg);
-		$idx->ipc_do('add_vmd', $smsg->{num}, $vmd ) if $vmd;
+		$idx->ipc_do('add_vmd', $smsg->{num}, $vmd) if $vmd;
 		$smsg;
 	}
 }
 
 sub set_eml {
-	my ($self, $eml, $vmd) = @_;
-	add_eml($self, $eml, $vmd) // set_eml_vmd($self, $eml, $vmd);
+	my ($self, $eml, $vmd, $xoids) = @_;
+	add_eml($self, $eml, $vmd, $xoids) //
+		set_eml_vmd($self, $eml, $vmd);
 }
 
 # set or update keywords for external message, called via ipc_do
diff --git a/lib/PublicInbox/Over.pm b/lib/PublicInbox/Over.pm
index 587e0516..0e191c47 100644
--- a/lib/PublicInbox/Over.pm
+++ b/lib/PublicInbox/Over.pm
@@ -353,7 +353,7 @@ sub blob_exists {
 	my ($self, $oidhex) = @_;
 	if (wantarray) {
 		my $sth = $self->dbh->prepare_cached(<<'', undef, 1);
-SELECT docid FROM xref3 WHERE oidbin = ?
+SELECT docid FROM xref3 WHERE oidbin = ? ORDER BY docid ASC
 
 		$sth->bind_param(1, pack('H*', $oidhex), SQL_BLOB);
 		$sth->execute;
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 3237aadc..3f933121 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -11,6 +11,7 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::Search PublicInbox::Lock Exporter);
 use PublicInbox::Eml;
+use PublicInbox::Search qw(xap_terms);
 use PublicInbox::InboxWritable;
 use PublicInbox::MID qw(mids_for_index mids);
 use PublicInbox::MsgIter;
@@ -34,6 +35,7 @@ use constant DEBUG => !!$ENV{DEBUG};
 my $xapianlevels = qr/\A(?:full|medium)\z/;
 my $hex = '[a-f0-9]';
 my $OID = $hex .'{40,}';
+my @VMD_MAP = (kw => 'K', label => 'L');
 our $INDEXLEVELS = qr/\A(?:full|medium|basic)\z/;
 
 sub new {
@@ -428,7 +430,15 @@ sub eml2doc ($$$;$) {
 sub add_xapian ($$$$) {
 	my ($self, $eml, $smsg, $mids) = @_;
 	begin_txn_lazy($self);
+	my $merge_vmd = delete $smsg->{-merge_vmd};
 	my $doc = eml2doc($self, $eml, $smsg, $mids);
+	if (my $old = $merge_vmd ? _get_doc($self, $smsg->{num}) : undef) {
+		my @x = @VMD_MAP;
+		while (my ($field, $pfx) = splice(@x, 0, 2)) {
+			my $vals = xap_terms($pfx, $old);
+			$doc->add_boolean_term($pfx.$_) for keys %$vals;
+		}
+	}
 	$self->{xdb}->replace_document($smsg->{num}, $doc);
 }
 
@@ -531,8 +541,6 @@ sub remove_eidx_info {
 	$self->{xdb}->replace_document($docid, $doc);
 }
 
-my @VMD_MAP = (kw => 'K', label => 'L');
-
 sub set_vmd {
 	my ($self, $docid, $vmd) = @_;
 	begin_txn_lazy($self);
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index b5e22e9b..4db27363 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -161,5 +161,49 @@ like($s, qr/^Status: O\nX-Status: AF\n/ms,
 lei_ok(qw(q --pretty), "m:$m", @inc);
 like($lei_out, qr/^  "kw": \["answered", "flagged"\],\n/sm,
 	'--pretty JSON output shows kw: on one line');
+
+# ensure import on previously external-only message works
+lei_ok('q', "m:$m");
+is_deeply(json_utf8->decode($lei_out), [ undef ],
+	'to-be-imported message non-existent');
+lei_ok(qw(import -F eml t/x-unknown-alpine.eml));
+is($lei_err, '', 'no errors importing previous external-only message');
+lei_ok('q', "m:$m");
+$res = json_utf8->decode($lei_out);
+is($res->[1], undef, 'got one result');
+is_deeply($res->[0]->{kw}, [ qw(answered flagged) ], 'kw preserved on exact');
+
+# ensure fuzzy match import works, too
+$m = 'multipart@example.com';
+$o = "$ENV{HOME}/fuzz";
+lei_ok('q', '-o', $o, "m:$m", @inc);
+@fn = glob("$o/cur/*");
+scalar(@fn) == 1 or BAIL_OUT "wrote multiple or zero files: ".explain(\@fn);
+rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
+lei_ok('q', '-o', $o, "m:$m");
+is_deeply([glob("$o/cur/*")], [], 'clobbered output results');
+my $eml = eml_load('t/plack-2-txt-bodies.eml');
+$eml->header_set('List-Id', '<list.example.com>');
+my $in = $eml->as_string;
+lei_ok([qw(import -F eml --stdin)], undef, { 0 => \$in, %$lei_opt });
+is($lei_err, '', 'no errors from import');
+lei_ok(qw(q -f mboxrd), "m:$m");
+open $fh, '<', \$lei_out or BAIL_OUT $!;
+my @res;
+PublicInbox::MboxReader->mboxrd($fh, sub { push @res, shift });
+is($res[0]->header('Status'), 'RO', 'seen kw set');
+$res[0]->header_set('Status');
+is_deeply(\@res, [ $eml ], 'imported message matches w/ List-Id');
+
+$eml->header_set('List-Id', '<another.example.com>');
+$in = $eml->as_string;
+lei_ok([qw(import -F eml --stdin)], undef, { 0 => \$in, %$lei_opt });
+is($lei_err, '', 'no errors from 2nd import');
+lei_ok(qw(q -f mboxrd), "m:$m", 'l:another.example.com');
+my @another;
+open $fh, '<', \$lei_out or BAIL_OUT $!;
+PublicInbox::MboxReader->mboxrd($fh, sub { push @another, shift });
+is($another[0]->header('Status'), 'RO', 'seen kw set');
+
 }); # test_lei
 done_testing;


* [PATCH] lei: simplify lazy-loading
@ 2021-03-21 11:24 59% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-21 11:24 UTC (permalink / raw)
  To: meta

This makes it slightly easier to implement future commands,
since there'll be a couple more relatively self-contained
ones.
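
For illustration, the lookup order could be sketched as below.
The Demo::* packages, find_cb() and the one-entry %CMD are
hypothetical; the real dispatch lives in PublicInbox::LEI and does
option parsing afterwards.

```perl
use strict;
use warnings;

# hypothetical table of known commands (the real %CMD is much larger)
my %CMD = (frob => 1);

{ # stand-in for a command module normally loaded from its own file
	package Demo::LeiFrob;
	$INC{'Demo/LeiFrob.pm'} = __FILE__; # pretend it is already loaded
	sub lei_frob { my ($lei, @argv) = @_; "frob(@argv)" }
}

{
	package Demo::LEI;
	sub lei_help { 'help' } # a command defined in the main package

	# returns a code ref for $cmd, or undef if nothing handles it
	sub find_cb {
		my ($cmd) = @_;
		(my $func = "lei_$cmd") =~ tr/-/_/;
		my $cb = __PACKAGE__->can($func); # 1) sub in this package
		return $cb if $cb;
		return undef unless $CMD{$cmd}; # 2) only known commands
		my $mod = "Demo::Lei\u$cmd"; # frob => Demo::LeiFrob
		# string eval since the module name is only known at runtime
		eval("require $mod; 1") ? $mod->can($func) : undef;
	}
}
```

With this shape, a new self-contained command only needs a %CMD
entry and a module defining the matching lei_* sub.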
---
 lib/PublicInbox/LEI.pm        | 22 ++++++----------------
 lib/PublicInbox/LeiConvert.pm |  6 +++---
 lib/PublicInbox/LeiImport.pm  |  6 +++---
 lib/PublicInbox/LeiP2q.pm     |  6 +++---
 4 files changed, 15 insertions(+), 25 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index bf97a680..b6d21af6 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -598,7 +598,12 @@ sub dispatch {
 	}
 	my $func = "lei_$cmd";
 	$func =~ tr/-/_/;
-	if (my $cb = __PACKAGE__->can($func)) {
+	my $cb = __PACKAGE__->can($func) // ($CMD{$cmd} ? do {
+		my $mod = "PublicInbox::Lei\u$cmd";
+		($INC{"PublicInbox/Lei\u$cmd.pm"} //
+			eval("require $mod")) ? $mod->can($func) : undef;
+	} : undef);
+	if ($cb) {
 		optparse($self, $cmd, \@argv) or return;
 		if (my $chdir = $self->{opt}->{C}) {
 			for my $d (@$chdir) {
@@ -685,21 +690,6 @@ sub lei_config {
 	x_it($self, $?) if $?;
 }
 
-sub lei_import {
-	require PublicInbox::LeiImport;
-	PublicInbox::LeiImport->call(@_);
-}
-
-sub lei_convert {
-	require PublicInbox::LeiConvert;
-	PublicInbox::LeiConvert->call(@_);
-}
-
-sub lei_p2q {
-	require PublicInbox::LeiP2q;
-	PublicInbox::LeiP2q->call(@_);
-}
-
 sub lei_init {
 	my ($self, $dir) = @_;
 	my $cfg = _lei_cfg($self, 1);
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index fcc67f0b..8d3b221a 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -77,11 +77,11 @@ sub do_convert { # via wq_do
 	delete $self->{wcb}; # commit
 }
 
-sub call { # the main "lei convert" method
-	my ($cls, $lei, @inputs) = @_;
+sub lei_convert { # the main "lei convert" method
+	my ($lei, @inputs) = @_;
 	my $opt = $lei->{opt};
 	$opt->{kw} //= 1;
-	my $self = $lei->{cnv} = bless {}, $cls;
+	my $self = $lei->{cnv} = bless {}, __PACKAGE__;
 	my $in_fmt = $opt->{'in-format'};
 	my (@f, @d);
 	$opt->{dedupe} //= 'none';
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index ae24a1fa..0e2a96e8 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -63,13 +63,13 @@ sub import_start {
 	while ($op && $op->{sock}) { $op->event_step }
 }
 
-sub call { # the main "lei import" method
-	my ($cls, $lei, @inputs) = @_;
+sub lei_import { # the main "lei import" method
+	my ($lei, @inputs) = @_;
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
 	my ($net, @f, @d);
 	$lei->{opt}->{kw} //= 1;
-	my $self = $lei->{imp} = bless { inputs => \@inputs }, $cls;
+	my $self = $lei->{imp} = bless { inputs => \@inputs }, __PACKAGE__;
 	if ($lei->{opt}->{stdin}) {
 		@inputs and return $lei->fail("--stdin and @inputs do not mix");
 		$lei->check_input_format or return;
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index c5718603..302d7864 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -174,9 +174,9 @@ sub do_p2q { # via wq_do
 	$lei->out(@q, "\n");
 }
 
-sub call { # the "lei patch-to-query" entry point
-	my ($cls, $lei, $input) = @_;
-	my $self = $lei->{p2q} = bless {}, $cls;
+sub lei_p2q { # the "lei patch-to-query" entry point
+	my ($lei, $input) = @_;
+	my $self = $lei->{p2q} = bless {}, __PACKAGE__;
 	if ($lei->{opt}->{stdin}) {
 		$self->{0} = delete $lei->{0}; # guard from lei_atfork_child
 	} else {


* [PATCH 0/8] lei input handling improvements
@ 2021-03-22  7:53 70% Eric Wong
  2021-03-22  7:53 33% ` [PATCH 1/8] lei: support -c <name>=<value> to overrides Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-03-22  7:53 UTC (permalink / raw)
  To: meta

lei <convert|import> share a bit more code, now; and being
able to set "-c imap.debug" on the command-line should make
future work easier.

All this should set us up nicely for implementing "lei mark"
to add/remove keywords and labels.

Eric Wong (8):
  lei: support -c <name>=<value> to overrides
  net_reader: escape nasty chars from Net::NNTP->message
  lei: share input code between convert and import
  lei: simplify workers_start and callers
  mbox_reader: add ->reads method to avoid nonsensical formats
  lei_input: common filehandle reader for eml + mbox
  lei_input: drop "From " line on single "eml" (message/rfc822)
  lei import: ignore Status headers in "eml" messages

 MANIFEST                         |   1 +
 lib/PublicInbox/InboxWritable.pm |   2 +-
 lib/PublicInbox/LEI.pm           | 137 ++++++++++++++++++-------------
 lib/PublicInbox/LeiConvert.pm    |  94 ++++-----------------
 lib/PublicInbox/LeiExternal.pm   |   2 +-
 lib/PublicInbox/LeiImport.pm     | 107 +++++-------------------
 lib/PublicInbox/LeiInput.pm      | 106 ++++++++++++++++++++++++
 lib/PublicInbox/LeiP2q.pm        |   4 +-
 lib/PublicInbox/MboxReader.pm    |   5 ++
 lib/PublicInbox/NetReader.pm     |  10 ++-
 t/lei-import.t                   |  37 +++++++--
 t/lei.t                          |   9 ++
 12 files changed, 278 insertions(+), 236 deletions(-)
 create mode 100644 lib/PublicInbox/LeiInput.pm


* [PATCH 4/8] lei: simplify workers_start and callers
  2021-03-22  7:53 70% [PATCH 0/8] lei input handling improvements Eric Wong
  2021-03-22  7:53 33% ` [PATCH 1/8] lei: support -c <name>=<value> to overrides Eric Wong
  2021-03-22  7:53 38% ` [PATCH 3/8] lei: share input code between convert and import Eric Wong
@ 2021-03-22  7:53 64% ` Eric Wong
  2021-03-22  7:54 55% ` [PATCH 8/8] lei import: ignore Status headers in "eml" messages Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-22  7:53 UTC (permalink / raw)
  To: meta

Since workers_start is in the common PublicInbox::LEI
package, we can just use \&METHOD_NAME instead of relying
on UNIVERSAL->can to avoid a method dispatch.

Most of our worker code can just use lei->dclose, so default
to doing that unless it's been overridden.
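
A minimal sketch of both points, using a hypothetical Demo::Worker
package in place of PublicInbox::LEI; build_ops() and run_op() are
made-up names and the handlers only log what they were asked to do.

```perl
use strict;
use v5.10.1;
use warnings;

{
	package Demo::Worker; # hypothetical stand-in for PublicInbox::LEI

	sub new { bless { log => [] }, shift }
	sub fail_handler { push @{$_[0]->{log}}, 'fail' }
	sub dclose { push @{$_[0]->{log}}, 'dclose' }

	sub build_ops {
		my ($self, $ops) = @_;
		$ops = {
			# \&NAME is a plain reference to the sub in this
			# package; no method resolution is involved
			'!' => [ \&fail_handler, $self ],
			($ops ? %$ops : ()), # caller entries override defaults
		};
		# default completion handler unless the caller supplied one
		$ops->{''} //= [ \&dclose, $self ];
		$ops;
	}
}

# run the handler registered under $key with its stored arguments
sub run_op {
	my ($ops, $key) = @_;
	my ($cb, @args) = @{$ops->{$key}};
	$cb->(@args);
}
```

Callers which are happy with the default can then omit the ops
hash entirely, as the LeiConvert and LeiP2q hunks below do.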
---
 lib/PublicInbox/LEI.pm        | 11 ++++++-----
 lib/PublicInbox/LeiConvert.pm |  4 +---
 lib/PublicInbox/LeiP2q.pm     |  4 +---
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0bd52a46..1e720b89 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -481,12 +481,13 @@ sub lei_atfork_child {
 sub workers_start {
 	my ($lei, $wq, $ident, $jobs, $ops) = @_;
 	$ops = {
-		'!' => [ $lei->can('fail_handler'), $lei ],
-		'|' => [ $lei->can('sigpipe_handler'), $lei ],
-		'x_it' => [ $lei->can('x_it'), $lei ],
-		'child_error' => [ $lei->can('child_error'), $lei ],
-		%$ops
+		'!' => [ \&fail_handler, $lei ],
+		'|' => [ \&sigpipe_handler, $lei ],
+		'x_it' => [ \&x_it, $lei ],
+		'child_error' => [ \&child_error, $lei ],
+		($ops ? %$ops : ()),
 	};
+	$ops->{''} //= [ \&dclose, $lei ];
 	require PublicInbox::PktOp;
 	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	$wq->wq_workers_start($ident, $jobs, $lei->oldset, { lei => $lei });
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 0aa13229..8685c194 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -87,9 +87,7 @@ sub lei_convert { # the main "lei convert" method
 		$lei->fail("output not specified or is not a mail destination");
 	$lei->{opt}->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
 	$self->prepare_inputs($lei, \@inputs) or return;
-	my $op = $lei->workers_start($self, 'lei_convert', 1, {
-		'' => [ $lei->can('dclose'), $lei ]
-	});
+	my $op = $lei->workers_start($self, 'lei_convert', 1);
 	$self->wq_io_do('do_convert', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index 302d7864..4abe1345 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -182,9 +182,7 @@ sub lei_p2q { # the "lei patch-to-query" entry point
 	} else {
 		$self->{input} = $input;
 	}
-	my $op = $lei->workers_start($self, 'lei patch2query', 1, {
-		'' => [ $lei->{p2q_done} // $lei->can('dclose'), $lei ]
-	});
+	my $op = $lei->workers_start($self, 'lei patch2query', 1);
 	$self->wq_io_do('do_p2q', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }


* [PATCH 8/8] lei import: ignore Status headers in "eml" messages
  2021-03-22  7:53 70% [PATCH 0/8] lei input handling improvements Eric Wong
                   ` (2 preceding siblings ...)
  2021-03-22  7:53 64% ` [PATCH 4/8] lei: simplify workers_start and callers Eric Wong
@ 2021-03-22  7:54 55% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-22  7:54 UTC (permalink / raw)
  To: meta

Those headers only have meaning for mboxes.  Don't surprise
users by trying to make sense of a header that is only defined for mboxes.

It's possible to send email with (Status|X-Status) headers and
have those headers show up in a recipient's IMAP mailbox.

This was bad because an IMAP user may want to import a single
message through their MUA and pipe its contents to "lei import"
without noticing a mischievous sender stuck "X-Status: F"
(flagged/important) in there.
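
The resulting behavior can be sketched roughly as follows.  The
helper names, the plain hashref of headers and the letter-to-keyword
table are assumptions for illustration; the real parsing is done by
PublicInbox::MboxReader::mbox_keywords on an Eml object.

```perl
use strict;
use warnings;

# assumed mapping of mbox status letters to keywords (illustrative)
my %LETTER2KW = (R => 'seen', A => 'answered', F => 'flagged', T => 'draft');

# hypothetical helper: collect keywords from a hashref of headers
sub status_keywords {
	my ($hdr) = @_;
	my %kw;
	for my $h (qw(Status X-Status)) {
		defined(my $v = $hdr->{$h}) or next;
		for my $c (split(//, $v)) {
			my $k = $LETTER2KW{$c} or next; # unknown letters ignored
			$kw{$k} = 1;
		}
	}
	[ sort keys %kw ];
}

# hypothetical entry point: only mbox formats trust Status headers
sub import_vmd {
	my ($fmt, $hdr, $want_kw) = @_;
	return undef if $fmt eq 'eml' || !$want_kw;
	my $kw = status_keywords($hdr);
	@$kw ? { kw => $kw } : undef;
}
```

A single message piped in as "eml" thus never gains keywords from
headers chosen by the sender, while mbox imports keep honoring them
unless --no-kw is given.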
---
 lib/PublicInbox/LeiImport.pm | 14 +++++++-------
 t/lei-import.t               | 27 ++++++++++++++++++++++-----
 2 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 767cae60..9ad2ff12 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -10,19 +10,19 @@ use PublicInbox::Eml;
 use PublicInbox::PktOp qw(pkt_do);
 
 sub eml_cb { # used by PublicInbox::LeiInput::input_fh
-	my ($self, $eml) = @_;
-	my $vmd;
-	if ($self->{-import_kw}) { # FIXME
-		my $kw = PublicInbox::MboxReader::mbox_keywords($eml);
-		$vmd = { kw => $kw } if scalar(@$kw);
-	}
+	my ($self, $eml, $vmd) = @_;
 	my $xoids = $self->{lei}->{ale}->xoids_for($eml);
 	$self->{lei}->{sto}->ipc_do('set_eml', $eml, $vmd, $xoids);
 }
 
 sub mbox_cb { # MboxReader callback used by PublicInbox::LeiInput::input_fh
 	my ($eml, $self) = @_;
-	eml_cb($self, $eml);
+	my $vmd;
+	if ($self->{-import_kw}) {
+		my $kw = PublicInbox::MboxReader::mbox_keywords($eml);
+		$vmd = { kw => $kw } if scalar(@$kw);
+	}
+	eml_cb($self, $eml, $vmd);
 }
 
 sub import_done_wait { # dwaitpid callback
diff --git a/t/lei-import.t b/t/lei-import.t
index eef1e4e2..a697d756 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -39,27 +39,44 @@ lei_ok(qw(q m:testmessage@example.com));
 is(json_utf8->decode($lei_out)->[0]->{'blob'}, $oid,
 	'got expected OID w/o From');
 
-my $str = <<'';
+my $eml_str = <<'';
 From: a@b
 Message-ID: <x@y>
 Status: RO
 
-my $opt = { %$lei_opt, 0 => \$str };
+my $opt = { %$lei_opt, 0 => \$eml_str };
 lei_ok([qw(import -F eml -)], undef, $opt,
 	\'import single file with keywords from stdin');
 lei_ok(qw(q m:x@y));
 my $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
-is_deeply($res->[0]->{kw}, ['seen'], "message `seen' keyword set");
+is($res->[0]->{'m'}, 'x@y', 'got expected message');
+is($res->[0]->{kw}, undef, 'Status ignored for eml');
+lei_ok(qw(q -f mboxrd m:x@y));
+unlike($lei_out, qr/^Status:/, 'no Status: in imported message');
 
-$str =~ tr/x/v/; # v@y
-lei_ok([qw(import --no-kw -F eml -)], undef, $opt,
+
+$eml->header_set('Message-ID', '<v@y>');
+$eml->header_set('Status', 'RO');
+$in = 'From v@y Fri Oct  2 00:00:00 1993'."\n".$eml->as_string;
+lei_ok([qw(import --no-kw -F mboxrd -)], undef, { %$lei_opt, 0 => \$in },
 	\'import single file with --no-kw from stdin');
 lei(qw(q m:v@y));
 $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
+is($res->[0]->{'m'}, 'v@y', 'got expected message');
 is($res->[0]->{kw}, undef, 'no keywords set');
 
+$eml->header_set('Message-ID', '<k@y>');
+$in = 'From k@y Fri Oct  2 00:00:00 1993'."\n".$eml->as_string;
+lei_ok([qw(import -F mboxrd -)], undef, { %$lei_opt, 0 => \$in },
+	\'import single file with --kw (default) from stdin');
+lei(qw(q m:k@y));
+$res = json_utf8->decode($lei_out);
+is($res->[1], undef, 'only one result');
+is($res->[0]->{'m'}, 'k@y', 'got expected message');
+is_deeply($res->[0]->{kw}, ['seen'], "`seen' keywords set");
+
 # see t/lei_to_mail.t for "import -F mbox*"
 });
 done_testing;


* [PATCH 3/8] lei: share input code between convert and import
  2021-03-22  7:53 70% [PATCH 0/8] lei input handling improvements Eric Wong
  2021-03-22  7:53 33% ` [PATCH 1/8] lei: support -c <name>=<value> to overrides Eric Wong
@ 2021-03-22  7:53 38% ` Eric Wong
  2021-03-22  7:53 64% ` [PATCH 4/8] lei: simplify workers_start and callers Eric Wong
  2021-03-22  7:54 55% ` [PATCH 8/8] lei import: ignore Status headers in "eml" messages Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-22  7:53 UTC (permalink / raw)
  To: meta

These commands accept mail the same way, and this forces
us to maintain consistent input format support between
commands.

We'll be using this for "lei mark", too.
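
As a rough sketch of the shape this takes (the Demo::* packages and
the simplified prepare_inputs signature are hypothetical; the real
method also validates files, directories and network errors):

```perl
use strict;
use v5.10.1;
use warnings;

{
	package Demo::Input; # hypothetical stand-in for PublicInbox::LeiInput

	# classify each input; returns true on success, undef on conflict
	sub prepare_inputs {
		my ($self, $in_fmt, $inputs) = @_;
		my (@net, @local);
		for my $input (@$inputs) {
			my $path = $input;
			if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
				push @net, $input;
			} elsif ($path =~ s/\A([a-z0-9]+)://is) {
				my $ifmt = lc $1;
				# an explicit --in-format must agree with the prefix
				return undef if ($in_fmt // $ifmt) ne $ifmt;
				push @local, [ $ifmt, $path ];
			} else {
				push @local, [ $in_fmt, $path ];
			}
		}
		@$self{qw(net local)} = (\@net, \@local);
		1;
	}
}

{
	package Demo::Import;
	our @ISA = ('Demo::Input'); # the patch uses "use parent" instead
	sub new { bless {}, shift }
}

{
	package Demo::Convert;
	our @ISA = ('Demo::Input');
	sub new { bless {}, shift }
}
```

Both commands go through the same checks, so a format accepted by
one cannot be silently rejected by the other.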
---
 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        | 17 -------
 lib/PublicInbox/LeiConvert.pm | 60 +++----------------------
 lib/PublicInbox/LeiImport.pm  | 57 ++---------------------
 lib/PublicInbox/LeiInput.pm   | 85 +++++++++++++++++++++++++++++++++++
 5 files changed, 94 insertions(+), 126 deletions(-)
 create mode 100644 lib/PublicInbox/LeiInput.pm

diff --git a/MANIFEST b/MANIFEST
index b6b4a3ab..df8440ef 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -187,6 +187,7 @@ lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
 lib/PublicInbox/LeiHelp.pm
 lib/PublicInbox/LeiImport.pm
+lib/PublicInbox/LeiInput.pm
 lib/PublicInbox/LeiMirror.pm
 lib/PublicInbox/LeiOverview.pm
 lib/PublicInbox/LeiP2q.pm
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9e3bb9b7..0bd52a46 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -419,23 +419,6 @@ sub fail ($$;$) {
 	undef;
 }
 
-sub check_input_format ($;$) {
-	my ($self, $files) = @_;
-	my $opt_key = 'in-format';
-	my $fmt = $self->{opt}->{$opt_key};
-	if (!$fmt) {
-		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
-		return fail($self, "--$opt_key unset for $err");
-	}
-	require PublicInbox::MboxLock if $files;
-	require PublicInbox::MboxReader;
-	return 1 if $fmt eq 'eml';
-	# XXX: should this handle {gz,bz2,xz}? that's currently in LeiToMail
-	PublicInbox::MboxReader->can($fmt) or
-		return fail($self, "--$opt_key=$fmt unrecognized");
-	1;
-}
-
 sub out ($;@) {
 	my $self = shift;
 	return if print { $self->{1} // return } @_; # likely
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 8d3b221a..0aa13229 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -5,7 +5,7 @@
 package PublicInbox::LeiConvert;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::IPC);
+use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
 use PublicInbox::Eml;
 use PublicInbox::LeiStore;
 use PublicInbox::LeiOverview;
@@ -79,64 +79,14 @@ sub do_convert { # via wq_do
 
 sub lei_convert { # the main "lei convert" method
 	my ($lei, @inputs) = @_;
-	my $opt = $lei->{opt};
-	$opt->{kw} //= 1;
+	$lei->{opt}->{kw} //= 1;
+	$lei->{opt}->{dedupe} //= 'none';
 	my $self = $lei->{cnv} = bless {}, __PACKAGE__;
-	my $in_fmt = $opt->{'in-format'};
-	my (@f, @d);
-	$opt->{dedupe} //= 'none';
 	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
 	$lei->{l2m} or return
 		$lei->fail("output not specified or is not a mail destination");
-	my $net = $lei->{net}; # NetWriter may be created by l2m
-	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
-	if ($opt->{stdin}) {
-		@inputs and return $lei->fail("--stdin and @inputs do not mix");
-		$lei->check_input_format(undef) or return;
-		$self->{0} = $lei->{0};
-	}
-	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
-	for my $input (@inputs) {
-		my $input_path = $input;
-		if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
-			require PublicInbox::NetReader;
-			$net //= PublicInbox::NetReader->new;
-			$net->add_url($input);
-		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
-			my $ifmt = lc $1;
-			if (($in_fmt // $ifmt) ne $ifmt) {
-				return $lei->fail(<<"");
---in-format=$in_fmt and `$ifmt:' conflict
-
-			}
-			if (-f $input_path) {
-				require PublicInbox::MboxLock;
-				require PublicInbox::MboxReader;
-				PublicInbox::MboxReader->can($ifmt) or return
-					$lei->fail("$ifmt not supported");
-			} elsif (-d _) {
-				require PublicInbox::MdirReader;
-				$ifmt eq 'maildir' or return
-					$lei->fail("$ifmt not supported");
-			} else {
-				return $lei->fail("Unable to handle $input");
-			}
-		} elsif (-f $input) { push @f, $input }
-		elsif (-d _) { push @d, $input }
-		else { return $lei->fail("Unable to handle $input") }
-	}
-	if (@f) { $lei->check_input_format(\@f) or return }
-	if (@d) { # TODO: check for MH vs Maildir, here
-		require PublicInbox::MdirReader;
-	}
-	$self->{inputs} = \@inputs;
-	if ($net) {
-		if (my $err = $net->errors) {
-			return $lei->fail($err);
-		}
-		$net->{quiet} = $opt->{quiet};
-		$lei->{net} //= $net;
-	}
+	$lei->{opt}->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
+	$self->prepare_inputs($lei, \@inputs) or return;
 	my $op = $lei->workers_start($self, 'lei_convert', 1, {
 		'' => [ $lei->can('dclose'), $lei ]
 	});
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 0e2a96e8..e769fba8 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -5,7 +5,7 @@
 package PublicInbox::LeiImport;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::IPC);
+use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
 use PublicInbox::Eml;
 use PublicInbox::PktOp qw(pkt_do);
 
@@ -67,60 +67,9 @@ sub lei_import { # the main "lei import" method
 	my ($lei, @inputs) = @_;
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
-	my ($net, @f, @d);
 	$lei->{opt}->{kw} //= 1;
-	my $self = $lei->{imp} = bless { inputs => \@inputs }, __PACKAGE__;
-	if ($lei->{opt}->{stdin}) {
-		@inputs and return $lei->fail("--stdin and @inputs do not mix");
-		$lei->check_input_format or return;
-		$self->{0} = $lei->{0};
-	}
-
-	my $fmt = $lei->{opt}->{'in-format'};
-	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
-	for my $input (@inputs) {
-		my $input_path = $input;
-		if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
-			require PublicInbox::NetReader;
-			$net //= PublicInbox::NetReader->new;
-			$net->add_url($input);
-		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
-			my $ifmt = lc $1;
-			if (($fmt // $ifmt) ne $ifmt) {
-				return $lei->fail(<<"");
---in-format=$fmt and `$ifmt:' conflict
-
-			}
-			if (-f $input_path) {
-				require PublicInbox::MboxLock;
-				require PublicInbox::MboxReader;
-				PublicInbox::MboxReader->can($ifmt) or return
-					$lei->fail("$ifmt not supported");
-			} elsif (-d _) {
-				require PublicInbox::MdirReader;
-				$ifmt eq 'maildir' or return
-					$lei->fail("$ifmt not supported");
-			} else {
-				return $lei->fail("Unable to handle $input");
-			}
-		} elsif (-f $input) { push @f, $input
-		} elsif (-d _) { push @d, $input
-		} else { return $lei->fail("Unable to handle $input") }
-	}
-	if (@f) { $lei->check_input_format(\@f) or return }
-	if (@d) { # TODO: check for MH vs Maildir, here
-		require PublicInbox::MdirReader;
-	}
-	$self->{inputs} = \@inputs;
-	if ($net) {
-		if (my $err = $net->errors) {
-			return $lei->fail($err);
-		}
-		$net->{quiet} = $lei->{opt}->{quiet};
-		$lei->{net} = $net;
-		require PublicInbox::LeiAuth;
-		$lei->{auth} = PublicInbox::LeiAuth->new;
-	}
+	my $self = $lei->{imp} = bless {}, __PACKAGE__;
+	$self->prepare_inputs($lei, \@inputs) or return;
 	import_start($lei);
 }
 
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
new file mode 100644
index 00000000..89585a52
--- /dev/null
+++ b/lib/PublicInbox/LeiInput.pm
@@ -0,0 +1,85 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# parent class for LeiImport, LeiConvert
+package PublicInbox::LeiInput;
+use strict;
+use v5.10.1;
+
+sub check_input_format ($;$) {
+	my ($lei, $files) = @_;
+	my $opt_key = 'in-format';
+	my $fmt = $lei->{opt}->{$opt_key};
+	if (!$fmt) {
+		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
+		return $lei->fail("--$opt_key unset for $err");
+	}
+	require PublicInbox::MboxLock if $files;
+	require PublicInbox::MboxReader;
+	return 1 if $fmt eq 'eml';
+	# XXX: should this handle {gz,bz2,xz}? that's currently in LeiToMail
+	PublicInbox::MboxReader->can($fmt) or
+		return $lei->fail("--$opt_key=$fmt unrecognized");
+	1;
+}
+
+
+sub prepare_inputs {
+	my ($self, $lei, $inputs) = @_;
+	my $in_fmt = $lei->{opt}->{'in-format'};
+	if ($lei->{opt}->{stdin}) {
+		@$inputs and return
+			$lei->fail("--stdin and @$inputs do not mix");
+		check_input_format($lei) or return;
+		$self->{0} = $lei->{0};
+	}
+	my $net = $lei->{net}; # NetWriter may be created by l2m
+	my $fmt = $lei->{opt}->{'in-format'};
+	my (@f, @d);
+	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
+	for my $input (@$inputs) {
+		my $input_path = $input;
+		if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
+			require PublicInbox::NetReader;
+			$net //= PublicInbox::NetReader->new;
+			$net->add_url($input);
+		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
+			my $ifmt = lc $1;
+			if (($in_fmt // $ifmt) ne $ifmt) {
+				return $lei->fail(<<"");
+--in-format=$in_fmt and `$ifmt:' conflict
+
+			}
+			if (-f $input_path) {
+				require PublicInbox::MboxLock;
+				require PublicInbox::MboxReader;
+				PublicInbox::MboxReader->can($ifmt) or return
+					$lei->fail("$ifmt not supported");
+			} elsif (-d _) {
+				require PublicInbox::MdirReader;
+				$ifmt eq 'maildir' or return
+					$lei->fail("$ifmt not supported");
+			} else {
+				return $lei->fail("Unable to handle $input");
+			}
+		} elsif (-f $input) { push @f, $input }
+		elsif (-d _) { push @d, $input }
+		else { return $lei->fail("Unable to handle $input") }
+	}
+	if (@f) { check_input_format($lei, \@f) or return }
+	if (@d) { # TODO: check for MH vs Maildir, here
+		require PublicInbox::MdirReader;
+	}
+	if ($net) {
+		if (my $err = $net->errors) {
+			return $lei->fail($err);
+		}
+		$net->{quiet} = $lei->{opt}->{quiet};
+		require PublicInbox::LeiAuth;
+		$lei->{auth} //= PublicInbox::LeiAuth->new;
+		$lei->{net} //= $net;
+	}
+	$self->{inputs} = $inputs;
+}
+
+1;

^ permalink raw reply related	[relevance 38%]

* [PATCH 1/8] lei: support -c <name>=<value> to overrides
  2021-03-22  7:53 70% [PATCH 0/8] lei input handling improvements Eric Wong
@ 2021-03-22  7:53 33% ` Eric Wong
  2021-03-22  7:53 38% ` [PATCH 3/8] lei: share input code between convert and import Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-22  7:53 UTC (permalink / raw)
  To: meta

It's a bit nasty, but seems to mostly work for debugging
IMAP and NNTP commands.
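
For illustration only, the name/value handling of "-c" can be sketched
outside of lei.  This reuses the regexp from _tmp_cfg in the diff below;
parse_c_opts is a hypothetical helper that does not exist in lei, and
the sketch skips the temporary config file and the git-config call that
the real code performs.

```perl
use strict;
use warnings;
use v5.10.1;

# Hypothetical helper (not part of lei): turn "-c" arguments into a hash,
# using the same regexp as _tmp_cfg in this patch.
sub parse_c_opts {
	my (@args) = @_;
	my %cfg;
	for my $arg (@args) {
		$arg =~ /\A([^=\.]+\.[^=]+)(?:=(.*))?\z/ or
			die "`-c $arg' is not of the form -c <name>=<value>\n";
		my ($name, $value) = ($1, $2 // 1); # no "=value" means 1
		if (defined(my $v = $cfg{$name})) {
			# repeated names accumulate into an array ref
			if (ref($v) eq 'ARRAY') {
				push @$v, $value;
			} else {
				$cfg{$name} = [ $v, $value ];
			}
		} else {
			$cfg{$name} = $value;
		}
	}
	\%cfg;
}

my $cfg = parse_c_opts(qw(imap.debug imap.foo=a imap.foo=b));
print "imap.debug=$cfg->{'imap.debug'}\n";
print "imap.foo=@{$cfg->{'imap.foo'}}\n";
```

A name must contain a dot (section.key), so something like "-c debug=1"
is rejected by the regexp rather than silently accepted.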
---
 lib/PublicInbox/LEI.pm         | 109 ++++++++++++++++++++++-----------
 lib/PublicInbox/LeiExternal.pm |   2 +-
 t/lei.t                        |   9 +++
 3 files changed, 83 insertions(+), 37 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b6d21af6..9e3bb9b7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -119,6 +119,8 @@ sub index_opt {
 	batch_size|batch-size=s skip-docdata)
 }
 
+my @c_opt = qw(c=s@ C=s@ quiet|q);
+
 # we generate shell completion + help using %CMD and %OPTDESC,
 # see lei__complete() and PublicInbox::LeiHelp
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
@@ -129,82 +131,80 @@ our %CMD = ( # sorted in order of importance/use:
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g augment|a
 	import-remote! import-before! lock=s@ rsyncable
-	alert=s@ mua=s no-torsocks torsocks=s verbose|v+ quiet|q C=s@),
+	alert=s@ mua=s no-torsocks torsocks=s verbose|v+), @c_opt,
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
-	qw(type=s solve! format|f=s dedupe|d=s threads|t remote local! C=s@),
-	pass_through('git show') ],
+	qw(type=s solve! format|f=s dedupe|d=s threads|t remote local!
+	verbose|v+), @c_opt, pass_through('git show') ],
 
 'add-external' => [ 'LOCATION',
 	'add/set priority of a publicinbox|extindex for extra matches',
-	qw(boost=i c=s@ mirror=s no-torsocks torsocks=s inbox-version=i),
-	qw(quiet|q verbose|v+ C=s@),
-	index_opt(), PublicInbox::LeiQuery::curl_opt() ],
+	qw(boost=i mirror=s no-torsocks torsocks=s inbox-version=i
+	verbose|v+), @c_opt, index_opt(),
+	PublicInbox::LeiQuery::curl_opt() ],
 'ls-external' => [ '[FILTER]', 'list publicinbox|extindex locations',
-	qw(format|f=s z|0 globoff|g invert-match|v local remote C=s@) ],
+	qw(format|f=s z|0 globoff|g invert-match|v local remote), @c_opt ],
 'forget-external' => [ 'LOCATION...|--prune',
 	'exclude further results from a publicinbox|extindex',
-	qw(prune quiet|q C=s@) ],
+	qw(prune), @c_opt ],
 
 'ls-query' => [ '[FILTER...]', 'list saved search queries',
-		qw(name-only format|f=s C=s@) ],
-'rm-query' => [ 'QUERY_NAME', 'remove a saved search', qw(C=s@) ],
-'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search', qw(C=s@) ],
+		qw(name-only format|f=s), @c_opt ],
+'rm-query' => [ 'QUERY_NAME', 'remove a saved search', @c_opt ],
+'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search', @c_opt ],
 
 'plonk' => [ '--threads|--from=IDENT',
 	'exclude mail matching From: or threads from non-Message-ID searches',
-	qw(stdin| threads|t from|f=s mid=s oid=s C=s@) ],
+	qw(stdin| threads|t from|f=s mid=s oid=s), @c_opt ],
 'mark' => [ 'MESSAGE_FLAGS...',
 	'set/unset keywords on message(s) from stdin',
-	qw(stdin| oid=s exact by-mid|mid:s C=s@) ],
+	qw(stdin| oid=s exact by-mid|mid:s), @c_opt ],
 'forget' => [ '[--stdin|--oid=OID|--by-mid=MID]',
 	"exclude message(s) on stdin from `q' search results",
-	qw(stdin| oid=s exact by-mid|mid:s quiet|q C=s@) ],
+	qw(stdin| oid=s exact by-mid|mid:s), @c_opt ],
 
 'purge-mailsource' => [ 'LOCATION|--all',
 	'remove imported messages from IMAP, Maildirs, and MH',
-	qw(exact! all jobs:i indexed C=s@) ],
+	qw(exact! all jobs:i indexed), @c_opt ],
 
 # code repos are used for `show' to solve blobs from patch mails
 'add-coderepo' => [ 'DIRNAME', 'add or set priority of a git code repo',
-	qw(boost=i C=s@) ],
+	qw(boost=i), @c_opt ],
 'ls-coderepo' => [ '[FILTER_TERMS...]',
-		'list known code repos', qw(format|f=s z C=s@) ],
+		'list known code repos', qw(format|f=s z), @c_opt ],
 'forget-coderepo' => [ 'DIRNAME',
 	'stop using repo to solve blobs from patches',
-	qw(prune C=s@) ],
+	qw(prune), @c_opt ],
 
 'add-watch' => [ 'LOCATION', 'watch for new messages and flag changes',
 	qw(import! kw|keywords|flags! interval=s recursive|r
-	exclude=s include=s C=s@) ],
+	exclude=s include=s), @c_opt ],
 'ls-watch' => [ '[FILTER...]', 'list active watches with numbers and status',
-		qw(format|f=s z C=s@) ],
-'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote C=s@) ],
-'resume-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote C=s@) ],
+		qw(format|f=s z), @c_opt ],
+'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote), @c_opt ],
+'resume-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote), @c_opt ],
 'forget-watch' => [ '{WATCH_NUMBER|--prune}', 'stop and forget a watch',
-	qw(prune C=s@) ],
+	qw(prune), @c_opt ],
 
 'import' => [ 'LOCATION...|--stdin',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
-	lock=s@ in-format|F=s kw|keywords|flags! C=s@),
-	],
+	lock=s@ in-format|F=s kw|keywords|flags! verbose|v+), @c_opt ],
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
-	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s quiet|q
-	lock=s@ kw|keywords|flags! C=s@),
-	],
+	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s
+	lock=s@ kw|keywords|flags!), @c_opt ],
 'p2q' => [ 'FILE|COMMIT_OID|--stdin',
 	"use a patch to generate a query for `lei q --stdin'",
-	qw(stdin| want|w=s@ uri debug) ],
+	qw(stdin| want|w=s@ uri debug), @c_opt ],
 'config' => [ '[...]', sub {
 		'git-config(1) wrapper for '._config_path($_[0]);
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
-	 qw(C=s@), pass_through('git config') ],
+	 qw(c=s@ C=s@), pass_through('git config') ],
 'init' => [ '[DIRNAME]', sub {
 	"initialize storage, default: ".store_path($_[0]);
-	}, qw(quiet|q C=s@) ],
+	}, @c_opt ],
 'daemon-kill' => [ '[-SIGNAL]', 'signal the lei-daemon',
 	# "-C DIR" conflicts with -CHLD, here, and chdir makes no sense, here
 	opt_dash('signal|s=s', '[0-9]+|(?:[A-Z][A-Z0-9]+)') ],
@@ -216,7 +216,7 @@ our %CMD = ( # sorted in order of importance/use:
 
 'reorder-local-store-and-break-history' => [ '[REFNAME]',
 	'rewrite git history in an attempt to improve compression',
-	qw(gc! C=s@) ],
+	qw(gc!), @c_opt ],
 
 # internal commands are prefixed with '_'
 '_complete' => [ '[...]', 'internal shell completion helper',
@@ -235,6 +235,7 @@ my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 # we use \x{a0} (non-breaking SP) to avoid wrapping in PublicInbox::LeiHelp
 my %OPTDESC = (
 'help|h' => 'show this built-in help',
+'c=s@' => [ 'NAME=VALUE', 'set config option' ],
 'C=s@' => [ 'DIR', 'chdir to specify to directory' ],
 'quiet|q' => 'be quiet',
 'lock=s@' => [ 'METHOD|dotlock|fcntl|flock|none',
@@ -472,7 +473,8 @@ sub lei_atfork_child {
 		unless ($self->{oneshot}) {
 			close($_) for @io;
 		}
-	} else {
+	} else { # worker, Net::NNTP (Net::Cmd) uses STDERR directly
+		open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
 		delete $self->{0};
 	}
 	delete @$self{qw(cnv)};
@@ -586,14 +588,47 @@ sub optparse ($$$) {
 	$err ? fail($self, "usage: lei $cmd $proto\nE: $err") : 1;
 }
 
+sub _tmp_cfg { # for lei -c <name>=<value> ...
+	my ($self) = @_;
+	my $cfg = _lei_cfg($self, 1);
+	require File::Temp;
+	my $ft = File::Temp->new(TEMPLATE => 'lei_cfg-XXXX', TMPDIR => 1);
+	my $tmp = { '-f' => $ft->filename, -tmp => $ft };
+	$ft->autoflush(1);
+	print $ft <<EOM or return fail($self, "$tmp->{-f}: $!");
+[include]
+	path = $cfg->{-f}
+EOM
+	$tmp = $self->{cfg} = bless { %$cfg, %$tmp }, ref($cfg);
+	for (@{$self->{opt}->{c}}) {
+		/\A([^=\.]+\.[^=]+)(?:=(.*))?\z/ or return fail($self, <<EOM);
+`-c $_' is not of the form -c <name>=<value>
+EOM
+		my $name = $1;
+		my $value = $2 // 1;
+		_config($self, '--add', $name, $value);
+		if (defined(my $v = $tmp->{$name})) {
+			if (ref($v) eq 'ARRAY') {
+				push @$v, $value;
+			} else {
+				$tmp->{$name} = [ $v, $value ];
+			}
+		} else {
+			$tmp->{$name} = $value;
+		}
+	}
+}
+
 sub dispatch {
 	my ($self, $cmd, @argv) = @_;
 	local $current_lei = $self; # for __WARN__
 	dump_and_clear_log("from previous run\n");
 	return _help($self, 'no command given') unless defined($cmd);
-	while ($cmd eq '-C') { # do not support Getopt bundling for this
-		my $d = shift(@argv) // return fail($self, '-C DIRECTORY');
-		push @{$self->{opt}->{C}}, $d;
+	# do not support Getopt bundling for this
+	while ($cmd eq '-C' || $cmd eq '-c') {
+		my $v = shift(@argv) // return fail($self, $cmd eq '-C' ?
+					'-C DIRECTORY' : '-c <name>=<value>');
+		push @{$self->{opt}->{substr($cmd, 1, 1)}}, $v;
 		$cmd = shift(@argv) // return _help($self, 'no command given');
 	}
 	my $func = "lei_$cmd";
@@ -605,6 +640,7 @@ sub dispatch {
 	} : undef);
 	if ($cb) {
 		optparse($self, $cmd, \@argv) or return;
+		$self->{opt}->{c} and (_tmp_cfg($self) // return);
 		if (my $chdir = $self->{opt}->{C}) {
 			for my $d (@$chdir) {
 				next if $d eq ''; # same as git(1)
@@ -623,6 +659,7 @@ sub dispatch {
 
 sub _lei_cfg ($;$) {
 	my ($self, $creat) = @_;
+	return $self->{cfg} if $self->{cfg};
 	my $f = _config_path($self);
 	my @st = stat($f);
 	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # 10:ctime, 7:size
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index f4e24c2a..9a555831 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -149,7 +149,7 @@ sub lei_add_external {
 	my $mirror = $opt->{mirror} // do {
 		my @fail;
 		for my $sw ($self->index_opt, $self->curl_opt,
-				qw(c no-torsocks torsocks inbox-version)) {
+				qw(no-torsocks torsocks inbox-version)) {
 			my ($f) = (split(/|/, $sw, 2))[0];
 			next unless defined $opt->{$f};
 			$f = length($f) == 1 ? "-$f" : "--$f";
diff --git a/t/lei.t b/t/lei.t
index 2bf4b862..0cf97866 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -93,6 +93,15 @@ my $test_config = sub {
 			'config set var with -f fails');
 	like($lei_err, qr/not supported/, 'not supported noted');
 	ok(!-f "$home/config/f", 'no file created');
+
+	lei_ok(qw(-c imap.debug config --bool imap.debug));
+	is($lei_out, "true\n", "-c sets w/o value");
+	lei_ok(qw(-c imap.debug=1 config --bool imap.debug));
+	is($lei_out, "true\n", "-c coerces value");
+	lei_ok(qw(-c imap.debug=tr00 config imap.debug));
+	is($lei_out, "tr00\n", "-c string value passed as-is");
+	lei_ok(qw(-c imap.debug=a -c imap.debug=b config --get-all imap.debug));
+	is($lei_out, "a\nb\n", '-c and --get-all work together');
 };
 
 my $test_completion = sub {

^ permalink raw reply related	[relevance 33%]

* [PATCH 0/2] lei mark: volatile metadata tagging
@ 2021-03-23  5:02 71% Eric Wong
  2021-03-23  5:02 29% ` [PATCH 1/2] lei mark: command for (un)setting keywords and labels Eric Wong
  2021-03-23  5:02 56% ` [PATCH 2/2] lei mark: add support for (bash) completion Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-03-23  5:02 UTC (permalink / raw)
  To: meta

I'm not sure if this should be called "mark"; maybe
"lei tag" is a better name?

It allows us to set and unset keywords (aka IMAP/Maildir flags)
and labels (aka JMAP mailbox name) for messages already known
to lei/store.
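
As a rough sketch of the set/unset semantics (removals first, then
additions, which is the order update_vmd uses in patch 1/2), with a
plain hash standing in for the Xapian document.  apply_kw_mod is a
hypothetical name used only for this example:

```perl
use strict;
use warnings;
use v5.10.1;

# Hypothetical stand-in for the keyword half of update_vmd:
# $kw is a hashref used as a set, $mod has optional "-kw" and "+kw" lists.
sub apply_kw_mod {
	my ($kw, $mod) = @_;
	my $updated = 0;
	for my $val (@{$mod->{'-kw'} // []}) {
		# only count removals of keywords which were actually set
		$updated++ if delete $kw->{$val};
	}
	for my $val (@{$mod->{'+kw'} // []}) {
		$kw->{$val} = 1;
		$updated++;
	}
	$updated;
}

my %kw = (flagged => 1);
my $n = apply_kw_mod(\%kw, { '-kw' => ['flagged'], '+kw' => ['seen'] });
print "$n change(s): ", join(' ', sort keys %kw), "\n";
```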

We don't support reading labels, yet...

Eric Wong (2):
  lei mark: command for (un)setting keywords and labels
  lei mark: add support for (bash) completion

 MANIFEST                     |   2 +
 lib/PublicInbox/LEI.pm       |  42 +++----
 lib/PublicInbox/LeiImport.pm |  15 +--
 lib/PublicInbox/LeiInput.pm  |  11 +-
 lib/PublicInbox/LeiMark.pm   | 220 +++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiStore.pm  |  19 +++
 lib/PublicInbox/SearchIdx.pm |  23 ++++
 t/lei-mark.t                 |  47 ++++++++
 8 files changed, 347 insertions(+), 32 deletions(-)
 create mode 100644 lib/PublicInbox/LeiMark.pm
 create mode 100644 t/lei-mark.t


^ permalink raw reply	[relevance 71%]

* [PATCH 2/2] lei mark: add support for (bash) completion
  2021-03-23  5:02 71% [PATCH 0/2] lei mark: volatile metadata tagging Eric Wong
  2021-03-23  5:02 29% ` [PATCH 1/2] lei mark: command for (un)setting keywords and labels Eric Wong
@ 2021-03-23  5:02 56% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-03-23  5:02 UTC (permalink / raw)
  To: meta

Only lightly tested; this seems to suffer from the same
problem as external completions for network URLs with
colons in them.  In any case, it's usable enough for me.

The core LEI module now supports completions for lazy-loaded
commands, too, so we'll be able to do completions for other
commands more easily.
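
To make the lazy-loading lookup concrete, here is a small sketch of
only the name mapping done by lazy_cb in the diff below (command name
to package and sub name).  cmd_to_names is a hypothetical helper; it
does not load anything, whereas lazy_cb also tries the LEI class
itself first and calls require on the derived package.

```perl
use strict;
use warnings;
use v5.10.1;

# Hypothetical helper: derive (package, sub name) the way lazy_cb does.
sub cmd_to_names {
	my ($cmd, $pfx) = @_;
	(my $ucmd = $cmd) =~ tr/-/_/;            # add-external => add_external
	(my $base = $ucmd) =~ s/_([a-z])/\u$1/g; # add_external => addExternal
	("PublicInbox::Lei\u$base", $pfx.$ucmd);
}

for my $cmd (qw(mark add-external)) {
	printf "%s => %s::%s\n", $cmd, cmd_to_names($cmd, 'lei_');
}
```

With the '_complete_' prefix, the same mapping gives
PublicInbox::LeiMark::_complete_mark for "mark", which is the sub this
patch adds.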
---
 lib/PublicInbox/LEI.pm     | 27 ++++++++++++++----------
 lib/PublicInbox/LeiMark.pm | 43 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 91c95239..0be417eb 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -604,6 +604,19 @@ EOM
 	}
 }
 
+sub lazy_cb ($$$) {
+	my ($self, $cmd, $pfx) = @_;
+	my $ucmd = $cmd;
+	$ucmd =~ tr/-/_/;
+	my $cb;
+	$cb = $self->can($pfx.$ucmd) and return $cb;
+	my $base = $ucmd;
+	$base =~ s/_([a-z])/\u$1/g;
+	my $pkg = "PublicInbox::Lei\u$base";
+	($INC{"PublicInbox/Lei\u$base.pm"} // eval("require $pkg")) ?
+		$pkg->can($pfx.$ucmd) : undef;
+}
+
 sub dispatch {
 	my ($self, $cmd, @argv) = @_;
 	local $current_lei = $self; # for __WARN__
@@ -616,14 +629,7 @@ sub dispatch {
 		push @{$self->{opt}->{substr($cmd, 1, 1)}}, $v;
 		$cmd = shift(@argv) // return _help($self, 'no command given');
 	}
-	my $func = "lei_$cmd";
-	$func =~ tr/-/_/;
-	my $cb = __PACKAGE__->can($func) // ($CMD{$cmd} ? do {
-		my $mod = "PublicInbox::Lei\u$cmd";
-		($INC{"PublicInbox/Lei\u$cmd.pm"} //
-			eval("require $mod")) ? $mod->can($func) : undef;
-	} : undef);
-	if ($cb) {
+	if (my $cb = lazy_cb(__PACKAGE__, $cmd, 'lei_')) {
 		optparse($self, $cmd, \@argv) or return;
 		$self->{opt}->{c} and (_tmp_cfg($self) // return);
 		if (my $chdir = $self->{opt}->{C}) {
@@ -808,9 +814,8 @@ sub lei__complete {
 			@v;
 		} grep(/\A(?:[\w-]+\|)*$opt\b.*?(?:\t$cmd)?\z/, keys %OPTDESC);
 	}
-	$cmd =~ tr/-/_/;
-	if (my $sub = $self->can("_complete_$cmd")) {
-		puts $self, $sub->($self, @argv, $cur ? ($cur) : ());
+	if (my $cb = lazy_cb($self, $cmd, '_complete_')) {
+		puts $self, $cb->($self, @argv, $cur ? ($cur) : ());
 	}
 	# TODO: URLs, pathnames, OIDs, MIDs, etc...  See optparse() for
 	# proto parsing.
diff --git a/lib/PublicInbox/LeiMark.pm b/lib/PublicInbox/LeiMark.pm
index aa52ad5a..7b50aa51 100644
--- a/lib/PublicInbox/LeiMark.pm
+++ b/lib/PublicInbox/LeiMark.pm
@@ -174,4 +174,47 @@ sub ipc_atfork_child {
 	PublicInbox::OnDestroy->new($$, \&note_missing, $self);
 }
 
+# Workaround bash word-splitting keywords to ['kw', ':', 'keyword' ...]
+# Maybe there's a better way to go about this in
+# contrib/completion/lei-completion.bash
+sub _complete_mark_common ($) {
+	my ($argv) = @_;
+	# Workaround bash word-splitting URLs to ['https', ':', '//' ...]
+	# Maybe there's a better way to go about this in
+	# contrib/completion/lei-completion.bash
+	my $re = '';
+	my $cur = pop(@$argv) // '';
+	if (@$argv) {
+		my @x = @$argv;
+		if ($cur eq ':' && @x) {
+			push @x, $cur;
+			$cur = '';
+		}
+		while (@x > 2 && $x[0] !~ /\A[+\-](?:kw|L)\z/ &&
+					$x[1] ne ':') {
+			shift @x;
+		}
+		if (@x >= 2) { # qw(kw : $KEYWORD) or qw(kw :)
+			$re = join('', @x);
+		} else { # just return everything and hope for the best
+			$re = join('', @$argv);
+		}
+		$re = quotemeta($re);
+	}
+	($cur, $re);
+}
+
+# FIXME: same problems as _complete_forget_external and similar
+sub _complete_mark {
+	my ($self, @argv) = @_;
+	my @all = map { ("+kw:$_", "-kw:$_") } @KW;
+	return @all if !@argv;
+	my ($cur, $re) = _complete_mark_common(\@argv);
+	map {
+		# only return the part specified on the CLI
+		# don't duplicate if already 100% completed
+		/\A$re(\Q$cur\E.*)/ ? ($cur eq $1 ? () : $1) : ();
+	} grep(/$re\Q$cur/, @all);
+}
+
 1;

^ permalink raw reply related	[relevance 56%]

* [PATCH 1/2] lei mark: command for (un)setting keywords and labels
  2021-03-23  5:02 71% [PATCH 0/2] lei mark: volatile metadata tagging Eric Wong
@ 2021-03-23  5:02 29% ` Eric Wong
  2021-03-23  5:02 56% ` [PATCH 2/2] lei mark: add support for (bash) completion Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-03-23  5:02 UTC (permalink / raw)
  To: meta

Only tested for keywords and labels with file inputs, so far;
but it seems to do what it needs to do.  There's a bit more
redundant code than I'd like, and more opportunities for code
sharing in the future.

"lei import" will be expanded to support +kw:$KEYWORD and
+L:$LABEL in the future.
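
A simplified sketch of how the +kw:/-kw:/+L:/-L: arguments are split
from the remaining inputs, modeled on vmd_mod_extract in the diff
below.  split_vmd_args is a hypothetical name, and the keyword and
label validation done via %ERR is left out here:

```perl
use strict;
use warnings;
use v5.10.1;

# Hypothetical, validation-free variant of vmd_mod_extract:
# returns (\%mod, \@rest) where %mod is keyed by "+kw", "-kw", "+L", "-L"
sub split_vmd_args {
	my (@argv) = @_;
	my (%mod, @rest);
	for my $x (@argv) {
		if ($x =~ /\A([+\-])(kw|L):(.+)\z/) {
			push @{$mod{$1.$2}}, $3;
		} else {
			push @rest, $x; # switches, paths, URLs, ...
		}
	}
	(\%mod, \@rest);
}

my ($mod, $rest) = split_vmd_args(
	qw(-F eml t/utf8.eml +kw:flagged -kw:seen +L:inbox));
print "$_: @{$mod->{$_}}\n" for sort keys %$mod;
print "remaining: @$rest\n";
```

An argument such as "+k:seen" does not match the pattern and is left
in the remaining arguments, where it is later treated as an input.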
---
 MANIFEST                     |   2 +
 lib/PublicInbox/LEI.pm       |  15 ++-
 lib/PublicInbox/LeiImport.pm |  15 +--
 lib/PublicInbox/LeiInput.pm  |  11 ++-
 lib/PublicInbox/LeiMark.pm   | 177 +++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiStore.pm  |  19 ++++
 lib/PublicInbox/SearchIdx.pm |  23 +++++
 t/lei-mark.t                 |  47 ++++++++++
 8 files changed, 288 insertions(+), 21 deletions(-)
 create mode 100644 lib/PublicInbox/LeiMark.pm
 create mode 100644 t/lei-mark.t

diff --git a/MANIFEST b/MANIFEST
index df8440ef..87e4b616 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -188,6 +188,7 @@ lib/PublicInbox/LeiExternal.pm
 lib/PublicInbox/LeiHelp.pm
 lib/PublicInbox/LeiImport.pm
 lib/PublicInbox/LeiInput.pm
+lib/PublicInbox/LeiMark.pm
 lib/PublicInbox/LeiMirror.pm
 lib/PublicInbox/LeiOverview.pm
 lib/PublicInbox/LeiP2q.pm
@@ -377,6 +378,7 @@ t/lei-import-imap.t
 t/lei-import-maildir.t
 t/lei-import-nntp.t
 t/lei-import.t
+t/lei-mark.t
 t/lei-mirror.t
 t/lei-p2q.t
 t/lei-q-kw.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1e720b89..91c95239 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -157,9 +157,10 @@ our %CMD = ( # sorted in order of importance/use:
 'plonk' => [ '--threads|--from=IDENT',
 	'exclude mail matching From: or threads from non-Message-ID searches',
 	qw(stdin| threads|t from|f=s mid=s oid=s), @c_opt ],
-'mark' => [ 'MESSAGE_FLAGS...',
-	'set/unset keywords on message(s) from stdin',
-	qw(stdin| oid=s exact by-mid|mid:s), @c_opt ],
+'mark' => [ 'KEYWORDS...',
+	'set/unset keywords on message(s)',
+	qw(stdin| in-format|F=s input|i=s@ oid=s@ mid=s@), @c_opt,
+	pass_through('-kw:foo for delete') ],
 'forget' => [ '[--stdin|--oid=OID|--by-mid=MID]',
 	"exclude message(s) on stdin from `q' search results",
 	qw(stdin| oid=s exact by-mid|mid:s), @c_opt ],
@@ -348,7 +349,7 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
-my @WQ_KEYS = qw(lxs l2m imp mrr cnv p2q); # internal workers
+my @WQ_KEYS = qw(lxs l2m imp mrr cnv p2q mark); # internal workers
 
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
@@ -460,7 +461,7 @@ sub lei_atfork_child {
 		open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
 		delete $self->{0};
 	}
-	delete @$self{qw(cnv)};
+	delete @$self{qw(cnv mark imp)};
 	for (delete @$self{qw(3 old_1 au_done)}) {
 		close($_) if defined($_);
 	}
@@ -690,10 +691,6 @@ sub lei_show {
 	my ($self, @argv) = @_;
 }
 
-sub lei_mark {
-	my ($self, @argv) = @_;
-}
-
 sub _config {
 	my ($self, @argv) = @_;
 	my %env = (%{$self->{env}}, GIT_CONFIG => undef);
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 9ad2ff12..21af28a3 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -78,16 +78,6 @@ sub lei_import { # the main "lei import" method
 	import_start($lei);
 }
 
-sub ipc_atfork_child {
-	my ($self) = @_;
-	my $lei = $self->{lei};
-	delete $lei->{imp}; # drop circular ref
-	$lei->lei_atfork_child;
-	$self->SUPER::ipc_atfork_child;
-	$lei->{auth}->do_auth_atfork($self) if $lei->{auth};
-	undef;
-}
-
 sub _import_maildir { # maildir_each_eml cb
 	my ($f, $kw, $eml, $sto, $set_kw) = @_;
 	$sto->ipc_do('set_eml', $eml, $set_kw ? { kw => $kw }: ());
@@ -137,6 +127,9 @@ sub import_stdin {
 	$self->input_fh($lei->{opt}->{'in-format'}, $in, '<stdin>');
 }
 
-no warnings 'once'; # the following works even when LeiAuth is lazy-loaded
+no warnings 'once';
+*ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
+
+# the following works even when LeiAuth is lazy-loaded
 *net_merge_all = \&PublicInbox::LeiAuth::net_merge_all;
 1;
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 859fdb11..6ad57772 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -45,7 +45,7 @@ error reading $name: $!
 	}
 }
 
-sub prepare_inputs {
+sub prepare_inputs { # returns undef on error
 	my ($self, $lei, $inputs) = @_;
 	my $in_fmt = $lei->{opt}->{'in-format'};
 	if ($lei->{opt}->{stdin}) {
@@ -103,4 +103,13 @@ sub prepare_inputs {
 	$self->{inputs} = $inputs;
 }
 
+sub input_only_atfork_child {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	$lei->lei_atfork_child;
+	PublicInbox::IPC::ipc_atfork_child($self);
+	$lei->{auth}->do_auth_atfork($self) if $lei->{auth};
+	undef;
+}
+
 1;
diff --git a/lib/PublicInbox/LeiMark.pm b/lib/PublicInbox/LeiMark.pm
new file mode 100644
index 00000000..aa52ad5a
--- /dev/null
+++ b/lib/PublicInbox/LeiMark.pm
@@ -0,0 +1,177 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# handles "lei mark" command
+package PublicInbox::LeiMark;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
+use PublicInbox::Eml;
+use PublicInbox::PktOp qw(pkt_do);
+
+# JMAP RFC 8621 4.1.1
+my @KW = (qw(seen answered flagged draft), # system
+	qw(forwarded phishing junk notjunk)); # reserved
+# note: RFC 8621 states "Users may add arbitrary keywords to an Email",
+# but is it a good idea?  Stick to the system and reserved ones, for now.
+# The "system" ones map to Maildir flags and mbox Status/X-Status headers.
+my %KW = map { $_ => 1 } @KW;
+my $L_MAX = 244; # Xapian term limit - length('L')
+
+# RFC 8621, sec 2 (Mailboxes) a "label" for us is a JMAP Mailbox "name"
+# "Servers MAY reject names that violate server policy"
+my %ERR = (
+	L => sub {
+		my ($label) = @_;
+		length($label) > $L_MAX and
+			return "`$label' too long (must be <= $L_MAX)";
+		$label =~ m{\A[a-z0-9_][a-z0-9_\-\./\@\!,]*[a-z0-9]\z} ?
+			undef : "`$label' is invalid";
+	},
+	kw => sub {
+		my ($kw) = @_;
+		$KW{$kw} ? undef : <<EOM;
+`$kw' is not one of: `seen', `flagged', `answered', `draft'
+`junk', `notjunk', `phishing' or `forwarded'
+EOM
+
+	}
+);
+
+# like Getopt::Long, but for +kw:FOO and -kw:FOO to prepare
+# for update_xvmd -> update_vmd
+sub vmd_mod_extract {
+	my $argv = $_[-1];
+	my $vmd_mod = {};
+	my @new_argv;
+	for my $x (@$argv) {
+		if ($x =~ /\A(\+|\-)(kw|L):(.+)\z/) {
+			my ($op, $pfx, $val) = ($1, $2, $3);
+			if (my $err = $ERR{$pfx}->($val)) {
+				push @{$vmd_mod->{err}}, $err;
+			} else { # set "+kw", "+L", "-L", "-kw"
+				push @{$vmd_mod->{$op.$pfx}}, $val;
+			}
+		} else {
+			push @new_argv, $x;
+		}
+	}
+	@$argv = @new_argv;
+	$vmd_mod;
+}
+
+sub eml_cb { # used by PublicInbox::LeiInput::input_fh
+	my ($self, $eml) = @_;
+	if (my $xoids = $self->{lei}->{ale}->xoids_for($eml)) {
+		$self->{lei}->{sto}->ipc_do('update_xvmd', $xoids,
+						$self->{vmd_mod});
+	} else {
+		++$self->{missing};
+	}
+}
+
+sub mbox_cb { eml_cb($_[1], $_[0]) } # used by PublicInbox::LeiInput::input_fh
+
+sub mark_done_wait { # dwaitpid callback
+	my ($arg, $pid) = @_;
+	my ($mark, $lei) = @$arg;
+	$lei->child_error($?, 'non-fatal errors during mark') if $?;
+	my $sto = delete $lei->{sto};
+	my $wait = $sto->ipc_do('done') if $sto; # PublicInbox::LeiStore::done
+	$lei->dclose;
+}
+
+sub mark_done { # EOF callback for main daemon
+	my ($lei) = @_;
+	my $mark = delete $lei->{mark} or return;
+	$mark->wq_wait_old(\&mark_done_wait, $lei);
+}
+
+sub net_merge_complete { # callback used by LeiAuth
+	my ($self) = @_;
+	for my $input (@{$self->{inputs}}) {
+		$self->wq_io_do('mark_path_url', [], $input);
+	}
+	$self->wq_close(1);
+}
+
+sub _mark_maildir { # maildir_each_eml cb
+	my ($f, $kw, $eml, $self) = @_;
+	eml_cb($self, $eml);
+}
+
+sub _mark_net { # imap_each, nntp_each cb
+	my ($url, $uid, $kw, $eml, $self) = @_;
+	eml_cb($self, $eml)
+}
+
+sub lei_mark { # the "lei mark" method
+	my ($lei, @argv) = @_;
+	my $sto = $lei->_lei_store(1);
+	my $self = $lei->{mark} = bless { missing => 0 }, __PACKAGE__;
+	$sto->write_prepare($lei);
+	$lei->ale; # refresh and prepare
+	my $vmd_mod = vmd_mod_extract(\@argv);
+	return $lei->fail(join("\n", @{$vmd_mod->{err}})) if $vmd_mod->{err};
+	$self->prepare_inputs($lei, \@argv) or return;
+	grep(defined, @$vmd_mod{qw(+kw +L -L -kw)}) or
+		return $lei->fail('no keywords or labels specified');
+	my $ops = { '' => [ \&mark_done, $lei ] };
+	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
+	$self->{vmd_mod} = $vmd_mod;
+	my $op = $lei->workers_start($self, 'lei_mark', 1, $ops);
+	$self->wq_io_do('mark_stdin', []) if $self->{0};
+	net_merge_complete($self) unless $lei->{auth};
+	while ($op && $op->{sock}) { $op->event_step }
+}
+
+sub mark_path_url {
+	my ($self, $input) = @_;
+	my $lei = $self->{lei};
+	my $ifmt = lc($lei->{opt}->{'in-format'} // '');
+	# TODO auto-detect?
+	if ($input =~ m!\Aimaps?://!i) {
+		$lei->{net}->imap_each($input, \&_mark_net, $self);
+		return;
+	} elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
+		$lei->{net}->nntp_each($input, \&_mark_net, $self);
+		return;
+	} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
+		$ifmt = lc $1;
+	}
+	if (-f $input) {
+		my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
+				PublicInbox::MboxLock->defaults);
+		my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
+		$self->input_fh($ifmt, $mbl->{fh}, $input);
+	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
+		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
+$input appears to be a maildir, not $ifmt
+EOM
+		PublicInbox::MdirReader::maildir_each_eml($input,
+					\&_mark_maildir, $self);
+	} else {
+		$lei->fail("$input unsupported (TODO)");
+	}
+}
+
+sub mark_stdin {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	my $in = delete $self->{0};
+	$self->input_fh($lei->{opt}->{'in-format'}, $in, '<stdin>');
+}
+
+sub note_missing {
+	my ($self) = @_;
+	$self->{lei}->child_error(1 << 8) if $self->{missing};
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	PublicInbox::LeiInput::input_only_atfork_child($self);
+	# this goes out-of-scope at worker process exit:
+	PublicInbox::OnDestroy->new($$, \&note_missing, $self);
+}
+
+1;
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index b390b318..b5d43b7e 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -228,12 +228,30 @@ sub set_eml {
 		set_eml_vmd($self, $eml, $vmd);
 }
 
+sub update_xvmd {
+	my ($self, $xoids, $vmd_mod) = @_;
+	my $eidx = eidx_init($self);
+	my $oidx = $eidx->{oidx};
+	my %seen;
+	for my $oid (keys %$xoids) {
+		my @docids = $oidx->blob_exists($oid) or next;
+		scalar(@docids) > 1 and
+			warn "W: $oid indexed as multiple docids: @docids\n";
+		for my $docid (@docids) {
+			next if $seen{$docid}++;
+			my $idx = $eidx->idx_shard($docid);
+			$idx->ipc_do('update_vmd', $docid, $vmd_mod);
+		}
+	}
+}
+
 # set or update keywords for external message, called via ipc_do
 sub set_xvmd {
 	my ($self, $xoids, $eml, $vmd) = @_;
 
 	my $eidx = eidx_init($self);
 	my $oidx = $eidx->{oidx};
+	my %seen;
 
 	# see if we can just update existing docs
 	for my $oid (keys %$xoids) {
@@ -241,6 +259,7 @@ sub set_xvmd {
 		scalar(@docids) > 1 and
 			warn "W: $oid indexed as multiple docids: @docids\n";
 		for my $docid (@docids) {
+			next if $seen{$docid}++;
 			my $idx = $eidx->idx_shard($docid);
 			$idx->ipc_do('set_vmd', $docid, $vmd);
 		}
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 3f933121..7d46489c 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -597,6 +597,29 @@ sub remove_vmd {
 	$self->{xdb}->replace_document($docid, $doc) if $replace;
 }
 
+sub update_vmd {
+	my ($self, $docid, $vmd_mod) = @_;
+	begin_txn_lazy($self);
+	my $doc = _get_doc($self, $docid) or return;
+	my $updated = 0;
+	my @x = @VMD_MAP;
+	while (my ($field, $pfx) = splice(@x, 0, 2)) {
+		# field: "label" or "kw"
+		for my $val (@{$vmd_mod->{"-$field"} // []}) {
+			eval {
+				$doc->remove_term($pfx . $val);
+				++$updated;
+			};
+		}
+		for my $val (@{$vmd_mod->{"+$field"} // []}) {
+			$doc->add_boolean_term($pfx . $val);
+			++$updated;
+		}
+	}
+	$self->{xdb}->replace_document($docid, $doc) if $updated;
+	$updated;
+}
+
 sub xdb_remove {
 	my ($self, @docids) = @_;
 	$self->begin_txn_lazy;
diff --git a/t/lei-mark.t b/t/lei-mark.t
new file mode 100644
index 00000000..ddf5634c
--- /dev/null
+++ b/t/lei-mark.t
@@ -0,0 +1,47 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(json DBD::SQLite Search::Xapian));
+my $check_kw = sub {
+	my ($exp, %opt) = @_;
+	my $mid = $opt{mid} // 'testmessage@example.com';
+	lei_ok('q', "m:$mid");
+	my $res = json_utf8->decode($lei_out);
+	is($res->[1], undef, 'only got one result');
+	my $msg = $opt{msg} ? " $opt{msg}" : '';
+	($exp ? is_deeply($res->[0]->{kw}, $exp, "got @$exp$msg")
+		: is($res->[0]->{kw}, undef, "got undef$msg")) or
+			diag explain($res);
+};
+
+test_lei(sub {
+	lei_ok(qw(import -F eml t/utf8.eml));
+	lei_ok(qw(mark -F eml t/utf8.eml +kw:flagged));
+	$check_kw->(['flagged']);
+	ok(!lei(qw(mark -F eml t/utf8.eml +kw:seeen)), 'bad kw rejected');
+	like($lei_err, qr/`seeen' is not one of/, 'got helpful error');
+	ok(!lei(qw(mark -F eml t/utf8.eml +k:seen)), 'bad prefix rejected');
+	ok(!lei(qw(mark -F eml t/utf8.eml)), 'no keywords');
+	my $mb = "$ENV{HOME}/mb";
+	my $md = "$ENV{HOME}/md";
+	lei_ok(qw(q m:testmessage@example.com -o), "mboxrd:$mb");
+	ok(-s $mb, 'wrote mbox result');
+	lei_ok(qw(q m:testmessage@example.com -o), $md);
+	my @fn = glob("$md/cur/*");
+	scalar(@fn) == 1 or BAIL_OUT 'no mail '.explain(\@fn);
+	rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
+	$check_kw->(['flagged'], msg => 'after bad request');
+	lei_ok(qw(mark -F eml t/utf8.eml -kw:flagged));
+	$check_kw->(undef, msg => 'keyword cleared');
+	lei_ok(qw(mark -F mboxrd +kw:seen), $mb);
+	$check_kw->(['seen'], msg => 'mbox Status ignored');
+	lei_ok(qw(mark -kw:seen +kw:answered), $md);
+	$check_kw->(['answered'], msg => 'Maildir Status ignored');
+
+	open my $in, '<', 't/utf8.eml' or BAIL_OUT $!;
+	lei_ok([qw(mark -F eml - +kw:seen)], undef, { %$lei_opt, 0 => $in });
+	$check_kw->(['answered', 'seen'], msg => 'stdin works');
+});
+done_testing;

^ permalink raw reply related	[relevance 29%]
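
The +kw/-kw handling in update_vmd above can be sketched without
Xapian.  This is only an illustration under stated assumptions:
apply_vmd_mod is a hypothetical helper, a plain hash stands in for the
Xapian document's term list, and the "K"/"L" prefixes are assumed
values for what @VMD_MAP holds.

```perl
use strict;
use warnings;

# Hypothetical field => term-prefix pairs; the real list is @VMD_MAP.
my @MAP = (kw => 'K', label => 'L');

# Apply the "-field" and "+field" lists in $vmd_mod to a set of terms
# (hash keys).  Returns the number of changes made, like $updated in
# the patch: removing a term that is absent does not count.
sub apply_vmd_mod {
	my ($terms, $vmd_mod) = @_;
	my $updated = 0;
	my @x = @MAP;
	while (my ($field, $pfx) = splice(@x, 0, 2)) {
		for my $val (@{$vmd_mod->{"-$field"} // []}) {
			++$updated if defined delete $terms->{$pfx . $val};
		}
		for my $val (@{$vmd_mod->{"+$field"} // []}) {
			$terms->{$pfx . $val} = 1;
			++$updated;
		}
	}
	$updated;
}

my %terms = (Kflagged => 1);
my $n = apply_vmd_mod(\%terms, { '-kw' => ['flagged'], '+kw' => ['seen'] });
print join(',', sort keys %terms), " ($n changes)\n";
```

Removals run before additions for each field, so "-kw:seen +kw:seen"
in one request leaves the keyword set.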

* [PATCH] lei: hide *_atfork_child from command-line
@ 2021-03-23  6:51 56% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-23  6:51 UTC (permalink / raw)
  To: meta

Otherwise we could get nonsensical results if somebody tries
running "lei atfork_child" from the command-line.
---
 lib/PublicInbox/LEI.pm        | 2 +-
 lib/PublicInbox/LeiConvert.pm | 2 +-
 lib/PublicInbox/LeiInput.pm   | 2 +-
 lib/PublicInbox/LeiMirror.pm  | 2 +-
 lib/PublicInbox/LeiP2q.pm     | 4 ++--
 lib/PublicInbox/LeiStore.pm   | 2 +-
 lib/PublicInbox/LeiToMail.pm  | 2 +-
 lib/PublicInbox/LeiXSearch.pm | 2 +-
 8 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0be417eb..17ca637e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -449,7 +449,7 @@ sub note_sigpipe { # triggers sigpipe_handler
 	x_it($self, 13);
 }
 
-sub lei_atfork_child {
+sub _lei_atfork_child {
 	my ($self, $persist) = @_;
 	# we need to explicitly close things which are on stack
 	if ($persist) {
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 51a233bd..49e2b7af 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -86,7 +86,7 @@ sub lei_convert { # the main "lei convert" method
 sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	$lei->lei_atfork_child;
+	$lei->_lei_atfork_child;
 	my $l2m = delete $lei->{l2m};
 	if (my $net = $lei->{net}) { # may prompt user once
 		$net->{mics_cached} = $net->imap_common_init($lei);
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 6ad57772..2a4968d4 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -106,7 +106,7 @@ sub prepare_inputs { # returns undef on error
 sub input_only_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	$lei->lei_atfork_child;
+	$lei->_lei_atfork_child;
 	PublicInbox::IPC::ipc_atfork_child($self);
 	$lei->{auth}->do_auth_atfork($self) if $lei->{auth};
 	undef;
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 65818796..c916f2d0 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -288,7 +288,7 @@ sub start {
 
 sub ipc_atfork_child {
 	my ($self) = @_;
-	$self->{lei}->lei_atfork_child;
+	$self->{lei}->_lei_atfork_child;
 	$SIG{TERM} = sub { exit(128 + 15) }; # trigger OnDestroy $reap
 	$self->SUPER::ipc_atfork_child;
 }
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index 4abe1345..0f7ffb5f 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -178,7 +178,7 @@ sub lei_p2q { # the "lei patch-to-query" entry point
 	my ($lei, $input) = @_;
 	my $self = $lei->{p2q} = bless {}, __PACKAGE__;
 	if ($lei->{opt}->{stdin}) {
-		$self->{0} = delete $lei->{0}; # guard from lei_atfork_child
+		$self->{0} = delete $lei->{0}; # guard from _lei_atfork_child
 	} else {
 		$self->{input} = $input;
 	}
@@ -191,7 +191,7 @@ sub lei_p2q { # the "lei patch-to-query" entry point
 sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	$lei->lei_atfork_child;
+	$lei->_lei_atfork_child;
 	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->SUPER::ipc_atfork_child;
 }
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index b5d43b7e..fa03f93c 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -322,7 +322,7 @@ sub done {
 sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	$lei->lei_atfork_child(1) if $lei;
+	$lei->_lei_atfork_child(1) if $lei;
 	$self->SUPER::ipc_atfork_child;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index e9ab939c..1be15707 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -636,7 +636,7 @@ sub do_post_auth {
 sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	$lei->lei_atfork_child;
+	$lei->_lei_atfork_child;
 	$lei->{auth}->do_auth_atfork($self) if $lei->{auth};
 	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->SUPER::ipc_atfork_child;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index b6aaf3e1..c6b82eeb 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -401,7 +401,7 @@ sub incr_start_query { # called whenever an l2m shard starts do_post_auth
 
 sub ipc_atfork_child {
 	my ($self) = @_;
-	$self->{lei}->lei_atfork_child;
+	$self->{lei}->_lei_atfork_child;
 	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->SUPER::ipc_atfork_child;
 }

^ permalink raw reply related	[relevance 56%]
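
Why a leading underscore is enough can be sketched as follows.  This
assumes the dispatcher maps a command word to a method by prefixing
"lei_" and looking it up with ->can; the Demo package and find_cmd
helper are hypothetical and only illustrate that lookup.

```perl
use strict;
use warnings;

package Demo;
sub new { bless {}, shift }
sub lei_daemon_pid { $$ }		# reachable as "daemon-pid"
sub _lei_atfork_child { 'internal' }	# no command word maps to this

# Map a command-line word to a method, or return undef if none exists.
sub find_cmd {
	my ($self, $cmd) = @_;
	(my $func = $cmd) =~ tr/-/_/;
	$self->can("lei_$func");
}

package main;
my $lei = Demo->new;
for my $cmd (qw(daemon-pid atfork_child)) {
	printf "%s: %s\n", $cmd,
		$lei->find_cmd($cmd) ? 'found' : 'no such command';
}
```

Since every looked-up name starts with "lei_", a sub named
"_lei_..." cannot be selected by any user-supplied word.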

* [PATCH 0/5] lei: more input + worker-related stuff
@ 2021-03-23 11:48 71% Eric Wong
  2021-03-23 11:48 70% ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
  To: meta

Drop a bunch of redundant code, yay!

Eric Wong (5):
  net_reader: nntp_each: pass keywords as `undef'
  test_common: check lei/errors.log
  lei: persistent workers (lei_store) run in /
  lei_input: more common code between <mark|convert|import>
  lei: improve management around short-lived workers

 lib/PublicInbox/LEI.pm         |  2 +-
 lib/PublicInbox/LeiConvert.pm  | 50 +++++-------------
 lib/PublicInbox/LeiExternal.pm |  3 +-
 lib/PublicInbox/LeiImport.pm   | 94 +++++++++-------------------------
 lib/PublicInbox/LeiInput.pm    | 45 ++++++++++++++--
 lib/PublicInbox/LeiMark.pm     | 59 ++++-----------------
 lib/PublicInbox/LeiMirror.pm   |  2 +-
 lib/PublicInbox/LeiP2q.pm      |  5 +-
 lib/PublicInbox/LeiQuery.pm    |  2 +-
 lib/PublicInbox/NetReader.pm   |  5 +-
 lib/PublicInbox/TestCommon.pm  | 13 +++--
 11 files changed, 109 insertions(+), 171 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 3/5] lei: persistent workers (lei_store) run in /
  2021-03-23 11:48 71% [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
  2021-03-23 11:48 70% ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
@ 2021-03-23 11:48 71% ` Eric Wong
  2021-03-23 11:48 47% ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
  To: meta

Since each lei->event_step can change the directory of
lei-daemon, we need to ensure the lei_store runs in a
directory that is stable.
---
 lib/PublicInbox/LEI.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17ca637e..d3ac19b2 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -453,6 +453,7 @@ sub _lei_atfork_child {
 	my ($self, $persist) = @_;
 	# we need to explicitly close things which are on stack
 	if ($persist) {
+		chdir '/' or die "chdir(/): $!";
 		my @io = delete @$self{qw(0 1 2 sock)};
 		unless ($self->{oneshot}) {
 			close($_) for @io;

^ permalink raw reply related	[relevance 71%]
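
A forked child inherits whatever directory the parent happened to be
in at fork time.  The sketch below shows a long-lived child pinning
itself to "/" first; run_in_stable_dir is a hypothetical helper and
not part of lei.

```perl
use strict;
use warnings;
use POSIX ();
use Cwd qw(getcwd);

# Fork a child that moves to "/" before doing its work, and return
# whatever the callback produced, passed back through a pipe.
sub run_in_stable_dir {
	my ($cb) = @_;
	pipe(my $r, my $w) or die "pipe: $!";
	my $pid = fork;
	defined($pid) or die "fork: $!";
	if ($pid == 0) { # child
		close $r;
		chdir '/' or die "chdir(/): $!";
		print {$w} $cb->();
		close $w;
		POSIX::_exit(0); # skip END blocks inherited from parent
	}
	close $w;
	my $out = do { local $/; <$r> };
	waitpid($pid, 0);
	$out;
}

print run_in_stable_dir(\&getcwd), "\n";
```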

* [PATCH 2/5] test_common: check lei/errors.log
  2021-03-23 11:48 71% [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
@ 2021-03-23 11:48 70% ` Eric Wong
  2021-03-23 11:48 71% ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
  2021-03-23 11:48 47% ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
  To: meta

This will make it easier to diagnose some large internal
rewrites.
---
 lib/PublicInbox/TestCommon.pm | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index e67e94ea..d4117b6c 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -507,7 +507,7 @@ SKIP: {
 Socket::MsgHdr missing or Inline::C is unconfigured/missing
 EOM
 	$lei_opt = { 1 => \$lei_out, 2 => \$lei_err };
-	my ($daemon_pid, $for_destroy);
+	my ($daemon_pid, $for_destroy, $daemon_xrd);
 	my $tmpdir = $test_opt->{tmpdir};
 	($tmpdir, $for_destroy) = tmpdir unless $tmpdir;
 	SKIP: {
@@ -515,9 +515,9 @@ EOM
 		my $home = "$tmpdir/lei-daemon";
 		mkdir($home, 0700) or BAIL_OUT "mkdir: $!";
 		local $ENV{HOME} = $home;
-		my $xrd = "$home/xdg_run";
-		mkdir($xrd, 0700) or BAIL_OUT "mkdir: $!";
-		local $ENV{XDG_RUNTIME_DIR} = $xrd;
+		$daemon_xrd = "$home/xdg_run";
+		mkdir($daemon_xrd, 0700) or BAIL_OUT "mkdir: $!";
+		local $ENV{XDG_RUNTIME_DIR} = $daemon_xrd;
 		$cb->();
 		lei_ok(qw(daemon-pid), \"daemon-pid after $t");
 		chomp($daemon_pid = $lei_out);
@@ -547,6 +547,11 @@ EOM
 			tick;
 		}
 		ok(!kill(0, $daemon_pid), "$t daemon stopped after oneshot");
+		my $f = "$daemon_xrd/lei/errors.log";
+		open my $fh, '<', $f or BAIL_OUT "$f: $!";
+		my @l = <$fh>;
+		is_deeply(\@l, [],
+			"$t daemon XDG_RUNTIME_DIR/lei/errors.log empty");
 	}
 }; # SKIP if missing git 2.6+ || Xapian || SQLite || json
 } # /test_lei

^ permalink raw reply related	[relevance 70%]

* [PATCH 5/5] lei: improve management around short-lived workers
  2021-03-23 11:48 71% [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
  2021-03-23 11:48 70% ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
  2021-03-23 11:48 71% ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
@ 2021-03-23 11:48 47% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
  To: meta

Instead of creating short-lived circular references,
ensure they don't exist in the first place.

Note the following changes to hold an extra ref to $sto:

	-	$self->_lei_store(1)->write_prepare($self);
	+	my $sto = $self->_lei_store(1);
	+	$sto->write_prepare($self);

I'm not a perlguts expert.  I actually wanted to switch to the
one-line version for LeiImport, but xt/lei-auth-fail.t was
getting stuck for some reason.  It seems the extra ref
to the LeiStore ($sto) object is necessary.
---
 lib/PublicInbox/LEI.pm         |  1 -
 lib/PublicInbox/LeiConvert.pm  |  3 ++-
 lib/PublicInbox/LeiExternal.pm |  3 ++-
 lib/PublicInbox/LeiImport.pm   | 20 +++++++-------------
 lib/PublicInbox/LeiMark.pm     |  2 +-
 lib/PublicInbox/LeiMirror.pm   |  2 +-
 lib/PublicInbox/LeiP2q.pm      |  5 +++--
 lib/PublicInbox/LeiQuery.pm    |  2 +-
 8 files changed, 17 insertions(+), 21 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index d3ac19b2..8cbaac01 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -462,7 +462,6 @@ sub _lei_atfork_child {
 		open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
 		delete $self->{0};
 	}
-	delete @$self{qw(cnv mark imp)};
 	for (delete @$self{qw(3 old_1 au_done)}) {
 		close($_) if defined($_);
 	}
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index bc86fe25..0cc65108 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -46,13 +46,14 @@ sub lei_convert { # the main "lei convert" method
 	my ($lei, @inputs) = @_;
 	$lei->{opt}->{kw} //= 1;
 	$lei->{opt}->{dedupe} //= 'none';
-	my $self = $lei->{cnv} = bless {}, __PACKAGE__;
+	my $self = bless {}, __PACKAGE__;
 	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
 	$lei->{l2m} or return
 		$lei->fail("output not specified or is not a mail destination");
 	$lei->{opt}->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
 	$self->prepare_inputs($lei, \@inputs) or return;
 	my $op = $lei->workers_start($self, 'lei_convert', 1);
+	$lei->{cnv} = $self;
 	$self->wq_io_do('do_convert', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 9a555831..56d6ef39 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -144,7 +144,8 @@ sub add_external_finish {
 
 sub lei_add_external {
 	my ($self, $location) = @_;
-	$self->_lei_store(1)->write_prepare($self);
+	my $sto = $self->_lei_store(1);
+	$sto->write_prepare($self);
 	my $opt = $self->{opt};
 	my $mirror = $opt->{mirror} // do {
 		my @fail;
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 991c84f2..9da6b7f9 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -58,9 +58,13 @@ sub net_merge_complete { # callback used by LeiAuth
 	$self->wq_close(1);
 }
 
-sub import_start {
-	my ($lei) = @_;
-	my $self = $lei->{imp};
+sub lei_import { # the main "lei import" method
+	my ($lei, @inputs) = @_;
+	my $sto = $lei->_lei_store(1);
+	$sto->write_prepare($lei);
+	my $self = bless {}, __PACKAGE__;
+	$self->{-import_kw} = $lei->{opt}->{kw} // 1;
+	$self->prepare_inputs($lei, \@inputs) or return;
 	$lei->ale; # initialize for workers to read
 	my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my $net = $lei->{net}) {
@@ -79,16 +83,6 @@ sub import_start {
 	while ($op && $op->{sock}) { $op->event_step }
 }
 
-sub lei_import { # the main "lei import" method
-	my ($lei, @inputs) = @_;
-	my $sto = $lei->_lei_store(1);
-	$sto->write_prepare($lei);
-	my $self = $lei->{imp} = bless {}, __PACKAGE__;
-	$self->{-import_kw} = $lei->{opt}->{kw} // 1;
-	$self->prepare_inputs($lei, \@inputs) or return;
-	import_start($lei);
-}
-
 no warnings 'once';
 *ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
 
diff --git a/lib/PublicInbox/LeiMark.pm b/lib/PublicInbox/LeiMark.pm
index 3b5e6c2c..9d77f4b4 100644
--- a/lib/PublicInbox/LeiMark.pm
+++ b/lib/PublicInbox/LeiMark.pm
@@ -105,8 +105,8 @@ sub input_net_cb { # imap_each, nntp_each cb
 sub lei_mark { # the "lei mark" method
 	my ($lei, @argv) = @_;
 	my $sto = $lei->_lei_store(1);
-	my $self = $lei->{mark} = bless { missing => 0 }, __PACKAGE__;
 	$sto->write_prepare($lei);
+	my $self = bless { missing => 0 }, __PACKAGE__;
 	$lei->ale; # refresh and prepare
 	my $vmd_mod = vmd_mod_extract(\@argv);
 	return $lei->fail(join("\n", @{$vmd_mod->{err}})) if $vmd_mod->{err};
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index c916f2d0..6e62625d 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -269,7 +269,6 @@ sub do_mirror { # via wq_io_do
 sub start {
 	my ($cls, $lei, $src, $dst) = @_;
 	my $self = bless { lei => $lei, src => $src, dst => $dst }, $cls;
-	$lei->{mrr} = $self;
 	if ($src =~ m!https?://!) {
 		require URI;
 		require PublicInbox::LeiCurl;
@@ -281,6 +280,7 @@ sub start {
 	my $op = $lei->workers_start($self, 'lei_mirror', 1, {
 		'' => [ \&mirror_done, $lei ]
 	});
+	$lei->{mrr} = $self;
 	$self->wq_io_do('do_mirror', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index 0f7ffb5f..fda055fe 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -176,13 +176,14 @@ sub do_p2q { # via wq_do
 
 sub lei_p2q { # the "lei patch-to-query" entry point
 	my ($lei, $input) = @_;
-	my $self = $lei->{p2q} = bless {}, __PACKAGE__;
+	my $self = bless {}, __PACKAGE__;
 	if ($lei->{opt}->{stdin}) {
 		$self->{0} = delete $lei->{0}; # guard from _lei_atfork_child
 	} else {
 		$self->{input} = $input;
 	}
-	my $op = $lei->workers_start($self, 'lei patch2query', 1);
+	my $op = $lei->workers_start($self, 'lei_p2q', 1);
+	$lei->{p2q} = $self;
 	$self->wq_io_do('do_p2q', []);
 	$self->wq_close(1);
 	while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 148e8524..84996e7e 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -50,11 +50,11 @@ sub lei_q {
 	# --local is enabled by default unless --only is used
 	# we'll allow "--only $LOCATION --local"
 	my $sto = $self->_lei_store(1);
-	my $lse = $sto->search;
 	if (($opt->{'import-remote'} //= 1) |
 			(($opt->{'import-before'} //= \1) ? 1 : 0)) {
 		$sto->write_prepare($self);
 	}
+	my $lse = $sto->search;
 	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
 		$lxs->prepare_external($lse);
 	}

^ permalink raw reply related	[relevance 47%]
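
The kind of cycle this patch avoids can be observed with a weak
reference.  This is a sketch, not lei code: still_alive_after is a
hypothetical helper, and the {imp}/{lei} keys merely mirror the names
used in the diff.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Build a worker which points at $lei, optionally let $lei point back
# at the worker, then report whether the worker outlived its scope.
sub still_alive_after {
	my ($with_cycle) = @_;
	my $probe;
	{
		my $lei = {};
		my $w = { lei => $lei };	# worker -> $lei
		$lei->{imp} = $w if $with_cycle; # $lei -> worker: a cycle
		weaken($probe = $w);	# observe without holding a ref
	}
	my $alive = defined($probe);
	# break the cycle so the demo itself does not leak
	delete $probe->{lei} if $alive;
	$alive;
}

printf "with cycle: %s\n", still_alive_after(1) ? 'leaked' : 'freed';
printf "without cycle: %s\n", still_alive_after(0) ? 'leaked' : 'freed';
```

Perl frees objects by reference count, so two hashes pointing at each
other stay allocated until one of the links is removed.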

* [PATCH 3/9] lei: drop circular reference in lei_store process
  2021-03-24  9:23 70% [PATCH 0/9] lei: various corner case leak fixes Eric Wong
@ 2021-03-24  9:23 71% ` Eric Wong
  2021-03-24  9:23 71% ` [PATCH 4/9] lei: update {3} after -C chdirs Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-24  9:23 UTC (permalink / raw)
  To: meta

I'm not sure if this was causing real problems, but it's sure ugly.
---
 lib/PublicInbox/LEI.pm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8cbaac01..ee991f80 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -458,6 +458,9 @@ sub _lei_atfork_child {
 		unless ($self->{oneshot}) {
 			close($_) for @io;
 		}
+		if (my $cfg = $self->{cfg}) {
+			delete $cfg->{-lei_store};
+		}
 	} else { # worker, Net::NNTP (Net::Cmd) uses STDERR directly
 		open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
 		delete $self->{0};

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/9] lei: various corner case leak fixes
@ 2021-03-24  9:23 70% Eric Wong
  2021-03-24  9:23 71% ` [PATCH 3/9] lei: drop circular reference in lei_store process Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-03-24  9:23 UTC (permalink / raw)
  To: meta

Making the test suite use a single lei-daemon for all tests has
uncovered a number of bugs that wouldn't surface during
normal usage.  These fixes could be useful if lei-daemon is
deployed as a multi-tenant server with both shared and "private"
storage support running under a single Unix user.

Eric Wong (9):
  ds: improve DS->Reset fork-safety
  mbox_lock: dotlock: chdir for relative lock paths
  lei: drop circular reference in lei_store process
  lei: update {3} after -C chdirs
  lei: clean up pkt_op consumer on exception, too
  lei_store: give process a better name
  v2writable: cleanup SQLite handles on --xapian-only
  lei_mirror: fix circular reference
  lei-daemon: do not leak FDs on bogus requests

 lib/PublicInbox/DS.pm         | 76 +++++++++++++++++++++--------------
 lib/PublicInbox/LEI.pm        | 45 ++++++++++++++++-----
 lib/PublicInbox/LeiMirror.pm  |  2 +-
 lib/PublicInbox/LeiStore.pm   |  6 ++-
 lib/PublicInbox/LeiXSearch.pm |  9 +----
 lib/PublicInbox/MboxLock.pm   | 14 +++++++
 lib/PublicInbox/V2Writable.pm |  1 +
 t/lei-daemon.t                | 29 +++++++++++++
 t/mbox_lock.t                 | 12 ++++++
 t/v2reindex.t                 | 10 ++++-
 10 files changed, 153 insertions(+), 51 deletions(-)

^ permalink raw reply	[relevance 70%]

* [PATCH 4/9] lei: update {3} after -C chdirs
  2021-03-24  9:23 70% [PATCH 0/9] lei: various corner case leak fixes Eric Wong
  2021-03-24  9:23 71% ` [PATCH 3/9] lei: drop circular reference in lei_store process Eric Wong
@ 2021-03-24  9:23 71% ` Eric Wong
  2021-03-24  9:23 61% ` [PATCH 5/9] lei: clean up pkt_op consumer on exception, too Eric Wong
  2021-03-24  9:23 58% ` [PATCH 9/9] lei-daemon: do not leak FDs on bogus requests Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-24  9:23 UTC (permalink / raw)
  To: meta

This is necessary for lei->rel2abs correctness, and may
eventually be useful if we can use *at syscalls.
---
 lib/PublicInbox/LEI.pm | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ee991f80..74372532 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -640,6 +640,11 @@ sub dispatch {
 				next if $d eq ''; # same as git(1)
 				chdir $d or return fail($self, "cd $d: $!");
 			}
+			if (delete $self->{3}) { # update cwd for rel2abs
+				opendir my $dh, '.' or
+					return fail($self, "opendir . $!");
+				$self->{3} = $dh;
+			}
 		}
 		$cb->($self, @argv);
 	} elsif (grep(/\A-/, $cmd, @argv)) { # --help or -h only

^ permalink raw reply related	[relevance 71%]
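
Holding a directory handle from opendir('.') gives a stable way back
to that directory, since chdir accepts a directory handle on systems
with fchdir(2).  How lei consumes {3} is not shown in this patch, so
the sketch below only illustrates the handle mechanism;
with_saved_cwd is a hypothetical helper.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Run $cb, which may chdir anywhere, then return to the directory we
# started in via a handle instead of a possibly stale path string.
sub with_saved_cwd {
	my ($cb) = @_;
	opendir(my $dh, '.') or die "opendir .: $!";
	my @ret = $cb->();
	chdir($dh) or die "chdir(dirhandle): $!";
	@ret;
}

my $start = getcwd();
with_saved_cwd(sub { chdir '/' or die "chdir(/): $!" });
print getcwd() eq $start ? "back where we started\n" : "moved\n";
```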

* [PATCH 5/9] lei: clean up pkt_op consumer on exception, too
  2021-03-24  9:23 70% [PATCH 0/9] lei: various corner case leak fixes Eric Wong
  2021-03-24  9:23 71% ` [PATCH 3/9] lei: drop circular reference in lei_store process Eric Wong
  2021-03-24  9:23 71% ` [PATCH 4/9] lei: update {3} after -C chdirs Eric Wong
@ 2021-03-24  9:23 61% ` Eric Wong
  2021-03-24  9:23 58% ` [PATCH 9/9] lei-daemon: do not leak FDs on bogus requests Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-24  9:23 UTC (permalink / raw)
  To: meta

We need to consistently ensure pkt_op_c doesn't lead to a
long-lived circular reference if an exception is thrown in
pre_augment.  Maybe the API could be better, but this fixes an
FD leak when attempting to --augment a FIFO.

Followup-to: b9524082ba39e665 ("lei_xsearch: cleanup {pkt_op_p} on exceptions")
---
 lib/PublicInbox/LEI.pm        | 22 ++++++++++++++++++++--
 lib/PublicInbox/LeiXSearch.pm |  9 ++-------
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 74372532..878685f1 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -482,6 +482,24 @@ sub _lei_atfork_child {
 	$current_lei = $persist ? undef : $self; # for SIG{__WARN__}
 }
 
+sub _delete_pkt_op { # OnDestroy callback to prevent leaks on die
+	my ($self) = @_;
+	if (my $op = delete $self->{pkt_op_c}) { # in case of die
+		$op->close; # PublicInbox::PktOp::close
+	}
+	my $unclosed_after_die = delete($self->{pkt_op_p}) or return;
+	close $unclosed_after_die;
+}
+
+sub pkt_op_pair {
+	my ($self, $ops) = @_;
+	require PublicInbox::OnDestroy;
+	require PublicInbox::PktOp;
+	my $end = PublicInbox::OnDestroy->new($$, \&_delete_pkt_op, $self);
+	@$self{qw(pkt_op_c pkt_op_p)} = PublicInbox::PktOp->pair($ops);
+	$end;
+}
+
 sub workers_start {
 	my ($lei, $wq, $ident, $jobs, $ops) = @_;
 	$ops = {
@@ -492,11 +510,11 @@ sub workers_start {
 		($ops ? %$ops : ()),
 	};
 	$ops->{''} //= [ \&dclose, $lei ];
-	require PublicInbox::PktOp;
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	my $end = $lei->pkt_op_pair($ops);
 	$wq->wq_workers_start($ident, $jobs, $lei->oldset, { lei => $lei });
 	delete $lei->{pkt_op_p};
 	my $op = delete $lei->{pkt_op_c};
+	@$end = ();
 	$lei->event_step_init;
 	# oneshot needs $op, daemon-mode uses DS->EventLoop to handle $op
 	$lei->{oneshot} ? $op : undef;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index c6b82eeb..58b6cfc0 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -406,11 +406,6 @@ sub ipc_atfork_child {
 	$self->SUPER::ipc_atfork_child;
 }
 
-sub delete_pkt_op { # OnDestroy callback
-	my $unclosed_after_die = delete($_[0])->{pkt_op_p} or return;
-	close $unclosed_after_die;
-}
-
 sub do_query {
 	my ($self, $lei) = @_;
 	my $l2m = $lei->{l2m};
@@ -426,8 +421,7 @@ sub do_query {
 		'incr_start_query' => [ \&incr_start_query, $self, $l2m ],
 	};
 	$lei->{auth}->op_merge($ops, $l2m) if $l2m && $lei->{auth};
-	my $od = PublicInbox::OnDestroy->new($$, \&delete_pkt_op, $lei);
-	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	my $end = $lei->pkt_op_pair($ops);
 	$lei->{1}->autoflush(1);
 	$lei->start_pager if delete $lei->{need_pager};
 	$lei->{ovv}->ovv_begin($lei);
@@ -446,6 +440,7 @@ sub do_query {
 				$lei->oldset, { lei => $lei });
 	my $op = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
+	@$end = ();
 	$self->{threads} = $lei->{opt}->{threads};
 	if ($l2m) {
 		$l2m->net_merge_complete unless $lei->{auth};

^ permalink raw reply related	[relevance 61%]
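
The guard pattern used by pkt_op_pair can be sketched with a tiny
stand-in class.  Guard below is hypothetical: it imitates the idea of
PublicInbox::OnDestroy (a blessed array holding a callback and its
arguments) but omits the PID argument the real constructor takes.

```perl
use strict;
use warnings;

package Guard; # hypothetical stand-in for PublicInbox::OnDestroy
sub new { my ($cls, $cb, @args) = @_; bless [ $cb, @args ], $cls }
sub DESTROY {
	my ($cb, @args) = @{$_[0]};
	$cb->(@args) if $cb; # nothing to do once disarmed
}

package main;

# Open a resource, arm a guard for it, and disarm only on success.
sub risky {
	my ($state, $fail) = @_;
	$state->{pipe} = 'open';
	my $end = Guard->new(sub { $_[0]->{pipe} = 'closed by guard' },
				$state);
	die "pre_augment failed\n" if $fail;
	$state->{pipe} = 'handed off';
	@$end = (); # success: empty the guard so DESTROY is a no-op
}

my %ok;
risky(\%ok, 0);
my %bad;
eval { risky(\%bad, 1) };
print "success: $ok{pipe}; failure: $bad{pipe}\n";
```

Emptying the array with "@$end = ()" disarms the guard without needing
a separate method, which is what both callers in the patch do once the
workers have started.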

* [PATCH 9/9] lei-daemon: do not leak FDs on bogus requests
  2021-03-24  9:23 70% [PATCH 0/9] lei: various corner case leak fixes Eric Wong
                   ` (2 preceding siblings ...)
  2021-03-24  9:23 61% ` [PATCH 5/9] lei: clean up pkt_op consumer on exception, too Eric Wong
@ 2021-03-24  9:23 58% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-24  9:23 UTC (permalink / raw)
  To: meta

If a client passes us the incorrect number of FDs, we'll vivify
them into PerlIO objects so they can be auto-closed.  Using
POSIX::close was considered, but it would've been more code to
handle an uncommon case.
---
 lib/PublicInbox/LEI.pm | 15 +++++++--------
 t/lei-daemon.t         | 29 +++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 878685f1..e5211764 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -981,17 +981,16 @@ sub accept_dispatch { # Listener {post_accept} callback
 		return send($sock, 'timed out waiting to recv FDs', MSG_EOR);
 	# (4096 * 33) >MAX_ARG_STRLEN
 	my @fds = $recv_cmd->($sock, my $buf, 4096 * 33) or return; # EOF
-	if (scalar(@fds) == 4) {
-		for my $i (0..3) {
-			my $fd = shift(@fds);
-			open($self->{$i}, '+<&=', $fd) and next;
-			send($sock, "open(+<&=$fd) (FD=$i): $!", MSG_EOR);
-		}
-	} elsif (!defined($fds[0])) {
+	if (!defined($fds[0])) {
 		warn(my $msg = "recv_cmd failed: $!");
 		return send($sock, $msg, MSG_EOR);
 	} else {
-		return;
+		my $i = 0;
+		for my $fd (@fds) {
+			open($self->{$i++}, '+<&=', $fd) and next;
+			send($sock, "open(+<&=$fd) (FD=$i): $!", MSG_EOR);
+		}
+		return if scalar(@fds) != 4;
 	}
 	$self->{2}->autoflush(1); # keep stdout buffered until x_it|DESTROY
 	# $ENV_STR = join('', map { "\0$_=$ENV{$_}" } keys %ENV);
diff --git a/t/lei-daemon.t b/t/lei-daemon.t
index c30e5ac1..35e059b9 100644
--- a/t/lei-daemon.t
+++ b/t/lei-daemon.t
@@ -2,8 +2,16 @@
 # Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
+use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
+use PublicInbox::Spawn qw(which);
 
 test_lei({ daemon_only => 1 }, sub {
+	my $send_cmd = PublicInbox::Spawn->can('send_cmd4') // do {
+		require PublicInbox::CmdIPC4;
+		PublicInbox::CmdIPC4->can('send_cmd4');
+	};
+	$send_cmd or BAIL_OUT 'started testing lei-daemon w/o send_cmd4!';
+
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/5.seq.sock";
 	my $err_log = "$ENV{XDG_RUNTIME_DIR}/lei/errors.log";
 	lei_ok('daemon-pid');
@@ -22,6 +30,27 @@ test_lei({ daemon_only => 1 }, sub {
 	is($pid, $pid_again, 'daemon-pid idempotent');
 	like($lei_err, qr/phail/, 'got mock "phail" error previous run');
 
+	SKIP: {
+		skip 'only testing open files on Linux', 1 if $^O ne 'linux';
+		my $d = "/proc/$pid/fd";
+		skip "no $d on Linux" unless -d $d;
+		my @before = sort(glob("$d/*"));
+		my $addr = pack_sockaddr_un($sock);
+		open my $null, '<', '/dev/null' or BAIL_OUT "/dev/null: $!";
+		my @fds = map { fileno($null) } (0..2);
+		for (0..10) {
+			socket(my $c, AF_UNIX, SOCK_SEQPACKET, 0) or
+							BAIL_OUT "socket: $!";
+			connect($c, $addr) or BAIL_OUT "connect: $!";
+			$send_cmd->($c, \@fds, 'hi',  MSG_EOR);
+		}
+		lei_ok('daemon-pid');
+		chomp($pid = $lei_out);
+		is($pid, $pid_again, 'pid unchanged after failed reqs');
+		my @after = sort(glob("$d/*"));
+		is_deeply(\@before, \@after, 'open files unchanged') or
+			diag explain([\@before, \@after]);;
+	}
 	lei_ok(qw(daemon-kill));
 	is($lei_out, '', 'no output from daemon-kill');
 	is($lei_err, '', 'no error from daemon-kill');

^ permalink raw reply related	[relevance 58%]
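
Turning raw descriptor numbers into Perl handles lets normal scoping
close them.  A minimal sketch, assuming a system with /dev/null;
adopt_and_drop is a hypothetical helper, while the '+<&=' open mode
is the same one used in the patch.

```perl
use strict;
use warnings;
use POSIX ();

# Wrap each descriptor in a Perl handle via fdopen-style open.  The
# handles are closed automatically when @fhs goes out of scope.
sub adopt_and_drop {
	my (@fds) = @_;
	my @fhs;
	for my $fd (@fds) {
		open(my $fh, '+<&=', $fd) or warn "open(+<&=$fd): $!";
		push @fhs, $fh;
	}
	scalar(@fhs);
}

open(my $null, '+<', '/dev/null') or die "/dev/null: $!";
my @fds = map { POSIX::dup(fileno($null)) // die "dup: $!" } 1..3;
adopt_and_drop(@fds);
# POSIX::close returns undef if the descriptor is already closed
my @still_open = grep { defined POSIX::close($_) } @fds;
print scalar(@still_open), " descriptors leaked\n";
```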

* [PATCH 00/10] lei testing improvements
@ 2021-03-25  4:20 68% Eric Wong
  2021-03-25  4:20 71% ` [PATCH 02/10] lei: janky $PATH2CFG garbage collection Eric Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 200+ results
From: Eric Wong @ 2021-03-25  4:20 UTC (permalink / raw)
  To: meta

[7/10] is the centerpiece and gives a ~10% speedup for tests,
which are still slow to me.  Speedup or not, it's uncovered a
bunch of subtle bugs over the past few days so I'm glad I
worked on it.

There are still some rare errors that come from looping
"make check-run TEST_LEI_ERR_LOUD" which I'm still trying
to figure out...

Eric Wong (10):
  test_common: cleanup inbox objects after use
  lei: janky $PATH2CFG garbage collection
  test_common: TEST_LEI_ERR_LOUD does not hide path names
  lei add-external: do not initialize writable store
  lei_mirror: don't show success on failure
  t/*: drop unnecessary v1-specific index calls
  tests: "check-run" uses persistent lei daemon
  lei import: force store, improve test diagnostics
  t/cmd_ipc: workaround signal handling raciness
  t/lei: add more diagnostics for failures

 lib/PublicInbox/LEI.pm         |  6 +++++
 lib/PublicInbox/LeiExternal.pm |  2 --
 lib/PublicInbox/LeiImport.pm   |  6 ++---
 lib/PublicInbox/LeiMirror.pm   | 11 ++++++---
 lib/PublicInbox/TestCommon.pm  | 41 +++++++++++++++++++++++-----------
 t/cmd_ipc.t                    | 28 ++++++++++++++++-------
 t/inbox_idle.t                 |  2 --
 t/lei-externals.t              |  7 ++++--
 t/lei-import-maildir.t         | 13 +++++++----
 t/lei-mark.t                   |  2 +-
 t/lei-mirror.t                 | 18 +++++++++++++++
 t/lei-q-kw.t                   |  6 ++---
 t/lei-q-thread.t               | 15 +++++++------
 t/nntpd.t                      |  4 ----
 t/run.perl                     | 19 ++++++++++++++++
 t/v2mda.t                      |  4 ----
 t/watch_filter_rubylang.t      |  7 ++----
 17 files changed, 130 insertions(+), 61 deletions(-)

^ permalink raw reply	[relevance 68%]

* [PATCH 02/10] lei: janky $PATH2CFG garbage collection
  2021-03-25  4:20 68% [PATCH 00/10] lei testing improvements Eric Wong
@ 2021-03-25  4:20 71% ` Eric Wong
  2021-03-25  4:20 71% ` [PATCH 04/10] lei add-external: do not initialize writable store Eric Wong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-25  4:20 UTC (permalink / raw)
  To: meta

We need to rely on this to keep our config cache (and lei_store
pipes) under control, since each test creates a new config and
directory.
---
 lib/PublicInbox/LEI.pm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e5211764..d534f1d0 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -705,6 +705,12 @@ sub _lei_cfg ($;$) {
 			File::Spec->canonpath($cfg->{'leistore.dir'})) {
 		$cfg->{-lei_store} = $sto;
 	}
+	if (scalar(keys %PATH2CFG) > 5) {
+		# FIXME: use inotify/EVFILT_VNODE to detect unlinked configs
+		for my $k (keys %PATH2CFG) {
+			delete($PATH2CFG{$k}) unless -f $k
+		}
+	}
 	$self->{cfg} = $PATH2CFG{$f} = $cfg;
 }
 

^ permalink raw reply related	[relevance 71%]
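
The pruning idea in the patch above can be sketched in isolation.
This is a minimal illustration, not the code from LEI.pm: %cache is
a hypothetical stand-in for %PATH2CFG, and the temporary files stand
in for per-test config files.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-in for %PATH2CFG: config path => cached config
my %cache;
my $dir = tempdir(CLEANUP => 1);

# Create seven config files and cache an entry for each.
for my $i (1..7) {
	my $path = "$dir/config.$i";
	open(my $fh, '>', $path) or die "open $path: $!";
	close($fh) or die "close $path: $!";
	$cache{$path} = { id => $i };
}

# Simulate finished tests removing their configs.
unlink("$dir/config.$_") or die "unlink: $!" for (1..4);

# Prune only once the cache grows past a small threshold; keys()
# returns a copy of the key list, so deleting while looping is safe.
if (scalar(keys %cache) > 5) {
	for my $k (keys %cache) {
		delete($cache{$k}) unless -f $k;
	}
}

print scalar(keys %cache), " entries remain\n";
```

The threshold keeps the stat(2) calls off the common path where only
a handful of configs are ever cached.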

* [PATCH 04/10] lei add-external: do not initialize writable store
  2021-03-25  4:20 68% [PATCH 00/10] lei testing improvements Eric Wong
  2021-03-25  4:20 71% ` [PATCH 02/10] lei: janky $PATH2CFG garbage collection Eric Wong
@ 2021-03-25  4:20 71% ` Eric Wong
  2021-03-25  4:20 57% ` [PATCH 07/10] tests: "check-run" uses persistent lei daemon Eric Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-25  4:20 UTC (permalink / raw)
  To: meta

There's no need to create or write lei/store when adding
an external; we just need to write to the config file.
---
 lib/PublicInbox/LeiExternal.pm | 2 --
 t/lei-externals.t              | 3 +--
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 56d6ef39..5e8dc71a 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -144,8 +144,6 @@ sub add_external_finish {
 
 sub lei_add_external {
 	my ($self, $location) = @_;
-	my $sto = $self->_lei_store(1);
-	$sto->write_prepare($self);
 	my $opt = $self->{opt};
 	my $mirror = $opt->{mirror} // do {
 		my @fail;
diff --git a/t/lei-externals.t b/t/lei-externals.t
index 2045691f..afd90d19 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -93,8 +93,7 @@ test_lei(sub {
 				\'added external');
 		is($lei_out.$lei_err, '', 'no output');
 	});
-	ok(-s $config_file && -e $store_dir,
-		'add-external created config + store');
+	ok(-s $config_file, 'add-external created config');
 	my $lcfg = PublicInbox::Config->new($config_file);
 	$cfg->each_inbox(sub {
 		my ($ibx) = @_;

^ permalink raw reply related	[relevance 71%]

* [PATCH 07/10] tests: "check-run" uses persistent lei daemon
  2021-03-25  4:20 68% [PATCH 00/10] lei testing improvements Eric Wong
  2021-03-25  4:20 71% ` [PATCH 02/10] lei: janky $PATH2CFG garbage collection Eric Wong
  2021-03-25  4:20 71% ` [PATCH 04/10] lei add-external: do not initialize writable store Eric Wong
@ 2021-03-25  4:20 57% ` Eric Wong
  2021-03-25  4:20 52% ` [PATCH 08/10] lei import: force store, improve test diagnostics Eric Wong
  2021-03-25  4:20 56% ` [PATCH 10/10] t/lei: add more diagnostics for failures Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-25  4:20 UTC (permalink / raw)
  To: meta

We'll use a lei-daemon if it's already running and
TEST_LEI_DAEMON_PERSIST_DIR is set, but we can also start
one and manage it from t/run.perl.

This drops "make check-run TEST_LEI_DAEMON_ONLY=1"
time by ~10% for me.
---
 lib/PublicInbox/TestCommon.pm | 22 +++++++++++++++-------
 t/lei-externals.t             |  4 ++++
 t/run.perl                    | 19 +++++++++++++++++++
 3 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index ffff5902..ca165a04 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -515,22 +515,30 @@ EOM
 	my ($daemon_pid, $for_destroy, $daemon_xrd);
 	my $tmpdir = $test_opt->{tmpdir};
 	($tmpdir, $for_destroy) = tmpdir unless $tmpdir;
+	state $persist_xrd = $ENV{TEST_LEI_DAEMON_PERSIST_DIR};
 	SKIP: {
 		skip 'TEST_LEI_ONESHOT set', 1 if $ENV{TEST_LEI_ONESHOT};
 		my $home = "$tmpdir/lei-daemon";
 		mkdir($home, 0700) or BAIL_OUT "mkdir: $!";
 		local $ENV{HOME} = $home;
-		$daemon_xrd = "$home/xdg_run";
-		mkdir($daemon_xrd, 0700) or BAIL_OUT "mkdir: $!";
+		my $persist;
+		if ($persist_xrd && !$test_opt->{daemon_only}) {
+			$persist = $daemon_xrd = $persist_xrd;
+		} else {
+			$daemon_xrd = "$home/xdg_run";
+			mkdir($daemon_xrd, 0700) or BAIL_OUT "mkdir: $!";
+		}
 		local $ENV{XDG_RUNTIME_DIR} = $daemon_xrd;
 		$cb->();
-		lei_ok(qw(daemon-pid), \"daemon-pid after $t");
-		chomp($daemon_pid = $lei_out);
-		if ($daemon_pid) {
+		unless ($persist) {
+			lei_ok(qw(daemon-pid), \"daemon-pid after $t");
+			chomp($daemon_pid = $lei_out);
+			if (!$daemon_pid) {
+				fail("daemon not running after $t");
+				skip 'daemon died unexpectedly', 2;
+			}
 			ok(kill(0, $daemon_pid), "daemon running after $t");
 			lei_ok(qw(daemon-kill), \"daemon-kill after $t");
-		} else {
-			fail("daemon not running after $t");
 		}
 	}; # SKIP for lei_daemon
 	unless ($test_opt->{daemon_only}) {
diff --git a/t/lei-externals.t b/t/lei-externals.t
index afd90d19..488bf5ad 100644
--- a/t/lei-externals.t
+++ b/t/lei-externals.t
@@ -57,6 +57,8 @@ SKIP: {
 		chomp(my $pid_after = $lei_out);
 		is($pid_after, $pid_before, 'pid unchanged') or
 			skip 'daemon died', 1;
+		skip 'not killing persistent lei-daemon', 2 if
+				$ENV{TEST_LEI_DAEMON_PERSIST_DIR};
 		lei_ok 'daemon-kill';
 		my $alive = 1;
 		for (1..100) {
@@ -262,6 +264,8 @@ test_lei(sub {
 	}
 
 	{
+		skip 'TEST_LEI_DAEMON_PERSIST_DIR in use', 1 if
+					$ENV{TEST_LEI_DAEMON_PERSIST_DIR};
 		opendir my $dh, '.' or BAIL_OUT "opendir(.) $!";
 		my $od = PublicInbox::OnDestroy->new($$, sub {
 			chdir $dh or BAIL_OUT "chdir: $!"
diff --git a/t/run.perl b/t/run.perl
index e8512e18..1fb1c5f0 100755
--- a/t/run.perl
+++ b/t/run.perl
@@ -14,6 +14,7 @@ use strict;
 use v5.10.1;
 use IO::Handle; # ->autoflush
 use PublicInbox::TestCommon;
+use PublicInbox::Spawn;
 use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
 use Errno qw(EINTR);
 use Fcntl qw(:seek);
@@ -40,6 +41,20 @@ $OLDERR->autoflush(1);
 
 key2sub($_) for @tests; # precache
 
+my ($for_destroy, $lei_env, $lei_daemon_pid, $owner_pid);
+if (!$ENV{TEST_LEI_DAEMON_PERSIST_DIR} &&
+		(PublicInbox::Spawn->can('recv_cmd4') ||
+			eval { require Socket::MsgHdr })) {
+	$lei_env = {};
+	($lei_env->{XDG_RUNTIME_DIR}, $for_destroy) = tmpdir;
+	$ENV{TEST_LEI_DAEMON_PERSIST_DIR} = $lei_env->{XDG_RUNTIME_DIR};
+	run_script([qw(lei daemon-pid)], $lei_env, { 1 => \$lei_daemon_pid });
+	chomp $lei_daemon_pid;
+	$lei_daemon_pid =~ /\A[0-9]+\z/ or die "no daemon pid: $lei_daemon_pid";
+	kill(0, $lei_daemon_pid) or die "kill $lei_daemon_pid: $!";
+	$owner_pid = $$;
+}
+
 if ($shuffle) {
 	require List::Util;
 } elsif (open(my $prove_state, '<', '.prove') && eval { require YAML::XS }) {
@@ -209,3 +224,7 @@ for (my $i = $repeat; $i != 0; $i--) {
 }
 
 print $OLDOUT "1..".($repeat * scalar(@tests))."\n" if $repeat >= 0;
+if ($lei_env && $$ == $owner_pid) {
+	my $opt = {}; # 1 => $OLDOUT, 2 => $OLDERR };
+	run_script([qw(lei daemon-kill)], $lei_env, $opt);
+}

^ permalink raw reply related	[relevance 57%]
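
The "$$ == $owner_pid" guard at the end of t/run.perl matters because
forked children inherit the same variables. A minimal sketch of that
pattern follows, assuming a hypothetical cleanup_if_owner() helper and
a log file in place of the real "lei daemon-kill" call.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempfile);

my ($fh, $log) = tempfile(UNLINK => 1);
close($fh) or die "close $log: $!";

# Remember which process started the shared resource.
my $owner_pid = $$;

# Hypothetical helper: only the owner tears the resource down.
sub cleanup_if_owner {
	return unless $$ == $owner_pid;
	open(my $out, '>>', $log) or die "open $log: $!";
	print $out "cleanup by $$\n";
	close($out) or die "close $log: $!";
}

my $pid = fork;
defined($pid) or die "fork: $!";
if ($pid == 0) {
	# The child sees the same $owner_pid but a different $$,
	# so this call is a no-op.
	cleanup_if_owner();
	POSIX::_exit(0);
}
waitpid($pid, 0);
cleanup_if_owner(); # parent: performs the cleanup exactly once
```

Without the guard, every child that reaches the end of the script
would try to tear down the daemon that the other tests still share.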

* [PATCH 08/10] lei import: force store, improve test diagnostics
  2021-03-25  4:20 68% [PATCH 00/10] lei testing improvements Eric Wong
                   ` (2 preceding siblings ...)
  2021-03-25  4:20 57% ` [PATCH 07/10] tests: "check-run" uses persistent lei daemon Eric Wong
@ 2021-03-25  4:20 52% ` Eric Wong
  2021-03-25  4:20 56% ` [PATCH 10/10] t/lei: add more diagnostics for failures Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-25  4:20 UTC (permalink / raw)
  To: meta

"lei import" should never be without a {sto}, and *_done should
not be called multiple times, so ensure we can fail if it's
missing.

Update some existing tests to complain loudly by introducing a
handy "xbail" function which wraps "explain" and BAIL_OUT.
BAIL_OUT was painful to type and concatenating the result of
"explain" doesn't work as I thought it would since "explain"
always returns an array, and BAIL_OUT only accepts a single
scalar arg (unlike "die").
---
 lib/PublicInbox/LeiImport.pm  | 6 +++---
 lib/PublicInbox/TestCommon.pm | 4 +++-
 t/lei-mark.t                  | 2 +-
 t/lei-q-kw.t                  | 6 +++---
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 9da6b7f9..7c5b7d09 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -39,14 +39,14 @@ sub import_done_wait { # dwaitpid callback
 	my ($arg, $pid) = @_;
 	my ($imp, $lei) = @$arg;
 	$lei->child_error($?, 'non-fatal errors during import') if $?;
-	my $sto = delete $lei->{sto};
-	my $wait = $sto->ipc_do('done') if $sto; # PublicInbox::LeiStore::done
+	my $sto = delete $lei->{sto} // return $lei->fail('BUG: {sto} gone');
+	my $wait = $sto->ipc_do('done'); # PublicInbox::LeiStore::done
 	$lei->dclose;
 }
 
 sub import_done { # EOF callback for main daemon
 	my ($lei) = @_;
-	my $imp = delete $lei->{imp} or return;
+	my $imp = delete $lei->{imp} // return $lei->fail('BUG: {imp} gone');
 	$imp->wq_wait_old(\&import_done_wait, $lei);
 }
 
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index ca165a04..72617a78 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -17,7 +17,7 @@ BEGIN {
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
 		have_xapian_compact json_utf8 setup_public_inboxes create_inbox
 		tcp_host_port test_lei lei lei_ok $lei_out $lei_err $lei_opt
-		test_httpd);
+		test_httpd xbail);
 	require Test::More;
 	my @methods = grep(!/\W/, @Test::More::EXPORT);
 	eval(join('', map { "*$_=\\&Test::More::$_;" } @methods));
@@ -25,6 +25,8 @@ BEGIN {
 	push @EXPORT, @methods;
 }
 
+sub xbail (@) { BAIL_OUT join(' ', map { ref ? (explain($_)) : ($_) } @_) }
+
 sub eml_load ($) {
 	my ($path, $cb) = @_;
 	open(my $fh, '<', $path) or die "open $path: $!";
diff --git a/t/lei-mark.t b/t/lei-mark.t
index ddf5634c..76995589 100644
--- a/t/lei-mark.t
+++ b/t/lei-mark.t
@@ -30,7 +30,7 @@ test_lei(sub {
 	ok(-s $mb, 'wrote mbox result');
 	lei_ok(qw(q m:testmessage@example.com -o), $md);
 	my @fn = glob("$md/cur/*");
-	scalar(@fn) == 1 or BAIL_OUT 'no mail '.explain(\@fn);
+	scalar(@fn) == 1 or xbail $lei_err, 'no mail', \@fn;
 	rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
 	$check_kw->(['flagged'], msg => 'after bad request');
 	lei_ok(qw(mark -F eml t/utf8.eml -kw:flagged));
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index 4db27363..c17411fb 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -21,7 +21,7 @@ lei_ok(qw(import -F eml t/plack-qp.eml));
 my $o = "$ENV{HOME}/dst";
 lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
 my @fn = glob("$o/cur/*:2,");
-scalar(@fn) == 1 or BAIL_OUT "wrote multiple or zero files: ".explain(\@fn);
+scalar(@fn) == 1 or xbail $lei_err, 'wrote multiple or zero files:', \@fn;
 rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
 
 lei_ok(qw(q -o), "maildir:$o", qw(m:bogus-noresults@example.com));
@@ -124,7 +124,7 @@ lei_ok(qw(q -o), $o, "m:$m", @inc);
 
 # emulate MUA marking a Maildir message as read:
 @fn = glob("$o/cur/*");
-scalar(@fn) == 1 or BAIL_OUT "wrote multiple or zero files: ".explain(\@fn);
+scalar(@fn) == 1 or xbail $lei_err, 'wrote multiple or zero files:', \@fn;
 rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
 
 lei_ok(qw(q -o), $o, 'bogus', \'clobber output dir to import keywords');
@@ -178,7 +178,7 @@ $m = 'multipart@example.com';
 $o = "$ENV{HOME}/fuzz";
 lei_ok('q', '-o', $o, "m:$m", @inc);
 @fn = glob("$o/cur/*");
-scalar(@fn) == 1 or BAIL_OUT "wrote multiple or zero files: ".explain(\@fn);
+scalar(@fn) == 1 or xbail $lei_err, "wrote multiple or zero files", \@fn;
 rename($fn[0], "$fn[0]S") or BAIL_OUT "rename $!";
 lei_ok('q', '-o', $o, "m:$m");
 is_deeply([glob("$o/cur/*")], [], 'clobbered output results');

^ permalink raw reply related	[relevance 52%]
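
The point about "explain" returning a list can be shown without
aborting a test run. This sketch assumes a hypothetical bail_msg()
helper that builds the same kind of string xbail passes to BAIL_OUT,
but only prints it with diag so the script runs to completion.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: flatten scalars and references into one string,
# expanding references with explain() in list context.
sub bail_msg {
	join(' ', map { ref($_) ? explain($_) : $_ } @_);
}

my @fn = ('cur/a:2,', 'cur/b:2,');
my $msg = bail_msg('wrote multiple or zero files:', \@fn);
diag $msg;
ok(index($msg, 'cur/a:2,') >= 0, 'dumped array contents are in the message');
done_testing;
```

Calling explain() inside map keeps it in list context, and join()
then yields the single scalar that BAIL_OUT accepts.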

* [PATCH 10/10] t/lei: add more diagnostics for failures
  2021-03-25  4:20 68% [PATCH 00/10] lei testing improvements Eric Wong
                   ` (3 preceding siblings ...)
  2021-03-25  4:20 52% ` [PATCH 08/10] lei import: force store, improve test diagnostics Eric Wong
@ 2021-03-25  4:20 56% ` Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-25  4:20 UTC (permalink / raw)
  To: meta

This seems to error out while looping the test suite and
I'm not 100% sure why.
---
 t/lei-import-maildir.t | 13 +++++++++----
 t/lei-q-thread.t       | 15 ++++++++-------
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index bd89677a..6706b014 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -11,17 +11,20 @@ test_lei(sub {
 	symlink(abs_path('t/data/0001.patch'), "$md/cur/x:2,S") or
 		BAIL_OUT "symlink $md $!";
 	lei_ok(qw(import), $md, \'import Maildir');
+	my $imp_err = $lei_err;
 	lei_ok(qw(q s:boolean));
 	my $res = json_utf8->decode($lei_out);
-	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
+	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result')
+			or diag explain($imp_err, $res);
 	is_deeply($res->[0]->{kw}, ['seen'], 'keyword set');
 	is($res->[1], undef, 'only got one result');
 
 	lei_ok(qw(import), $md, \'import Maildir again');
+	$imp_err = $lei_err;
 	lei_ok(qw(q -d none s:boolean), \'lei q w/o dedupe');
 	my $r2 = json_utf8->decode($lei_out);
-	is_deeply($r2, $res, 'idempotent import');
-
+	is_deeply($r2, $res, 'idempotent import')
+			or diag explain($imp_err, $res);
 	rename("$md/cur/x:2,S", "$md/cur/x:2,SR") or BAIL_OUT "rename: $!";
 	lei_ok('import', "maildir:$md", \'import Maildir after +answered');
 	lei_ok(qw(q -d none s:boolean), \'lei q after +answered');
@@ -33,8 +36,10 @@ test_lei(sub {
 	symlink(abs_path('t/utf8.eml'), "$md/cur/u:2,ST") or
 		BAIL_OUT "symlink $md $!";
 	lei_ok('import', "maildir:$md", \'import Maildir w/ trashed message');
+	$imp_err = $lei_err;
 	lei_ok(qw(q -d none m:testmessage@example.com));
 	$res = json_utf8->decode($lei_out);
-	is_deeply($res, [ undef ], 'trashed message not imported');
+	is_deeply($res, [ undef ], 'trashed message not imported')
+			or diag explain($imp_err, $res);
 });
 done_testing;
diff --git a/t/lei-q-thread.t b/t/lei-q-thread.t
index c999d12b..26d06eec 100644
--- a/t/lei-q-thread.t
+++ b/t/lei-q-thread.t
@@ -13,7 +13,8 @@ test_lei(sub {
 
 	lei_ok qw(q -t m:testmessage@example.com);
 	my $res = json_utf8->decode($lei_out);
-	is_deeply($res->[0]->{kw}, [ 'seen' ], 'q -t sets keywords');
+	is_deeply($res->[0]->{kw}, [ 'seen' ], 'q -t sets keywords') or
+		diag explain($res);
 
 	$eml = eml_load('t/utf8.eml');
 	$eml->header_set('References', $eml->header('Message-ID'));
@@ -28,9 +29,9 @@ test_lei(sub {
 	pop @$res;
 	my %m = map { $_->{'m'} => $_ } @$res;
 	is_deeply($m{'testmessage@example.com'}->{kw}, ['seen'],
-		'flag set in direct hit');
-	'TODO' or is_deeply($m{'a-reply@miss'}->{kw}, ['draft'],
-		'flag set in thread hit');
+		'flag set in direct hit') or diag explain($res);
+	is_deeply($m{'a-reply@miss'}->{kw}, ['draft'],
+		'flag set in thread hit') or diag explain($res);
 
 	lei_ok qw(q -t -t m:testmessage@example.com);
 	$res = json_utf8->decode($lei_out);
@@ -38,9 +39,9 @@ test_lei(sub {
 	pop @$res;
 	%m = map { $_->{'m'} => $_ } @$res;
 	is_deeply($m{'testmessage@example.com'}->{kw}, ['flagged', 'seen'],
-		'flagged set in direct hit');
-	'TODO' or is_deeply($m{'testmessage@example.com'}->{kw}, ['draft'],
-		'flagged set in direct hit');
+		'flagged set in direct hit') or diag explain($res);
+	is_deeply($m{'a-reply@miss'}->{kw}, ['draft'],
+		'set in thread hit') or diag explain($res);
 	lei_ok qw(q -tt m:testmessage@example.com --only), "$ro_home/t2";
 	$res = json_utf8->decode($lei_out);
 	is_deeply($res->[0]->{kw}, [ qw(flagged seen) ],

^ permalink raw reply related	[relevance 56%]

* is "lei mark" a good name?
@ 2021-03-25  5:22 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-03-25  5:22 UTC (permalink / raw)
  To: meta

It can set/unset volatile metadata for any number of messages.

"Volatile metadata" means "labels" (aka "mailboxes" in
JMAP-speak) and "keywords" (seen|flagged|answered|...),
aka "flags" in IMAP/Maildir-speak.

	"lei mark +kw:seen"		# makes sense

	"lei mark +L:some-folder-name"	# might sound odd...


AFAIK, notmuch uses "notmuch tag" which combines both labels
and keywords into one thing: "tags".  But I'm also not a
notmuch user...

Would "lei tag" be better?

Anything else?

^ permalink raw reply	[relevance 71%]

-- pct% links below jump to the message on this page, permalinks otherwise --
2020-12-18 12:09     [PATCH 00/26] lei: basic UI + IPC work Eric Wong
2020-12-18 12:09     ` [PATCH 02/26] lei: proposed command-listing and options Eric Wong
2021-02-18 20:42 70%   ` lei q --save-as=... requires too much thinking Eric Wong
2021-02-01  5:57     [PATCH 0/2] doc: initial lei manpages Kyle Meyer
2021-02-01  5:57     ` [PATCH 1/2] doc: start manpages for lei commands Kyle Meyer
2021-02-07 19:58 90%   ` lei q --output vs --mfolder [was: [PATCH 1/2] doc: start manpages for lei commands] Eric Wong
2021-02-07 20:33 90%     ` Kyle Meyer
2021-02-07 20:59 90%       ` Eric Wong
2021-02-07 21:47 90%         ` Kyle Meyer
2021-02-07 21:55 90%           ` Eric Wong
2021-02-07  8:51 63% [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
2021-02-07  8:51 37% ` [PATCH 03/19] lei add-external: handle interrupts with --mirror Eric Wong
2021-02-07  8:51 50% ` [PATCH 12/19] lei: more consistent IPC exit and error handling Eric Wong
2021-02-07  8:51 56% ` [PATCH 13/19] lei: remove --mua-cmd alias for --mua Eric Wong
2021-02-07  8:51 41% ` [PATCH 14/19] lei: replace --thread with --threads Eric Wong
2021-02-07  8:51 33% ` [PATCH 15/19] lei q: improve remote mboxrd UX Eric Wong
2021-02-07  8:51 65% ` [PATCH 16/19] lei q: SIGWINCH process group with the terminal Eric Wong
2021-02-07  8:51 43% ` [PATCH 17/19] lei import: support Maildirs Eric Wong
2021-02-07 10:40 71% ` [PATCH 21/19] lei q: fix arbitrary --mua command handling Eric Wong
2021-02-08  8:49 71% lei q --remote-if-local-missing ? Eric Wong
2021-02-08  9:05 63% [PATCH 00/13] lei approxidate, startup fix, --alert Eric Wong
2021-02-08  9:05 32% ` [PATCHv2 01/13] lei q: improve remote mboxrd UX + MUA Eric Wong
2021-02-08  9:05 64% ` [PATCH 03/13] lei q: SIGWINCH process group with the terminal Eric Wong
2021-02-08  9:05 50% ` [PATCH 04/13] lei q: support --alert=CMD for early MUA users Eric Wong
2021-02-08  9:05 71% ` [PATCH 07/13] lei: start_pager: drop COLUMNS default Eric Wong
2021-02-08  9:05 56% ` [PATCH 08/13] lei: avoid racing on unlink + bind + listen Eric Wong
2021-02-08  9:05 68% ` [PATCH 09/13] lei: drop BSD::Resource usage Eric Wong
2021-02-08  9:05 42% ` [PATCH 11/13] lei q: use git approxidate with d:, dt: and rt: ranges Eric Wong
2021-02-09  8:09     [PATCH 00/11] Maildir code consolidation, test updates Eric Wong
2021-02-09  8:09 36% ` [PATCH 05/11] lei: split out MdirReader package, lazy-require earlier Eric Wong
2021-02-09  8:09 64% ` [PATCH 08/11] lei q: prefix --alert ops with ':' instead of '-' Eric Wong
2021-02-09  8:09 67% ` [PATCH 10/11] lei: replace "I:"-prefixed info messages with "#" Eric Wong
2021-02-09  8:09 81% ` [PATCH 11/11] tests|lei: fixes for TEST_RUN_MODE=0 and lei oneshot Eric Wong
2021-02-10  7:07 71% [PATCH 0/6] more lei stuffs Eric Wong
2021-02-10  7:07 47% ` [PATCH 1/6] lei *external: glob improvements, ls-external filtering Eric Wong
2021-02-10  7:07 71% ` [PATCH 3/6] test_common: support lei-daemon only testing Eric Wong
2021-02-10  7:07 49% ` [PATCH 4/6] lei ls-external: support --local and --remote Eric Wong
2021-02-10  7:07 71% ` [PATCH 5/6] lei: note some TODO items (curl, externals) Eric Wong
2021-02-10 19:57 70% [PATCH 0/2] WWW + "lei q --stdin": support git approxidate Eric Wong
2021-02-10 19:57 37% ` [PATCH 1/2] search: use git approxidate in WWW and "lei q --stdin" Eric Wong
2021-02-12  4:34 71% ` [PATCH 0/2] WWW + "lei q --stdin": support git approxidate Kyle Meyer
2021-02-11  4:04 71% [PATCH 0/4] doc: lei manpages, round 2 Kyle Meyer
2021-02-11  4:04 71% ` [PATCH 1/4] doc: lei q: use 'mfolder' as --output placeholder Kyle Meyer
2021-02-11  4:04 65% ` [PATCH 2/4] doc: lei: prefer 'location' and 'dirname' Kyle Meyer
2021-02-11  4:04 52% ` [PATCH 3/4] doc: add lei-import(1) Kyle Meyer
2021-02-11  4:04 44% ` [PATCH 4/4] doc: lei: update manpages Kyle Meyer
2021-02-11  5:08 71% ` [PATCH 0/4] doc: lei manpages, round 2 Eric Wong
2021-02-15  7:43 71% [PATCH] lei: fail_handler: use correct exit code Eric Wong
2021-02-17  4:40 71% does "lei q" --format/-f need to exist? Eric Wong
2021-02-18  5:28 71% ` Kyle Meyer
2021-02-18 12:07 71%   ` Eric Wong
2021-02-19  3:10 71%     ` Kyle Meyer
2021-02-19 11:13 71%       ` Eric Wong
2021-02-19 13:47 71%         ` Kyle Meyer
2021-02-19 19:06 71%         ` Eric Wong
2021-02-20  7:12 71%           ` Kyle Meyer
2021-02-20  8:07 71%             ` Eric Wong
2021-02-23  3:45 51%               ` [PATCH] doc: lei: favor "-o format:$PATHNAME" over "-f" Kyle Meyer
2021-02-23  6:03 71%                 ` Eric Wong
2021-02-17 10:06 64% [PATCH 00/11] lei IMAP read support Eric Wong
2021-02-17 10:06 71% ` [PATCH 01/11] lei: bless config Eric Wong
2021-02-17 10:07 55% ` [PATCH 04/11] lei import: start rearranging code for IMAP support Eric Wong
2021-02-17 10:07 85% ` [PATCH 05/11] lei import: move check_input_format to lei Eric Wong
2021-02-17 10:07 18% ` [PATCH 08/11] lei convert: mail format conversion sub-command Eric Wong
2021-02-17 10:53 71%   ` Eric Wong
2021-02-18 11:06 69%     ` [PATCHv2 0/4] lei IMAP support take #2 Eric Wong
2021-02-18 11:06 18%       ` [PATCHv2 1/4] lei convert: mail format conversion sub-command Eric Wong
2021-02-18 20:22 68%         ` [PATCHv3 0/4] lei convert IMAP support Eric Wong
2021-02-18 20:22 18%         ` [PATCHv3 1/4] lei convert: mail format conversion sub-command Eric Wong
2021-02-18 20:22 37%         ` [PATCHv3 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support Eric Wong
2021-02-18 20:22 47%         ` [PATCHv3 3/4] lei: consolidate the bulk of the IPC code Eric Wong
2021-02-18 20:22 63%         ` [PATCHv3 4/4] lei: check for IMAP auth errors Eric Wong
2021-02-18 11:06 37%       ` [PATCHv2 2/4] lei import: add IMAP and (maildir|mbox*):$PATHNAME support Eric Wong
2021-02-18 11:06 47%       ` [PATCH (resend) 3/4] lei: consolidate the bulk of the IPC code Eric Wong
2021-02-18 11:06 61%       ` [PATCHv2 4/4] lei: check for IMAP auth errors Eric Wong
2021-02-17 10:07 37% ` [PATCH 09/11] lei import: add IMAP, (maildir|mbox*):$PATHNAME support Eric Wong
2021-02-17 10:07 47% ` [PATCH 10/11] lei: consolidate the bulk of the IPC code Eric Wong
2021-02-17 10:07 62% ` [PATCH 11/11] lei: check for IMAP auth errors Eric Wong
2021-02-18 12:27 71% [PATCH] lei: completion: bash: generalize nospace usage Eric Wong
2021-02-25 10:33 71% ` better "compopt -o nospace" ideas? [was: lei: completion: bash: generalize nospace usage] Eric Wong
2021-02-18 20:28 99% lei stuff that should be in a lei(1) or lei-overview(7) Eric Wong
2021-02-22  3:42 99% ` Eric Wong
2021-02-19 12:09 69% [PATCH 0/6] lei: start working on IMAP writes Eric Wong
2021-02-19 12:09 63% ` [PATCH 1/6] t/lei-externals: favor "-o format:$PATHNAME" over "-f" Eric Wong
2021-02-21  7:41 69% [PATCH 0/7] "lei q -o imaps://..." support Eric Wong
2021-02-21  7:41 34% ` [PATCH 2/7] lei q: support IMAP/IMAPS --output destinations Eric Wong
2021-02-21  7:41 58% ` [PATCH 4/7] lei q: move augment into lei2mail workers Eric Wong
2021-02-21 18:28 71% [PATCH] lei-daemon: prefer graceful shutdowns Eric Wong
2021-02-21 19:59 37% [PATCH] t/lei*: drop $lei->(...) sub Eric Wong
2021-02-21 20:42 71% ` [SQUASH 2/1] t/lei-externals: squash fix Eric Wong
2021-02-22  5:37 71% lei: accessing blob after import requires daemon restart Kyle Meyer
2021-02-22 11:21 69% [PATCH 00/10] lei: avoid wasting IMAP connections Eric Wong
2021-02-22 11:22     ` [PATCH 01/10] lei_auth: rename {nrd} field to {net} for clarity Eric Wong
2021-02-22 11:22 68%   ` [PATCH 02/10] lei: keep client {sock} in short-lived workers Eric Wong
2021-02-22 11:22 65%   ` [PATCH 03/10] lei: _lei_cfg: return empty hashref if unconfigured Eric Wong
2021-02-22 11:22 60%   ` [PATCH 04/10] lei convert: auth directly from worker process Eric Wong
2021-02-22 11:22 53%   ` [PATCH 05/10] lei import: no separate auth worker Eric Wong
2021-02-22 11:22 40%   ` [PATCH 07/10] lei q: reduce wasted IMAP connection for auth Eric Wong
2021-02-22 11:22 71%   ` [PATCH 09/10] lei convert: inline convert_start Eric Wong
2021-02-22 21:38     [PATCH 0/2] fix Perl 5.10.1 compatibility Eric Wong
2021-02-22 21:38 53% ` [PATCH 2/2] lei: avoid needless env passing to subcommands Eric Wong
2021-02-23 10:01 71% [PATCH 0/3] lei -C DIR and more Eric Wong
2021-02-23 10:01 33% ` [PATCH 1/3] lei: support "-C" to chdir in all sub commands Eric Wong
2021-02-23 10:01 71% ` [PATCH 2/3] lei q: reduce default lei2mail workers Eric Wong
2021-02-24 11:31 71% [PATCH 0/4] lei <import|convert> nntp:// Eric Wong
2021-02-24 11:31 15% ` [PATCH 2/4] lei <import|convert>: support NNTP sources Eric Wong
2021-02-24 20:49 70% lei: per-message keywords and externals Eric Wong
2021-02-26  9:26 71% ` Eric Wong
2021-03-02  9:28 71%   ` Eric Wong
2021-02-24 23:37 71% [PATCH 0/2] "lei q" remote memoization Eric Wong
2021-02-24 23:37 70% ` [PATCH 2/2] lei q: auto-memoize remote messages into lei/store Eric Wong
2021-02-25 10:11 68% [PATCH 0/4] lei: fleshing out some existing features Eric Wong
2021-02-25 10:11 45% ` [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs Eric Wong
2021-02-25 10:11 44% ` [PATCH 2/4] lei import: use --in-format/-F for consistency Eric Wong
2021-02-25 10:11 44% ` [PATCH 4/4] lei q: -tt marks direct hits as "flagged" Eric Wong
2021-02-26  3:38 71%   ` Kyle Meyer
2021-02-26  4:13 71%     ` Eric Wong
2021-02-26  4:38 71%       ` Kyle Meyer
2021-02-26  9:41 71% [PATCH 0/5] lei mbox locking Eric Wong
2021-02-26  9:41 71% ` [PATCH 1/5] lei: style fix for $oldset declaration Eric Wong
2021-02-26  9:41 36% ` [PATCH 2/5] lei q: support mbox locking by default Eric Wong
2021-02-26  9:41 54% ` [PATCH 3/5] lei import|convert: support mbox locking on reads Eric Wong
2021-02-27 18:03 71% [PATCH 0/3] doc: lei manpages, round 3 Kyle Meyer
2021-02-27 18:03 48% ` [PATCH 1/3] doc: lei: update manpages Kyle Meyer
2021-02-27 18:03 71% ` [PATCH 2/3] doc: lei-import: drop markup of "stdin" Kyle Meyer
2021-02-27 18:03 70% ` [PATCH 3/3] doc: lei-overview: add performance and bash completion sections Kyle Meyer
2021-02-27 20:20 71% ` [PATCH 0/3] doc: lei manpages, round 3 Eric Wong
2021-02-28 12:25 71% [PATCH 0/3] lei p2q (patch-to-query) Eric Wong
2021-02-28 12:25 51% ` [PATCH 1/3] lei p2q: patch-to-query generator for "lei q --stdin" Eric Wong
2021-02-28 21:40 90%   ` Kyle Meyer
2021-03-01  5:47 58%     ` [PATCH 4/3] lei p2q: fix /dev/null filenames, fix phrase quoting rules Eric Wong
2021-02-28 12:25 64% ` [PATCH 2/3] lei q: fix "-" shortcut for --stdin Eric Wong
2021-02-28 12:25 52% ` [PATCH 3/3] lei q: improve early aborts w/ remote externals Eric Wong
2021-03-02 23:04 71% read-write JMAP for lei? Eric Wong
2021-03-03  3:53 71% should lei attempt to index mail outside of git? Eric Wong
2021-03-03 13:48 71% [PATCH 0/4] lei q: avoiding accidental data loss Eric Wong
2021-03-03 13:48 55% ` [PATCH 3/4] lei: use maildir_each_eml in more places Eric Wong
2021-03-03 13:48 33% ` [PATCH 4/4] lei q: import flags when clobbering/augmenting Maildirs Eric Wong
2021-03-03 22:29 71%   ` RFH: --import-augment naming [was: lei q: import flags when clobbering/augmenting] Eric Wong
2021-03-04  2:39 71%     ` Kyle Meyer
2021-03-04  3:31 71%       ` Eric Wong
2021-03-04  4:10 71%         ` Kyle Meyer
2021-03-04  9:03 71% [PATCH 0/6] lei q --import-augment => --import-before; mbox + IMAP Eric Wong
2021-03-04  9:03 43% ` [PATCH 1/6] lei q: support --import-augment for IMAP Eric Wong
2021-03-04  9:03 71% ` [PATCH 2/6] lei: dclose: do not EPOLL_CTL_DEL w/o event_init Eric Wong
2021-03-04  9:03 43% ` [PATCH 4/6] lei q: --import-augment for mbox and mbox.gz Eric Wong
2021-03-04  9:03 47% ` [PATCH 6/6] lei q: s/import-augment/import-before/g Eric Wong
2021-03-04 18:43 71% angle brackets in "m:" and "refs:" in "lei q" JSON Eric Wong
2021-03-06 18:26 71% ` Kyle Meyer
2021-03-08  8:08 53%   ` [PATCH] lei q: remove angle brackets around Message-IDs Eric Wong
2021-03-05  1:38 51% [PATCH] lei q: fix --import-before default and FIFO output Eric Wong
2021-03-05  2:22 63% "lei q" vs mairix notes Eric Wong
2021-03-05  4:03 66% [PATCH] lei q: one -t shouldn't set `flagged' on external mail Eric Wong
2021-03-05 22:20 64% release timelines (-extindex, JMAP, lei) Eric Wong
2021-03-06 18:31 71% ` Kyle Meyer
2021-03-08  2:54 71%   ` Eric Wong
2021-03-08 21:33 71% ` Konstantin Ryabitsev
2021-03-08 22:16 71%   ` Eric Wong
2021-03-10 13:23     [PATCH 0/5] no trash, glossary doc Eric Wong
2021-03-10 13:23 67% ` [PATCH 3/5] lei import: simplify Maildir handling Eric Wong
2021-03-10 13:23 68% ` [PATCH 4/5] lei import: skip trashed Maildir messages Eric Wong
2021-03-11 22:43 71% final "null" in "lei q" JSON output Eric Wong
2021-03-12  4:22 71% ` Kyle Meyer
2021-03-12 10:39 71% [PATCH 0/3] lei CLI option updates Eric Wong
2021-03-12 10:39 71% ` [PATCH 1/3] lei: add help + completion for --no-external Eric Wong
2021-03-12 10:39 51% ` [PATCH 2/3] lei: rearrange OPT_DESC and drop some TBD switches Eric Wong
2021-03-12 10:39 62% ` [PATCH 3/3] lei q: mbox*: disable changing parallelism, add --rsyncable Eric Wong
2021-03-14 11:12 40% [PATCH] lei q: do not import unnecessarily from externals Eric Wong
2021-03-15  9:32 71% [PATCH] lei: reuse LeiStore object on config changes Eric Wong
2021-03-19 12:35     [PATCH 0/2] newline rejection for new stuff Eric Wong
2021-03-19 12:35 53% ` [PATCH 1/2] lei: disallow "\n" in local externals paths Eric Wong
2021-03-19 12:41 71% [PATCH] t/lei-externals: add diagnostic for warning Eric Wong
2021-03-19 22:38 59% [PATCH] lei q: -I/--include overrides --no-(external|local|remote) Eric Wong
2021-03-20 10:04 67% [PATCH 0/5] lei: preserve keywords across queries Eric Wong
2021-03-20 10:04 33% ` [PATCH 1/5] lei: All Local Externals: bare git dir for alternates Eric Wong
2021-03-20 10:04 25% ` [PATCH 2/5] lei q: support vmd for external-only messages Eric Wong
2021-03-20 10:04 64% ` [PATCH 3/5] lei q: put keywords on one line in --pretty output Eric Wong
2021-03-20 10:04 59% ` [PATCH 5/5] lei: tie ALE lifetime to config file Eric Wong
2021-03-20 12:40 63% [PATCH] lei q: trim JSON output Eric Wong
2021-03-21  9:50 71% [PATCH 0/3] lei import fix, other fixes Eric Wong
2021-03-21  9:50 36% ` [PATCH 1/3] lei import: vivify external-only messages Eric Wong
2021-03-21  9:50 57% ` [PATCH 2/3] lei q: fix warning on remote imports Eric Wong
2021-03-21  9:50 52% ` [PATCH 3/3] lei: fix some warnings in tests Eric Wong
2021-03-21 11:24 59% [PATCH] lei: simplify lazy-loading Eric Wong
2021-03-22  7:53 70% [PATCH 0/8] lei input handling improvements Eric Wong
2021-03-22  7:53 33% ` [PATCH 1/8] lei: support -c <name>=<value> to overrides Eric Wong
2021-03-22  7:53 38% ` [PATCH 3/8] lei: share input code between convert and import Eric Wong
2021-03-22  7:53 64% ` [PATCH 4/8] lei: simplify workers_start and callers Eric Wong
2021-03-22  7:54 55% ` [PATCH 8/8] lei import: ignore Status headers in "eml" messages Eric Wong
2021-03-23  5:02 71% [PATCH 0/2] lei mark: volatile metadata tagging Eric Wong
2021-03-23  5:02 29% ` [PATCH 1/2] lei mark: command for (un)setting keywords and labels Eric Wong
2021-03-23  5:02 56% ` [PATCH 2/2] lei mark: add support for (bash) completion Eric Wong
2021-03-23  6:51 56% [PATCH] lei: hide *_atfork_child from command-line Eric Wong
2021-03-23 11:48 71% [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
2021-03-23 11:48 70% ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
2021-03-23 11:48 71% ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
2021-03-23 11:48 47% ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
2021-03-24  9:23 70% [PATCH 0/9] lei: various corner case leak fixes Eric Wong
2021-03-24  9:23 71% ` [PATCH 3/9] lei: drop circular reference in lei_store process Eric Wong
2021-03-24  9:23 71% ` [PATCH 4/9] lei: update {3} after -C chdirs Eric Wong
2021-03-24  9:23 61% ` [PATCH 5/9] lei: clean up pkt_op consumer on exception, too Eric Wong
2021-03-24  9:23 58% ` [PATCH 9/9] lei-daemon: do not leak FDs on bogus requests Eric Wong
2021-03-25  4:20 68% [PATCH 00/10] lei testing improvements Eric Wong
2021-03-25  4:20 71% ` [PATCH 02/10] lei: janky $PATH2CFG garbage collection Eric Wong
2021-03-25  4:20 71% ` [PATCH 04/10] lei add-external: do not initialize writable store Eric Wong
2021-03-25  4:20 57% ` [PATCH 07/10] tests: "check-run" uses persistent lei daemon Eric Wong
2021-03-25  4:20 52% ` [PATCH 08/10] lei import: force store, improve test diagnostics Eric Wong
2021-03-25  4:20 56% ` [PATCH 10/10] t/lei: add more diagnostics for failures Eric Wong
2021-03-25  5:22 71% is "lei mark" a good name? Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).