user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
* [PATCH 00/12] lei: fix various annoyances
@ 2021-09-21  7:41 Eric Wong
  2021-09-21  7:41 ` [PATCH 01/12] lei inspect: convert to WQ worker Eric Wong
                   ` (12 more replies)
  0 siblings, 13 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

Eric Wong (12):
  lei inspect: convert to WQ worker
  lei inspect: support NNTP URLs
  lei_mail_sync: account for non-unique cases
  lei: simplify internal arg2folder usage
  lei lcat: use single queue for ordering
  doc: lei-security: section for WIP auth methods
  lei lcat: support NNTP URLs
  lei: various completion improvements
  lei q: show progress on >1s preparation phase
  search: drop reopen retry message
  lei q: update messages to reflect --save default
  lei q: improve --limit behavior and progress

 Documentation/lei-q.pod               |   6 +-
 Documentation/lei-security.pod        |   8 ++
 lib/PublicInbox/LEI.pm                |   2 +-
 lib/PublicInbox/LeiExportKw.pm        |  12 +--
 lib/PublicInbox/LeiExternal.pm        |   5 ++
 lib/PublicInbox/LeiForgetMailSync.pm  |   9 +--
 lib/PublicInbox/LeiImport.pm          |  20 +++--
 lib/PublicInbox/LeiImportKw.pm        |  12 ++-
 lib/PublicInbox/LeiInspect.pm         | 103 +++++++++++++++++---------
 lib/PublicInbox/LeiLcat.pm            |  91 +++++++++++++----------
 lib/PublicInbox/LeiLsMailSource.pm    |  11 +--
 lib/PublicInbox/LeiMailSync.pm        | 102 +++++++++++++++++--------
 lib/PublicInbox/LeiNoteEvent.pm       |   5 +-
 lib/PublicInbox/LeiQuery.pm           |   9 ++-
 lib/PublicInbox/LeiRefreshMailSync.pm |  18 +++--
 lib/PublicInbox/LeiSavedSearch.pm     |  22 +++---
 lib/PublicInbox/LeiTag.pm             |  14 ++--
 lib/PublicInbox/LeiToMail.pm          |  23 +++++-
 lib/PublicInbox/LeiUp.pm              |  25 +++----
 lib/PublicInbox/LeiXSearch.pm         |  36 ++++++---
 lib/PublicInbox/Search.pm             |   1 -
 lib/PublicInbox/SharedKV.pm           |  19 +++--
 lib/PublicInbox/TestCommon.pm         |  11 ++-
 t/lei-import-nntp.t                   |  44 +++++++++++
 t/lei-q-save.t                        |   6 +-
 t/lei-watch.t                         |   2 +-
 26 files changed, 414 insertions(+), 202 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 01/12] lei inspect: convert to WQ worker
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 02/12] lei inspect: support NNTP URLs Eric Wong
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

Xapian and SQLite access can be slow when a DB is large and/or
on high-latency storage.
---
 lib/PublicInbox/LeiInspect.pm | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index f06cea61..48da826b 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -8,6 +8,7 @@
 package PublicInbox::LeiInspect;
 use strict;
 use v5.10.1;
+use parent qw(PublicInbox::IPC);
 use PublicInbox::Config;
 use PublicInbox::MID qw(mids);
 
@@ -184,9 +185,11 @@ sub inspect1 ($$$) {
 	1;
 }
 
-sub _inspect_argv ($$) {
-	my ($lei, $argv) = @_;
+sub inspect_argv { # via wq_do
+	my ($self) = @_;
+	my ($lei, $argv) = delete @$self{qw(lei argv)};
 	my $multi = scalar(@$argv) > 1;
+	$lei->{1}->autoflush(0);
 	$lei->out('[') if $multi;
 	while (defined(my $x = shift @$argv)) {
 		inspect1($lei, $x, scalar(@$argv)) or return;
@@ -194,6 +197,16 @@ sub _inspect_argv ($$) {
 	$lei->out(']') if $multi;
 }
 
+sub inspect_start ($$) {
+	my ($lei, $argv) = @_;
+	my $self = bless { lei => $lei, argv => $argv }, __PACKAGE__;
+	my ($op_c, $ops) = $lei->workers_start($self, 1);
+	$lei->{wq1} = $self;
+	$lei->wait_wq_events($op_c, $ops);
+	$self->wq_do('inspect_argv');
+	$self->wq_close(1);
+}
+
 sub ins_add { # InputPipe->consume callback
 	my ($lei) = @_; # $_[1] = $rbuf
 	if (defined $_[1]) {
@@ -201,7 +214,7 @@ sub ins_add { # InputPipe->consume callback
 			my $str = delete $lei->{istr};
 			$str =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s;
 			my $eml = PublicInbox::Eml->new(\$str);
-			_inspect_argv($lei, [
+			inspect_start($lei, [
 				'blob:'.$lei->git_oid($eml)->hexdigest,
 				map { "mid:$_" } @{mids($eml)} ]);
 		};
@@ -218,20 +231,18 @@ sub lei_inspect {
 		my $sto = $lei->_lei_store;
 		$sto ? $sto->search : undef;
 	} : undef;
-	if ($lei->{opt}->{pretty} || -t $lei->{1}) {
-		$lei->{json}->pretty(1)->indent(2);
-	}
-	$lei->start_pager if -t $lei->{1};
-	$lei->{1}->autoflush(0);
+	my $isatty = -t $lei->{1};
+	$lei->{json}->pretty(1)->indent(2) if $lei->{opt}->{pretty} || $isatty;
+	$lei->start_pager if $isatty;
 	if ($lei->{opt}->{stdin}) {
 		return $lei->fail(<<'') if @argv;
 no args allowed on command-line with --stdin
 
 		require PublicInbox::InputPipe;
 		PublicInbox::InputPipe::consume($lei->{0}, \&ins_add, $lei);
-		return;
+	} else {
+		inspect_start($lei, \@argv);
 	}
-	_inspect_argv($lei, \@argv);
 }
 
 sub _complete_inspect {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 02/12] lei inspect: support NNTP URLs
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
  2021-09-21  7:41 ` [PATCH 01/12] lei inspect: convert to WQ worker Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 03/12] lei_mail_sync: account for non-unique cases Eric Wong
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

No reason not to support them, since there's more
public-inbox-nntpd instances than -imapd instances,
currently.
---
 lib/PublicInbox/LeiInspect.pm  | 48 +++++++++++++++++++++++++++-------
 lib/PublicInbox/LeiMailSync.pm | 40 +++++++++++++++++++++++++++-
 lib/PublicInbox/TestCommon.pm  | 11 ++++++--
 t/lei-import-nntp.t            | 21 +++++++++++++++
 4 files changed, 108 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 48da826b..ab2c98d9 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -11,6 +11,7 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Config;
 use PublicInbox::MID qw(mids);
+use PublicInbox::NetReader qw(imap_uri nntp_uri);
 
 sub inspect_blob ($$) {
 	my ($lei, $oidhex) = @_;
@@ -32,13 +33,33 @@ sub inspect_imap_uid ($$) {
 	my $ent = {};
 	my $lms = $lei->lms or return $ent;
 	my $oidhex = $lms->imap_oid($lei, $uid_uri);
-	if (ref(my $err = $oidhex)) { # art2folder error
+	if (ref(my $err = $oidhex)) { # arg2folder error
 		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
 	}
 	$ent->{$$uid_uri} = $oidhex;
 	$ent;
 }
 
+sub inspect_nntp_range {
+	my ($lei, $uri) = @_;
+	my ($ng, $beg, $end) = $uri->group;
+	$uri = $uri->clone;
+	$uri->group($ng);
+	my $ent = {};
+	my $ret = { "$uri" => $ent };
+	my $lms = $lei->lms or return $ret;
+	my $err = $lms->arg2folder($lei, my $folders = [ $$uri ]);
+	if ($err) {
+		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
+	}
+	$end //= $beg;
+	for my $art ($beg..$end) {
+		my $oidbin = $lms->imap_oidbin($folders->[0], $art);
+		$ent->{$art} = $oidbin ? unpack('H*', $oidbin) : undef;
+	}
+	$ret;
+}
+
 sub inspect_sync_folder ($$) {
 	my ($lei, $folder) = @_;
 	my $ent = {};
@@ -161,14 +182,6 @@ sub inspect1 ($$$) {
 	my $ent;
 	if ($item =~ /\Ablob:(.+)/) {
 		$ent = inspect_blob($lei, $1);
-	} elsif ($item =~ m!\Aimaps?://!i) {
-		require PublicInbox::URIimap;
-		my $uri = PublicInbox::URIimap->new($item);
-		if (defined($uri->uid)) {
-			$ent = inspect_imap_uid($lei, $uri);
-		} else {
-			$ent = inspect_sync_folder($lei, $item);
-		}
 	} elsif ($item =~ m!\A(?:maildir|mh):!i || -d $item) {
 		$ent = inspect_sync_folder($lei, $item);
 	} elsif ($item =~ m!\Adocid:([0-9]+)\z!) {
@@ -177,6 +190,23 @@ sub inspect1 ($$$) {
 		$ent = inspect_num($lei, $1 + 0);
 	} elsif ($item =~ m!\A(?:mid|m):(.+)\z!) {
 		$ent = inspect_mid($lei, $1);
+	} elsif (my $iuri = imap_uri($item)) {
+		if (defined($iuri->uid)) {
+			$ent = inspect_imap_uid($lei, $iuri);
+		} else {
+			$ent = inspect_sync_folder($lei, $item);
+		}
+	} elsif (my $nuri = nntp_uri($item)) {
+		if (defined(my $mid = $nuri->message)) {
+			$ent = inspect_mid($lei, $mid);
+		} else {
+			my ($group, $beg, $end) = $nuri->group;
+			if (defined($beg)) {
+				$ent = inspect_nntp_range($lei, $nuri);
+			} else {
+				$ent = inspect_sync_folder($lei, $item);
+			}
+		}
 	} else { # TODO: more things
 		return $lei->fail("$item not understood");
 	}
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index f185b585..d9b9e117 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -247,12 +247,14 @@ sub location_stats {
 SELECT COUNT(name) FROM blob2name WHERE fid = ?
 
 	$ret->{'name.count'} = $row if $row;
+	my $ntype = ($folder =~ m!\A(?:nntps?|s?news)://!i) ? 'article' :
+		(($folder =~ m!\Aimaps?://!i) ? 'uid' : "TODO<$folder>");
 	for my $op (qw(count min max)) {
 		($row) = $dbh->selectrow_array(<<"", undef, $fid);
 SELECT $op(uid) FROM blob2num WHERE fid = ?
 
 		$row or last;
-		$ret->{"uid.$op"} = $row;
+		$ret->{"$ntype.$op"} = $row;
 	}
 	$ret;
 }
@@ -369,6 +371,30 @@ sub match_imap_url {
 			"E: `$url' is ambiguous:\n\t".join("\n\t", @match)."\n";
 }
 
+sub match_nntp_url ($$$) {
+	my ($self, $url, $all) = @_; # $all = [ $lms->folders ];
+	$all //= [ $self->folders ];
+	require PublicInbox::URInntps;
+	my $want = PublicInbox::URInntps->new($url)->canonical;
+	my ($s, $h, $p) = ($want->scheme, $want->host, $want->port);
+	my $ng = $want->group; # force scalar (no article ranges)
+	my @uri = map { PublicInbox::URInntps->new($_)->canonical }
+		grep(m!\A\Q$s\E://.*?\Q$h\E\b.*?/\Q$ng\E\b!, @$all);
+	my @match;
+	for my $x (@uri) {
+		next if $x->group ne $ng || $x->host ne $h || $x->port != $p;
+		# maybe user was forgotten on CLI:
+		if (defined($x->userinfo) && !defined($want->userinfo)) {
+			push @match, $x;
+		} elsif (($x->userinfo//"\0") eq ($want->userinfo//"\0")) {
+			push @match, $x;
+		}
+	}
+	return @match if wantarray;
+	scalar(@match) <= 1 ? $match[0] :
+			"E: `$url' is ambiguous:\n\t".join("\n\t", @match)."\n";
+}
+
 # returns undef on failure, number on success
 sub group2folders {
 	my ($self, $lei, $all, $folders) = @_;
@@ -428,6 +454,18 @@ sub arg2folder {
 				$_ = $$res;
 				push(@{$err->{qerr}}, <<EOM);
 # using `$res' instead of `$orig'
+EOM
+			} else {
+				$lei->err($res) if defined $res;
+				push @no, $orig;
+			}
+		} elsif (m!\A(?:nntps?|s?news)://!i) {
+			my $orig = $_;
+			my $res = match_nntp_url($self, $orig, \@all);
+			if (ref $res) {
+				$_ = $$res;
+				push(@{$err->{qerr}}, <<EOM);
+# using `$res' instead of `$orig'
 EOM
 			} else {
 				$lei->err($res) if defined $res;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 0ee4b228..9e152394 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -18,7 +18,7 @@ BEGIN {
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
 		have_xapian_compact json_utf8 setup_public_inboxes create_inbox
 		tcp_host_port test_lei lei lei_ok $lei_out $lei_err $lei_opt
-		test_httpd xbail require_cmd);
+		test_httpd xbail require_cmd is_xdeeply);
 	require Test::More;
 	my @methods = grep(!/\W/, @Test::More::EXPORT);
 	eval(join('', map { "*$_=\\&Test::More::$_;" } @methods));
@@ -520,6 +520,13 @@ sub json_utf8 () {
 	state $x = ref(PublicInbox::Config->json)->new->utf8->canonical;
 }
 
+sub is_xdeeply ($$$) {
+	my ($x, $y, $desc) = @_;
+	my $ok = is_deeply($x, $y, $desc);
+	diag explain([$x, '!=', $y]) if !$ok;
+	$ok;
+}
+
 sub test_lei {
 SKIP: {
 	my ($cb) = pop @_;
@@ -590,7 +597,7 @@ SKIP: {
 		my $f = "$daemon_xrd/lei/errors.log";
 		open my $fh, '<', $f or BAIL_OUT "$f: $!";
 		my @l = <$fh>;
-		is_deeply(\@l, [],
+		is_xdeeply(\@l, [],
 			"$t daemon XDG_RUNTIME_DIR/lei/errors.log empty");
 	}
 }; # SKIP if missing git 2.6+ || Xapian || SQLite || json
diff --git a/t/lei-import-nntp.t b/t/lei-import-nntp.t
index df0594d4..0b080781 100644
--- a/t/lei-import-nntp.t
+++ b/t/lei-import-nntp.t
@@ -49,6 +49,15 @@ test_lei({ tmpdir => $tmpdir }, sub {
 
 	my $end = $high - 1;
 	lei_ok qw(import), "$url/$high";
+	lei_ok('inspect', $url); is_xdeeply(json_utf8->decode($lei_out), {
+		$url => { 'article.count' => 1,
+			  'article.min' => $high,
+			  'article.max' => $high, }
+	}, 'inspect output for URL after single message') or diag $lei_out;
+	lei_ok('inspect', "$url/$high");
+	my $x = json_utf8->decode($lei_out);
+	like($x->{$url}->{$high}, qr/\A[a-f0-9]{40,}\z/, 'inspect shows blob');
+
 	lei_ok 'ls-mail-sync';
 	is($lei_out, "$url\n", 'article number not stored as folder');
 	lei_ok qw(q z:0..); my $one = json_utf8->decode($lei_out);
@@ -57,6 +66,18 @@ test_lei({ tmpdir => $tmpdir }, sub {
 
 	local $ENV{HOME} = "$tmpdir/h3";
 	lei_ok qw(import), "$url/$low-$end";
+	lei_ok('inspect', $url); is_xdeeply(json_utf8->decode($lei_out), {
+		$url => { 'article.count' => $end - $low + 1,
+			  'article.min' => $low,
+			  'article.max' => $end, }
+	}, 'inspect output for URL after range') or diag $lei_out;
+	lei_ok('inspect', "$url/$low-$end");
+	$x = json_utf8->decode($lei_out);
+	is_deeply([ ($low..$end) ], [ sort { $a <=> $b } keys %{$x->{$url}} ],
+		'inspect range shows range');
+	is(scalar(grep(/\A[a-f0-9]{40,}\z/, values %{$x->{$url}})),
+		$end - $low + 1, 'all values are git blobs');
+
 	lei_ok 'ls-mail-sync';
 	is($lei_out, "$url\n", 'article range not stored as folder');
 	lei_ok qw(q z:0..); my $start = json_utf8->decode($lei_out);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 03/12] lei_mail_sync: account for non-unique cases
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
  2021-09-21  7:41 ` [PATCH 01/12] lei inspect: convert to WQ worker Eric Wong
  2021-09-21  7:41 ` [PATCH 02/12] lei inspect: support NNTP URLs Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 04/12] lei: simplify internal arg2folder usage Eric Wong
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

NNTP servers, IMAP servers, and various MUAs may recycle
"unique" identifiers due to software bugs or careless BOFHs.
Warn about them, but always be prepared to account for them.
---
 lib/PublicInbox/LeiImport.pm          |  8 ++++++--
 lib/PublicInbox/LeiImportKw.pm        | 12 ++++++++----
 lib/PublicInbox/LeiInspect.pm         | 14 +++++++-------
 lib/PublicInbox/LeiLcat.pm            |  7 ++-----
 lib/PublicInbox/LeiMailSync.pm        | 20 ++++++++++----------
 lib/PublicInbox/LeiRefreshMailSync.pm |  2 +-
 6 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 40530914..3c30db8d 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -37,8 +37,12 @@ sub pmdir_cb { # called via wq_io_do from LeiPmdir->each_mdir_fn
 	substr($folder, 0, 0) = 'maildir:'; # add prefix
 	my $lse = $self->{lse} //= $self->{lei}->{sto}->search;
 	my $lms = $self->{-lms_ro} //= $self->{lei}->lms; # may be 0 or undef
-	my $oidbin = $lms ? $lms->name_oidbin($folder, $bn) : undef;
-	my @docids = defined($oidbin) ? $lse->over->oidbin_exists($oidbin) : ();
+	my @oidbin = $lms ? $lms->name_oidbin($folder, $bn) : ();
+	@oidbin > 1 and $self->{lei}->err("W: $folder/*/$$bn not unique:\n",
+				map { "\t".unpack('H*', $_)."\n" } @oidbin);
+	my %seen;
+	my @docids = sort { $a <=> $b } grep { !$seen{$_}++ }
+			map { $lse->over->oidbin_exists($_) } @oidbin;
 	my $vmd = $self->{-import_kw} ? { kw => $kw } : undef;
 	if (scalar @docids) {
 		$lse->kw_changed(undef, $kw, \@docids) or return;
diff --git a/lib/PublicInbox/LeiImportKw.pm b/lib/PublicInbox/LeiImportKw.pm
index 379101c2..21c93515 100644
--- a/lib/PublicInbox/LeiImportKw.pm
+++ b/lib/PublicInbox/LeiImportKw.pm
@@ -34,11 +34,15 @@ sub ipc_atfork_child {
 
 sub ck_update_kw { # via wq_io_do
 	my ($self, $url, $uid, $kw) = @_;
-	my $oidbin = $self->{-lms_ro}->imap_oidbin($url, $uid) // return;
-	my @docids = $self->{over}->oidbin_exists($oidbin) or return;
+	my @oidbin = $self->{-lms_ro}->num_oidbin($url, $uid);
+	my $uid_url = "$url/;UID=$uid";
+	@oidbin > 1 and $self->{lei}->err("W: $uid_url not unique:\n",
+				map { "\t".unpack('H*', $_)."\n" } @oidbin);
+	my %seen;
+	my @docids = sort { $a <=> $b } grep { !$seen{$_}++ }
+		map { $self->{over}->oidbin_exists($_) } @oidbin;
 	$self->{lse}->kw_changed(undef, $kw, \@docids) or return;
-	$self->{verbose} and
-		$self->{lei}->qerr('# '.unpack('H*', $oidbin)." => @$kw\n");
+	$self->{verbose} and $self->{lei}->qerr("# $uid_url => @$kw\n");
 	$self->{sto}->wq_do('set_eml_vmd', undef, { kw => $kw }, \@docids);
 }
 
diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index ab2c98d9..722ba5b2 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -32,11 +32,9 @@ sub inspect_imap_uid ($$) {
 	my ($lei, $uid_uri) = @_;
 	my $ent = {};
 	my $lms = $lei->lms or return $ent;
-	my $oidhex = $lms->imap_oid($lei, $uid_uri);
-	if (ref(my $err = $oidhex)) { # arg2folder error
-		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-	}
-	$ent->{$$uid_uri} = $oidhex;
+	my @oidhex = $lms->imap_oidhex($lei, $uid_uri);
+	$ent->{$$uid_uri} = @oidhex == 1 ? $oidhex[0] :
+			((@oidhex == 0) ? undef : \@oidhex);
 	$ent;
 }
 
@@ -54,8 +52,10 @@ sub inspect_nntp_range {
 	}
 	$end //= $beg;
 	for my $art ($beg..$end) {
-		my $oidbin = $lms->imap_oidbin($folders->[0], $art);
-		$ent->{$art} = $oidbin ? unpack('H*', $oidbin) : undef;
+		my @oidhex = map { unpack('H*', $_) }
+			$lms->num_oidbin($folders->[0], $art);
+		$ent->{$art} = @oidhex == 1 ? $oidhex[0] :
+				((@oidhex == 0) ? undef : \@oidhex);
 	}
 	$ret;
 }
diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index 1e54c3bf..8f8e83bc 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -32,11 +32,8 @@ sub lcat_imap_uri ($$) {
 	my $lms = $lei->lms or return;
 	# cf. LeiXsearch->lcat_dump
 	if (defined $uri->uid) {
-		my $oidhex = $lms->imap_oid($lei, $uri);
-		if (ref(my $err = $oidhex)) { # art2folder error
-			$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-		}
-		push @{$lei->{lcat_blob}}, $oidhex;
+		my @oidhex = $lms->imap_oidhex($lei, $uri);
+		push @{$lei->{lcat_blob}}, @oidhex;
 	} elsif (defined(my $fid = $lms->fid_for($$uri))) {
 		push @{$lei->{lcat_fid}}, $fid;
 	} else {
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index d9b9e117..3e725d30 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -66,6 +66,7 @@ CREATE TABLE IF NOT EXISTS blob2num (
 	oidbin VARBINARY NOT NULL,
 	fid INTEGER NOT NULL, /* folder ID */
 	uid INTEGER NOT NULL, /* NNTP article number, IMAP UID, MH number */
+	/* not UNIQUE(fid, uid), since we may have broken servers */
 	UNIQUE (oidbin, fid, uid)
 )
 
@@ -78,6 +79,7 @@ CREATE TABLE IF NOT EXISTS blob2name (
 	oidbin VARBINARY NOT NULL,
 	fid INTEGER NOT NULL, /* folder ID */
 	name VARBINARY NOT NULL, /* Maildir basename, JMAP blobId */
+	/* not UNIQUE(fid, name), since we may have broken software */
 	UNIQUE (oidbin, fid, name)
 )
 
@@ -522,14 +524,14 @@ EOM
 	}
 }
 
-sub imap_oidbin ($$$) {
-	my ($self, $url, $uid) = @_; # $url MUST have UIDVALIDITY
-	my $fid = $self->{fmap}->{$url} //= fid_for($self, $url) // return;
+sub num_oidbin ($$$) {
+	my ($self, $url, $uid) = @_; # $url MUST have UIDVALIDITY if IMAP
+	my $fid = $self->{fmap}->{$url} //= fid_for($self, $url) // return ();
 	my $sth = $self->{dbh}->prepare_cached(<<EOM, undef, 1);
-SELECT oidbin FROM blob2num WHERE fid = ? AND uid = ?
+SELECT oidbin FROM blob2num WHERE fid = ? AND uid = ? ORDER BY _rowid_
 EOM
 	$sth->execute($fid, $uid);
-	$sth->fetchrow_array;
+	map { $_->[0] } @{$sth->fetchall_arrayref};
 }
 
 sub name_oidbin ($$$) {
@@ -539,10 +541,10 @@ sub name_oidbin ($$$) {
 SELECT oidbin FROM blob2name WHERE fid = ? AND name = ?
 EOM
 	$sth->execute($fid, $nm);
-	$sth->fetchrow_array;
+	map { $_->[0] } @{$sth->fetchall_arrayref};
 }
 
-sub imap_oid {
+sub imap_oidhex {
 	my ($self, $lei, $uid_uri) = @_;
 	my $mailbox_uri = $uid_uri->clone;
 	$mailbox_uri->uid(undef);
@@ -550,12 +552,10 @@ sub imap_oid {
 	if (my $err = $self->arg2folder($lei, $folders)) {
 		if ($err->{fail}) {
 			$lei->qerr("# no sync information for $mailbox_uri");
-			return;
 		}
 		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
 	}
-	my $oidbin = imap_oidbin($self, $folders->[0], $uid_uri->uid);
-	$oidbin ? unpack('H*', $oidbin) : undef;
+	map { unpack('H*',$_) } num_oidbin($self, $folders->[0], $uid_uri->uid)
 }
 
 1;
diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 2f105005..92673492 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -36,7 +36,7 @@ sub pmdir_cb { # called via LeiPmdir->each_mdir_fn
 	my ($folder, $bn) = ($f =~ m!\A(.+?)/(?:new|cur)/([^/]+)\z!) or
 		die "BUG: $f was not from a Maildir?";
 	substr($folder, 0, 0) = 'maildir:'; # add prefix
-	return if defined($self->{lms}->name_oidbin($folder, $bn));
+	return if scalar($self->{lms}->name_oidbin($folder, $bn));
 	my $eml = eml_from_path($f) // return;
 	my $oidbin = $self->{lei}->git_oid($eml)->digest;
 	$self->{lms}->set_src($oidbin, $folder, \$bn);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 04/12] lei: simplify internal arg2folder usage
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (2 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 03/12] lei_mail_sync: account for non-unique cases Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 05/12] lei lcat: use single queue for ordering Eric Wong
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

We can set opt->{quiet} for (internal) 'note-event' command
to quiet ->qerr, since we use ->qerr everywhere else.  And
we'll just die() instead of setting a ->{fail} message, since
eval + die are more inline with the rest of our Perl code.
---
 lib/PublicInbox/LEI.pm                |  2 +-
 lib/PublicInbox/LeiExportKw.pm        |  4 +---
 lib/PublicInbox/LeiForgetMailSync.pm  |  4 +---
 lib/PublicInbox/LeiInspect.pm         | 17 +++++------------
 lib/PublicInbox/LeiLcat.pm            |  5 ++---
 lib/PublicInbox/LeiMailSync.pm        | 17 ++++++-----------
 lib/PublicInbox/LeiNoteEvent.pm       |  5 +++--
 lib/PublicInbox/LeiRefreshMailSync.pm |  4 +---
 8 files changed, 20 insertions(+), 38 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 148a5b1e..f94bfa45 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1194,7 +1194,7 @@ sub dir_idle_handler ($) { # PublicInbox::DirIdle callback
 				$lei->dispatch('note-event',
 						"maildir:$mdir", $nc, $bn, $fn);
 			};
-			warn "E note-event $f: $@\n" if $@;
+			warn "E: note-event $f: $@\n" if $@;
 		}
 	}
 	if ($ev->can('cancel') && ($ev->IN_IGNORE || $ev->IN_UNMOUNT)) {
diff --git a/lib/PublicInbox/LeiExportKw.pm b/lib/PublicInbox/LeiExportKw.pm
index 8c5fbc13..d5533a2a 100644
--- a/lib/PublicInbox/LeiExportKw.pm
+++ b/lib/PublicInbox/LeiExportKw.pm
@@ -92,9 +92,7 @@ EOM
 		$lms->group2folders($lei, $all, \@folders) or return;
 		@folders = grep(/\A(?:maildir|imaps?):/i, @folders);
 	} else {
-		my $err = $lms->arg2folder($lei, \@folders);
-		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-		return $lei->fail($err->{fail}) if $err->{fail};
+		$lms->arg2folder($lei, \@folders); # may die
 	}
 	$lms->lms_pause;
 	my $self = bless { lse => $sto->search, lms => $lms }, __PACKAGE__;
diff --git a/lib/PublicInbox/LeiForgetMailSync.pm b/lib/PublicInbox/LeiForgetMailSync.pm
index 701f48d2..d85616cc 100644
--- a/lib/PublicInbox/LeiForgetMailSync.pm
+++ b/lib/PublicInbox/LeiForgetMailSync.pm
@@ -16,9 +16,7 @@ sub lei_forget_mail_sync {
 	my ($lei, @folders) = @_;
 	my $lms = $lei->lms or return;
 	$lms->lms_write_prepare;
-	my $err = $lms->arg2folder($lei, \@folders);
-	$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-	return $lei->fail($err->{fail}) if $err->{fail};
+	$lms->arg2folder($lei, \@folders); # may die
 	$lms->forget_folders(@folders);
 }
 
diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 722ba5b2..8e128580 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -46,10 +46,9 @@ sub inspect_nntp_range {
 	my $ent = {};
 	my $ret = { "$uri" => $ent };
 	my $lms = $lei->lms or return $ret;
-	my $err = $lms->arg2folder($lei, my $folders = [ $$uri ]);
-	if ($err) {
-		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-	}
+	my $folders = [ $$uri ];
+	eval { $lms->arg2folder($lei, $folders) };
+	$lei->qerr("# no folders match $$uri (non-fatal)") if $@;
 	$end //= $beg;
 	for my $art ($beg..$end) {
 		my @oidhex = map { unpack('H*', $_) }
@@ -65,14 +64,8 @@ sub inspect_sync_folder ($$) {
 	my $ent = {};
 	my $lms = $lei->lms or return $ent;
 	my $folders = [ $folder ];
-	my $err = $lms->arg2folder($lei, $folders);
-	if ($err) {
-		if ($err->{fail}) {
-			$lei->qerr("# no folders match $folder (non-fatal)");
-			@$folders = ();
-		}
-		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-	}
+	eval { $lms->arg2folder($lei, $folders) };
+	$lei->qerr("# no folders match $folder (non-fatal)") if $@;
 	for my $f (@$folders) {
 		$ent->{$f} = $lms->location_stats($f); # may be undef
 	}
diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index 8f8e83bc..ccb1823d 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -15,9 +15,8 @@ sub lcat_folder ($$$) {
 	my ($lei, $lms, $folder) = @_;
 	$lms //= $lei->lms or return;
 	my $folders = [ $folder];
-	my $err = $lms->arg2folder($lei, $folders);
-	$lei->qerr(@{$err->{qerr}}) if $err && $err->{qerr};
-	if ($err && $err->{fail}) {
+	eval { $lms->arg2folder($lei, $folders) };
+	if ($@) {
 		$lei->child_error(0, "# unknown folder: $folder");
 	} else {
 		for my $f (@$folders) {
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 3e725d30..f83c7de2 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -437,7 +437,7 @@ sub arg2folder {
 	my ($self, $lei, $folders) = @_;
 	my @all = $self->folders;
 	my %all = map { $_ => 1 } @all;
-	my ($err, @no);
+	my @no;
 	for (@$folders) {
 		next if $all{$_}; # ok
 		if (m!\A(maildir|mh):(.+)!i) {
@@ -454,7 +454,7 @@ sub arg2folder {
 			my $res = match_imap_url($self, $orig, \@all);
 			if (ref $res) {
 				$_ = $$res;
-				push(@{$err->{qerr}}, <<EOM);
+				$lei->qerr(<<EOM);
 # using `$res' instead of `$orig'
 EOM
 			} else {
@@ -466,7 +466,7 @@ EOM
 			my $res = match_nntp_url($self, $orig, \@all);
 			if (ref $res) {
 				$_ = $$res;
-				push(@{$err->{qerr}}, <<EOM);
+				$lei->qerr(<<EOM);
 # using `$res' instead of `$orig'
 EOM
 			} else {
@@ -479,12 +479,11 @@ EOM
 	}
 	if (@no) {
 		my $no = join("\n\t", @no);
-		$err->{fail} = <<EOF;
+		die <<EOF;
 No sync information for: $no
 Run `lei ls-mail-sync' to display valid choices
 EOF
 	}
-	$err;
 }
 
 sub forget_folders {
@@ -549,12 +548,8 @@ sub imap_oidhex {
 	my $mailbox_uri = $uid_uri->clone;
 	$mailbox_uri->uid(undef);
 	my $folders = [ $$mailbox_uri ];
-	if (my $err = $self->arg2folder($lei, $folders)) {
-		if ($err->{fail}) {
-			$lei->qerr("# no sync information for $mailbox_uri");
-		}
-		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-	}
+	eval { $self->arg2folder($lei, $folders) };
+	$lei->qerr("# no sync information for $mailbox_uri") if $@;
 	map { unpack('H*',$_) } num_oidbin($self, $folders->[0], $uid_uri->uid)
 }
 
diff --git a/lib/PublicInbox/LeiNoteEvent.pm b/lib/PublicInbox/LeiNoteEvent.pm
index a0591a09..43d5ed0f 100644
--- a/lib/PublicInbox/LeiNoteEvent.pm
+++ b/lib/PublicInbox/LeiNoteEvent.pm
@@ -68,8 +68,9 @@ sub lei_note_event {
 	return flush_lei($lei) if $folder eq 'done'; # special case
 	my $lms = $lei->lms or return;
 	$lms->lms_write_prepare if $new_cur eq ''; # for ->clear_src below
-	my $err = $lms->arg2folder($lei, [ $folder ]);
-	return if $err->{fail};
+	$lei->{opt}->{quiet} = 1;
+	eval { $lms->arg2folder($lei, [ $folder ]) };
+	return if $@;
 	my $state = $cfg->get_1("watch.$folder", 'state') // 'tag-rw';
 	return if $state eq 'pause';
 	return $lms->clear_src($folder, \$bn) if $new_cur eq '';
diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 92673492..51e89b23 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -74,9 +74,7 @@ EOM
 	if (defined(my $all = $lei->{opt}->{all})) {
 		$lms->group2folders($lei, $all, \@folders) or return;
 	} else {
-		my $err = $lms->arg2folder($lei, \@folders);
-		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-		return $lei->fail($err->{fail}) if $err->{fail};
+		$lms->arg2folder($lei, \@folders); # may die
 	}
 	$lms->lms_pause; # must be done before fork
 	$sto->write_prepare($lei);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 05/12] lei lcat: use single queue for ordering
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (3 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 04/12] lei: simplify internal arg2folder usage Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 06/12] doc: lei-security: section for WIP auth methods Eric Wong
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

If lcat-ing multiple argument types (blobs vs folders),
maintain the original order of the arguments instead of
dumping all blobs before folder contents.
---
 lib/PublicInbox/LeiLcat.pm    | 13 ++++++-------
 lib/PublicInbox/LeiXSearch.pm | 16 ++++++++--------
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index ccb1823d..1a4a988e 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -21,7 +21,7 @@ sub lcat_folder ($$$) {
 	} else {
 		for my $f (@$folders) {
 			my $fid = $lms->fid_for($f);
-			push @{$lei->{lcat_fid}}, $fid;
+			push @{$lei->{lcat_todo}}, { fid => $fid };
 		}
 	}
 }
@@ -31,10 +31,9 @@ sub lcat_imap_uri ($$) {
 	my $lms = $lei->lms or return;
 	# cf. LeiXsearch->lcat_dump
 	if (defined $uri->uid) {
-		my @oidhex = $lms->imap_oidhex($lei, $uri);
-		push @{$lei->{lcat_blob}}, @oidhex;
+		push @{$lei->{lcat_todo}}, $lms->imap_oidhex($lei, $uri);
 	} elsif (defined(my $fid = $lms->fid_for($$uri))) {
-		push @{$lei->{lcat_fid}}, $fid;
+		push @{$lei->{lcat_todo}}, { fid => $fid };
 	} else {
 		lcat_folder($lei, $lms, $$uri);
 	}
@@ -46,10 +45,10 @@ sub extract_1 ($$) {
 		my $u = $1;
 		require PublicInbox::URIimap;
 		lcat_imap_uri($lei, PublicInbox::URIimap->new($u));
-		'""'; # blank query, using {lcat_blob} or {lcat_fid}
+		'""'; # blank query, using {lcat_todo}
 	} elsif ($x =~ m!\b(maildir:.+)!i) {
 		lcat_folder($lei, undef, $1);
-		'""'; # blank query, using {lcat_blob} or {lcat_fid}
+		'""'; # blank query, using {lcat_todo}
 	} elsif ($x =~ m!\b([a-z]+?://\S+)!i) {
 		my $u = $1;
 		$u =~ s/[\>\]\)\,\.\;]+\z//;
@@ -82,7 +81,7 @@ sub extract_1 ($$) {
 	} elsif ($x =~ /\bid:(\S+)/) { # notmuch convention
 		"mid:$1";
 	} elsif ($x =~ /\bblob:([0-9a-f]{7,})\b/) {
-		push @{$lei->{lcat_blob}}, $1; # cf. LeiToMail->wq_atexit_child
+		push @{$lei->{lcat_todo}}, $1; # cf. LeiToMail->wq_atexit_child
 		'""'; # blank query
 	} else {
 		undef;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 756183a9..3ce8f32d 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -570,7 +570,7 @@ sub do_query {
 	@$end = ();
 	$self->{opt_threads} = $lei->{opt}->{threads};
 	$self->{opt_sort} = $lei->{opt}->{'sort'};
-	$self->{-do_lcat} = $lei->{lcat_blob} // $lei->{lcat_fid};
+	$self->{-do_lcat} = !!(delete $lei->{lcat_todo});
 	if ($l2m) {
 		$l2m->net_merge_all_done unless $lei->{auth};
 	} else {
@@ -646,13 +646,13 @@ sub lcat_dump { # via wq_io_do
 			$git->cat_async($smsg->{blob}, \&_lcat2smsg, $smsg);
 		};
 	}
-	for my $oid (@{$lei->{lcat_blob} // []}) {
-		$each_smsg->({ blob => $oid, pct => 100 });
-	}
-	if (my $fids = delete $lei->{lcat_fid}) {
-		my $lms = $lei->{lse}->lms;
-		for my $fid (@$fids) {
-			$lms->each_src({fid => $fid}, \&_lcat_i, $each_smsg);
+	my $lms;
+	for my $ent (@{$lei->{lcat_todo}}) {
+		if (ref $ent eq 'HASH') { # { fid => $fid ,.. }
+			$lms //= $lei->{lse}->lms;
+			$lms->each_src($ent, \&_lcat_i, $each_smsg);
+		} else { # oidhex
+			$each_smsg->({ blob => $ent, pct => 100 });
 		}
 	}
 	$git->async_wait_all;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 06/12] doc: lei-security: section for WIP auth methods
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (4 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 05/12] lei lcat: use single queue for ordering Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 07/12] lei lcat: support NNTP URLs Eric Wong
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

Lots of stuff out there that becomes a pain to setup
configuration for and test...
---
 Documentation/lei-security.pod | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/lei-security.pod b/Documentation/lei-security.pod
index 4b712c2d..02305b90 100644
--- a/Documentation/lei-security.pod
+++ b/Documentation/lei-security.pod
@@ -99,6 +99,14 @@ While credentials are not written to the filesystem by default,
 it is possible for them to end up on disk if processes are
 swapped out.  Use of an encrypted swap partition is recommended.
 
+=head1 AUTHENTICATION METHODS
+
+LOGIN (username + password) is known to work over IMAP(S),
+as does AUTH=ANONYMOUS (which is used by L<public-inbox-imapd(1)>
+as part of our test suite).  AUTHINFO may work for NNTP, but
+is untested.  Testers will be needed for other authentication
+methods.
+
 =head1 DENIAL-OF-SERVICE VECTORS
 
 lei uses the same MIME parsing library as L<public-inbox-mda(1)>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 07/12] lei lcat: support NNTP URLs
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (5 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 06/12] doc: lei-security: section for WIP auth methods Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 08/12] lei: various completion improvements Eric Wong
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

NNTP URLs are probably more prevalent in public message archives
than IMAP URLs.
---
 lib/PublicInbox/LeiLcat.pm     | 66 +++++++++++++++++++++-------------
 lib/PublicInbox/LeiMailSync.pm | 14 +++++---
 t/lei-import-nntp.t            | 23 ++++++++++++
 3 files changed, 74 insertions(+), 29 deletions(-)

diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index 1a4a988e..0902c213 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -11,47 +11,64 @@ use PublicInbox::LeiViewText;
 use URI::Escape qw(uri_unescape);
 use PublicInbox::MID qw($MID_EXTRACT);
 
-sub lcat_folder ($$$) {
-	my ($lei, $lms, $folder) = @_;
-	$lms //= $lei->lms or return;
-	my $folders = [ $folder];
+sub lcat_folder ($$;$$) {
+	my ($lei, $folder, $beg, $end) = @_;
+	my $lms = $lei->{-lms_ro} //= $lei->lms // return;
+	my $folders = [ $folder ];
 	eval { $lms->arg2folder($lei, $folders) };
-	if ($@) {
-		$lei->child_error(0, "# unknown folder: $folder");
-	} else {
-		for my $f (@$folders) {
-			my $fid = $lms->fid_for($f);
-			push @{$lei->{lcat_todo}}, { fid => $fid };
-		}
+	return $lei->child_error(0, "# unknown folder: $folder") if $@;
+	my %range;
+	if (defined($beg)) { # NNTP article range
+		$range{min} = $beg;
+		$range{max} = $end // $beg;
+	}
+	for my $f (@$folders) {
+		my $fid = $lms->fid_for($f);
+		push @{$lei->{lcat_todo}}, { fid => $fid, %range };
 	}
 }
 
 sub lcat_imap_uri ($$) {
 	my ($lei, $uri) = @_;
-	my $lms = $lei->lms or return;
-	# cf. LeiXsearch->lcat_dump
+	# cf. LeiXSearch->lcat_dump
+	my $lms = $lei->{-lms_ro} //= $lei->lms // return;
 	if (defined $uri->uid) {
 		push @{$lei->{lcat_todo}}, $lms->imap_oidhex($lei, $uri);
 	} elsif (defined(my $fid = $lms->fid_for($$uri))) {
 		push @{$lei->{lcat_todo}}, { fid => $fid };
 	} else {
-		lcat_folder($lei, $lms, $$uri);
+		lcat_folder($lei, $$uri);
 	}
 }
 
+sub lcat_nntp_uri ($$) {
+	my ($lei, $uri) = @_;
+	my $mid = $uri->message; # already unescaped by URI::news
+	return "mid:$mid" if defined($mid);
+	my $lms = $lei->{-lms_ro} //= $lei->lms // return;
+	my ($ng, $beg, $end) = $uri->group;
+	$uri->group($ng);
+	lcat_folder($lei, $$uri, $beg, $end);
+	'""';
+}
+
 sub extract_1 ($$) {
 	my ($lei, $x) = @_;
-	if ($x =~ m!\b(imaps?://[^>]+)!i) {
-		my $u = $1;
-		require PublicInbox::URIimap;
-		lcat_imap_uri($lei, PublicInbox::URIimap->new($u));
-		'""'; # blank query, using {lcat_todo}
-	} elsif ($x =~ m!\b(maildir:.+)!i) {
-		lcat_folder($lei, undef, $1);
+	if ($x =~ m!\b(maildir:.+)!i) {
+		lcat_folder($lei, $1);
 		'""'; # blank query, using {lcat_todo}
-	} elsif ($x =~ m!\b([a-z]+?://\S+)!i) {
-		my $u = $1;
+	} elsif ($x =~ m!\b(([a-z]+)://\S+)!i) {
+		my ($u, $scheme) = ($1, $2);
 		$u =~ s/[\>\]\)\,\.\;]+\z//;
+		if ($scheme =~ m!\A(imaps?)\z!i) {
+			require PublicInbox::URIimap;
+			lcat_imap_uri($lei, PublicInbox::URIimap->new($u));
+			return '""'; # blank query, using {lcat_todo}
+		} elsif ($scheme =~ m!\A(?:nntps?|s?news)\z!i) {
+			require PublicInbox::URInntps;
+			$u = PublicInbox::URInntps->new($u);
+			return lcat_nntp_uri($lei, $u);
+		} # http, or something else:
 		require URI;
 		$u = URI->new($u);
 		my $p = $u->path;
@@ -93,7 +110,7 @@ sub extract_all {
 	my $strict = !$lei->{opt}->{stdin};
 	my @q;
 	for my $x (@argv) {
-		if (my $term = extract_1($lei,$x)) {
+		if (my $term = extract_1($lei, $x)) {
 			push @q, $term;
 		} elsif ($strict) {
 			return $lei->fail(<<"");
@@ -101,6 +118,7 @@ could not extract Message-ID from $x
 
 		}
 	}
+	delete $lei->{-lms_ro};
 	@q ? join(' OR ', @q) : $lei->fail("no Message-ID in: @argv");
 }
 
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index f83c7de2..522a5ebc 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -197,9 +197,12 @@ INSERT OR IGNORE INTO blob2name (oidbin, fid, name) VALUES (?, ?, ?)
 sub each_src {
 	my ($self, $folder, $cb, @args) = @_;
 	my $dbh = $self->{dbh} //= dbh_new($self);
-	my $fid;
+	my ($fid, @rng);
+	my $and_ge_le = '';
 	if (ref($folder) eq 'HASH') {
 		$fid = $folder->{fid} // die "BUG: no `fid'";
+		@rng = grep(defined, @$folder{qw(min max)});
+		$and_ge_le = 'AND uid >= ? AND uid <= ?' if @rng;
 	} else {
 		$fid = $self->{fmap}->{$folder} //=
 			fid_for($self, $folder) // return;
@@ -208,16 +211,17 @@ sub each_src {
 	# minimize implicit txn time to avoid blocking writers by
 	# batching SELECTs.  This looks wonky but is necessary since
 	# $cb-> may access the DB on its own.
-	my $ary = $dbh->selectall_arrayref(<<'', undef, $fid);
-SELECT _rowid_,oidbin,uid FROM blob2num WHERE fid = ?
+	my $ary = $dbh->selectall_arrayref(<<"", undef, $fid, @rng);
+SELECT _rowid_,oidbin,uid FROM blob2num WHERE fid = ? $and_ge_le
 ORDER BY _rowid_ ASC LIMIT 1000
 
 	my $min = @$ary ? $ary->[-1]->[0] : undef;
 	while (defined $min) {
 		for my $row (@$ary) { $cb->($row->[1], $row->[2], @args) }
 
-		$ary = $dbh->selectall_arrayref(<<'', undef, $fid, $min);
-SELECT _rowid_,oidbin,uid FROM blob2num WHERE fid = ? AND _rowid_ > ?
+		$ary = $dbh->selectall_arrayref(<<"", undef, $fid, @rng, $min);
+SELECT _rowid_,oidbin,uid FROM blob2num
+WHERE fid = ? $and_ge_le AND _rowid_ > ?
 ORDER BY _rowid_ ASC LIMIT 1000
 
 		$min = @$ary ? $ary->[-1]->[0] : undef;
diff --git a/t/lei-import-nntp.t b/t/lei-import-nntp.t
index 0b080781..eb1ae312 100644
--- a/t/lei-import-nntp.t
+++ b/t/lei-import-nntp.t
@@ -25,6 +25,11 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	is(ref(json_utf8->decode($lei_out)), 'ARRAY', 'ls-mail-source JSON');
 
 	lei_ok('import', $url);
+	lei_ok "lcat", "nntp://$host_port/testmessage\@example.com";
+	my $local = $lei_out;
+	lei_ok "lcat", "nntp://example.com/testmessage\@example.com";
+	my $remote = $lei_out;
+	is($local, $remote, 'Message-ID used even from unknown host');
 	lei_ok(qw(q z:1..));
 	$out = json_utf8->decode($lei_out);
 	ok(scalar(@$out) > 1, 'got imported messages');
@@ -57,6 +62,11 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	lei_ok('inspect', "$url/$high");
 	my $x = json_utf8->decode($lei_out);
 	like($x->{$url}->{$high}, qr/\A[a-f0-9]{40,}\z/, 'inspect shows blob');
+	lei_ok qw(lcat -f json), "$url/$high";
+	my $lcat = json_utf8->decode($lei_out);
+	is($lcat->[1], undef, 'only one result for lcat');
+	is($lcat->[0]->{blob}, $x->{$url}->{$high},
+		'lcat showed correct blob');
 
 	lei_ok 'ls-mail-sync';
 	is($lei_out, "$url\n", 'article number not stored as folder');
@@ -78,6 +88,19 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	is(scalar(grep(/\A[a-f0-9]{40,}\z/, values %{$x->{$url}})),
 		$end - $low + 1, 'all values are git blobs');
 
+	lei_ok qw(lcat -f json), "$url/$low";
+	$lcat = json_utf8->decode($lei_out);
+	is($lcat->[1], undef, 'only one result for lcat');
+	is($lcat->[0]->{blob}, $x->{$url}->{$low},
+		'lcat showed correct blob');
+	lei_ok qw(lcat -f json), "$url/$low-$end";
+	$lcat = json_utf8->decode($lei_out);
+	pop @$lcat;
+	for ($low..$end) {
+		my $tip = shift @$lcat;
+		is($x->{$url}->{$_}, $tip->{blob}, "blob matches art #$_");
+	}
+
 	lei_ok 'ls-mail-sync';
 	is($lei_out, "$url\n", 'article range not stored as folder');
 	lei_ok qw(q z:0..); my $start = json_utf8->decode($lei_out);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 08/12] lei: various completion improvements
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (6 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 07/12] lei lcat: support NNTP URLs Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 09/12] lei q: show progress on >1s preparation phase Eric Wong
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

"lei export-kw" no longer completes for anonymous sources.

More commands use "lei refresh-mail-sync" as a basis for their
completion work, as well.

";AUTH=ANONYMOUS@" is stripped from completions since it was
preventing bash completion from working on AUTH=ANONYMOUS IMAP
URLs.  I'm not sure if there's a better way, but all of our code
works fine without specifying AUTH=ANONYMOUS as a command-line
arg.

Finally, we fallback to using more candidates if none can
be found, allowing multiple URLs to be completed.
---
 lib/PublicInbox/LeiExportKw.pm        |  8 ++++++--
 lib/PublicInbox/LeiExternal.pm        |  5 +++++
 lib/PublicInbox/LeiForgetMailSync.pm  |  5 +++--
 lib/PublicInbox/LeiImport.pm          | 12 +++++++-----
 lib/PublicInbox/LeiInspect.pm         |  7 +++----
 lib/PublicInbox/LeiLcat.pm            |  7 +++----
 lib/PublicInbox/LeiLsMailSource.pm    | 11 ++++++-----
 lib/PublicInbox/LeiMailSync.pm        | 13 ++++++++-----
 lib/PublicInbox/LeiRefreshMailSync.pm | 12 ++++++++++--
 lib/PublicInbox/LeiTag.pm             | 14 ++++++++------
 lib/PublicInbox/SharedKV.pm           | 19 +++++++++++++------
 11 files changed, 72 insertions(+), 41 deletions(-)

diff --git a/lib/PublicInbox/LeiExportKw.pm b/lib/PublicInbox/LeiExportKw.pm
index d5533a2a..cea9beeb 100644
--- a/lib/PublicInbox/LeiExportKw.pm
+++ b/lib/PublicInbox/LeiExportKw.pm
@@ -127,9 +127,13 @@ EOM
 
 sub _complete_export_kw {
 	my ($lei, @argv) = @_;
-	my $lms = $lei->lms or return;
+	my $lms = $lei->lms or return ();
 	my $match_cb = $lei->complete_url_prepare(\@argv);
-	map { $match_cb->($_) } $lms->folders;
+	# filter-out read-only sources:
+	my @k = grep(!m!(?://;AUTH=ANONYMOUS\@|\A(?:nntps?|s?news)://)!,
+			$lms->folders($argv[-1], 1));
+	my @m = map { $match_cb->($_) } @k;
+	@m ? @m : @k;
 }
 
 no warnings 'once';
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index d802f0e2..f8e610ca 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -241,6 +241,11 @@ sub complete_url_prepare {
 		$re = quotemeta($re);
 	}
 	my $match_cb = sub {
+		# the "//;" here (for AUTH=ANONYMOUS) interacts badly with
+		# bash tab completion, strip it out for now since our commands
+		# work w/o it.  Not sure if there's a better solution...
+		$_[0] =~ s!//;AUTH=ANONYMOUS\@!//!i;
+		$_[0] =~ s!;!\\;!g;
 		# only return the part specified on the CLI
 		# don't duplicate if already 100% completed
 		$_[0] =~ /\A$re(\Q$cur\E.*)/ ? ($cur eq $1 ? () : $1) : ()
diff --git a/lib/PublicInbox/LeiForgetMailSync.pm b/lib/PublicInbox/LeiForgetMailSync.pm
index d85616cc..762910ed 100644
--- a/lib/PublicInbox/LeiForgetMailSync.pm
+++ b/lib/PublicInbox/LeiForgetMailSync.pm
@@ -10,7 +10,7 @@
 package PublicInbox::LeiForgetMailSync;
 use strict;
 use v5.10.1;
-use PublicInbox::LeiExportKw;
+use PublicInbox::LeiRefreshMailSync;
 
 sub lei_forget_mail_sync {
 	my ($lei, @folders) = @_;
@@ -20,6 +20,7 @@ sub lei_forget_mail_sync {
 	$lms->forget_folders(@folders);
 }
 
-*_complete_forget_mail_sync = \&PublicInbox::LeiExportKw::_complete_export_kw;
+*_complete_forget_mail_sync =
+	\&PublicInbox::LeiRefreshMailSync::_complete_refresh_mail_sync;
 
 1;
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 3c30db8d..397292d4 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -121,12 +121,14 @@ sub lei_import { # the main "lei import" method
 
 sub _complete_import {
 	my ($lei, @argv) = @_;
-	my $match_cb = $lei->complete_url_prepare(\@argv);
-	my @m = map { $match_cb->($_) } $lei->url_folder_cache->keys;
-	my %f = map { $_ => 1 } @m;
+	my ($re, $cur, $match_cb) = $lei->complete_url_prepare(\@argv);
+	my @k = $lei->url_folder_cache->keys($argv[-1], 1);
+	my @m = map { $match_cb->($_) } @k;
+	my %f = map { $_ => 1 } (@m ? @m : @k);
 	if (my $lms = $lei->lms) {
-		@m = map { $match_cb->($_) } $lms->folders;
-		@f{@m} = @m;
+		@k = $lms->folders($argv[-1], 1);
+		@m = map { $match_cb->($_) } @k;
+		if (@m) { @f{@m} = @m } else { @f{@k} = @k }
 	}
 	keys %f;
 }
diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 8e128580..2158b996 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -269,10 +269,9 @@ no args allowed on command-line with --stdin
 }
 
 sub _complete_inspect {
-	my ($lei, @argv) = @_;
-	my $lms = $lei->lms or return;
-	my $match_cb = $lei->complete_url_prepare(\@argv);
-	map { $match_cb->($_) } $lms->folders;
+	require PublicInbox::LeiRefreshMailSync;
+	PublicInbox::LeiRefreshMailSync::_complete_refresh_mail_sync(@_);
+	# TODO: message-ids?, blobs? could get expensive...
 }
 
 1;
diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index 0902c213..c13e2153 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -164,10 +164,9 @@ no args allowed on command-line with --stdin
 }
 
 sub _complete_lcat {
-	my ($lei, @argv) = @_;
-	my $lms = $lei->lms or return;
-	my $match_cb = $lei->complete_url_prepare(\@argv);
-	map { $match_cb->($_) } $lms->folders;
+	require PublicInbox::LeiRefreshMailSync;
+	PublicInbox::LeiRefreshMailSync::_complete_refresh_mail_sync(@_);
+	# TODO: message-ids?, blobs? could get expensive...
 }
 
 1;
diff --git a/lib/PublicInbox/LeiLsMailSource.pm b/lib/PublicInbox/LeiLsMailSource.pm
index 2265969a..1db15d57 100644
--- a/lib/PublicInbox/LeiLsMailSource.pm
+++ b/lib/PublicInbox/LeiLsMailSource.pm
@@ -107,12 +107,13 @@ sub lei_ls_mail_source {
 sub _complete_ls_mail_source {
 	my ($lei, @argv) = @_;
 	my $match_cb = $lei->complete_url_prepare(\@argv);
-	my @m = map { $match_cb->($_) } $lei->url_folder_cache->keys;
-	my %f = map { $_ => 1 } @m;
+	my @k = $lei->url_folder_cache->keys($argv[-1], 1);
+	my @m = map { $match_cb->($_) } @k;
+	my %f = map { $_ => 1 } (@m ? @m : @k);
 	if (my $lms = $lei->lms) {
-		@m = map { $match_cb->($_) } grep(
-			m!\A(?:imaps?|nntps?|s?news)://!, $lms->folders);
-		@f{@m} = @m;
+		@k = $lms->folders($argv[-1], 1);
+		@m = map { $match_cb->($_) } grep(m!\A[a-z]+://!, @k);
+		if (@m) { @f{@m} = @m } else { @f{@k} = @k }
 	}
 	keys %f;
 }
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 522a5ebc..91cd1c93 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -299,16 +299,19 @@ sub locations_for {
 
 # returns a list of folders used for completion
 sub folders {
-	my ($self, $pfx) = @_;
-	my $dbh = $self->{dbh} //= dbh_new($self);
+	my ($self, @pfx) = @_;
 	my $sql = 'SELECT loc FROM folders';
-	my @pfx;
-	if (defined $pfx) {
+	if (defined($pfx[0])) {
 		$sql .= ' WHERE loc LIKE ? ESCAPE ?';
-		@pfx = ($pfx, '\\');
+		my $anywhere = !!$pfx[1];
+		$pfx[1] = '\\';
 		$pfx[0] =~ s/([%_\\])/\\$1/g; # glob chars
 		$pfx[0] .= '%';
+		substr($pfx[0], 0, 0, '%') if $anywhere;
+	} else {
+		@pfx = (); # [0] may've been undef
 	}
+	my $dbh = $self->{dbh} //= dbh_new($self);
 	map { $_->[0] } @{$dbh->selectall_arrayref($sql, undef, @pfx)};
 }
 
diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 51e89b23..eb842843 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -7,7 +7,7 @@ package PublicInbox::LeiRefreshMailSync;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::LeiExportKw;
+use PublicInbox::LeiImport;
 use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::Import;
 
@@ -97,8 +97,16 @@ sub ipc_atfork_child { # needed for PublicInbox::LeiPmdir
 	undef;
 }
 
+sub _complete_refresh_mail_sync {
+	my ($lei, @argv) = @_;
+	my $lms = $lei->lms or return ();
+	my $match_cb = $lei->complete_url_prepare(\@argv);
+	my @k = $lms->folders($argv[-1], 1);
+	my @m = map { $match_cb->($_) } @k;
+	@m ? @m : @k
+}
+
 no warnings 'once';
-*_complete_refresh_mail_sync = \&PublicInbox::LeiExportKw::_complete_export_kw;
 *net_merge_all_done = \&PublicInbox::LeiInput::input_only_net_merge_all_done;
 
 1;
diff --git a/lib/PublicInbox/LeiTag.pm b/lib/PublicInbox/LeiTag.pm
index 9bbf0d79..817b87f8 100644
--- a/lib/PublicInbox/LeiTag.pm
+++ b/lib/PublicInbox/LeiTag.pm
@@ -74,7 +74,7 @@ sub ipc_atfork_child {
 # Workaround bash word-splitting s to ['kw', ':', 'keyword' ...]
 # Maybe there's a better way to go about this in
 # contrib/completion/lei-completion.bash
-sub _complete_mark_common ($) {
+sub _complete_tag_common ($) {
 	my ($argv) = @_;
 	# Workaround bash word-splitting URLs to ['https', ':', '//' ...]
 	# Maybe there's a better way to go about this in
@@ -104,16 +104,18 @@ sub _complete_mark_common ($) {
 # FIXME: same problems as _complete_forget_external and similar
 sub _complete_tag {
 	my ($self, @argv) = @_;
+	require PublicInbox::LeiImport;
+	my @in = PublicInbox::LeiImport::_complete_import(@_);
 	my @L = eval { $self->_lei_store->search->all_terms('L') };
-	my @all = ((map { ("+kw:$_", "-kw:$_") } @PublicInbox::LeiInput::KW),
+	my @kwL = ((map { ("+kw:$_", "-kw:$_") } @PublicInbox::LeiInput::KW),
 		(map { ("+L:$_", "-L:$_") } @L));
-	return @all if !@argv;
-	my ($cur, $re) = _complete_mark_common(\@argv);
-	map {
+	my ($cur, $re) = _complete_tag_common(\@argv);
+	my @m = map {
 		# only return the part specified on the CLI
 		# don't duplicate if already 100% completed
 		/\A$re(\Q$cur\E.*)/ ? ($cur eq $1 ? () : $1) : ();
-	} grep(/$re\Q$cur/, @all);
+	} grep(/$re\Q$cur/, @kwL);
+	(@in, (@m ? @m : @kwL));
 }
 
 no warnings 'once'; # the following works even when LeiAuth is lazy-loaded
diff --git a/lib/PublicInbox/SharedKV.pm b/lib/PublicInbox/SharedKV.pm
index 3487e820..645bb57c 100644
--- a/lib/PublicInbox/SharedKV.pm
+++ b/lib/PublicInbox/SharedKV.pm
@@ -84,12 +84,19 @@ SELECT k,v FROM kv
 }
 
 sub keys {
-	my ($self) = @_;
-	my $sth = $self->dbh->prepare_cached(<<'', undef, 1);
-SELECT k FROM kv
-
-	$sth->execute;
-	map { $_->[0] } @{$sth->fetchall_arrayref};
+	my ($self, @pfx) = @_;
+	my $sql = 'SELECT k FROM kv';
+	if (defined $pfx[0]) {
+		$sql .= ' WHERE k LIKE ? ESCAPE ?';
+		my $anywhere = !!$pfx[1];
+		$pfx[1] = '\\';
+		$pfx[0] =~ s/([%_\\])/\\$1/g; # glob chars
+		$pfx[0] .= '%';
+		substr($pfx[0], 0, 0, '%') if $anywhere;
+	} else {
+		@pfx = (); # [0] may've been undef
+	}
+	map { $_->[0] } @{$self->dbh->selectall_arrayref($sql, undef, @pfx)};
 }
 
 sub delete_by_val {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 09/12] lei q: show progress on >1s preparation phase
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (7 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 08/12] lei: various completion improvements Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 10/12] search: drop reopen retry message Eric Wong
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

Overwriting existing destinations safe (but slow) by default,
so show a progress message noting what we're doing while
a user waits.
---
 lib/PublicInbox/LeiToMail.pm  | 23 ++++++++++++++++++++++-
 lib/PublicInbox/LeiXSearch.pm |  2 +-
 t/lei-watch.t                 |  2 +-
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a419b83f..ed609081 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -689,6 +689,8 @@ sub do_augment { # slow, runs in wq worker
 # fast (spawn compressor or mkdir), runs in same process as pre_augment
 sub post_augment {
 	my ($self, $lei, @args) = @_;
+	$self->{-au_noted}++ and $lei->qerr("# writing to $self->{dst} ...");
+
 	my $wait = $lei->{opt}->{'import-before'} ?
 			$lei->{sto}->wq_do('checkpoint', 1) : 0;
 	# _post_augment_mbox
@@ -784,9 +786,28 @@ sub wq_atexit_child {
 	$lei->{pkt_op_p}->pkt_do('l2m_progress', $nr);
 }
 
+# runs on a 1s timer in lei-daemon
+sub augment_inprogress {
+	my ($err, $opt, $dst, $au_noted) = @_;
+	$$au_noted++ and return;
+	print $err '# '.($opt->{'import-before'} ?
+			"importing non-external contents of $dst" : (
+			($opt->{dedupe} // 'content') ne 'none') ?
+			"scanning old contents of $dst for dedupe" :
+			"removing old contents of $dst")." ...\n";
+}
+
 # called in top-level lei-daemon when LeiAuth is done
 sub net_merge_all_done {
-	my ($self) = @_;
+	my ($self, $lei) = @_;
+	if ($PublicInbox::DS::in_loop &&
+			$self->can("_do_augment_$self->{base_type}") &&
+			!$lei->{opt}->{quiet}) {
+		$self->{-au_noted} = 0;
+		PublicInbox::DS::add_timer(1, \&augment_inprogress,
+				$lei->{2}, $lei->{opt},
+				$self->{dst}, \$self->{-au_noted});
+	}
 	$self->wq_broadcast('do_post_auth');
 	$self->wq_close(1);
 }
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 3ce8f32d..2227c2ac 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -572,7 +572,7 @@ sub do_query {
 	$self->{opt_sort} = $lei->{opt}->{'sort'};
 	$self->{-do_lcat} = !!(delete $lei->{lcat_todo});
 	if ($l2m) {
-		$l2m->net_merge_all_done unless $lei->{auth};
+		$l2m->net_merge_all_done($lei) unless $lei->{auth};
 	} else {
 		start_query($self);
 	}
diff --git a/t/lei-watch.t b/t/lei-watch.t
index a881fbb9..df887a03 100644
--- a/t/lei-watch.t
+++ b/t/lei-watch.t
@@ -91,7 +91,7 @@ test_lei(sub {
 		$ino_fdinfo or skip 'Linux/inotify-only removal removal', 1;
 		open my $fh, '<', $ino_fdinfo or xbail "open $ino_fdinfo: $!";
 		my $cmp = [ <$fh> ];
-		is_deeply($cmp, $ino_contents, 'inotify Maildir watches gone');
+		is_xdeeply($cmp, $ino_contents, 'inotify Maildir watches gone');
 	};
 });
 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 10/12] search: drop reopen retry message
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (8 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 09/12] lei q: show progress on >1s preparation phase Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 11/12] lei q: update messages to reflect --save default Eric Wong
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

It's needless noise in syslogs for daemons and unnecessarily
alarming to users on the command-line.
---
 lib/PublicInbox/Search.pm | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index af0a35d9..1d1cd2f5 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -409,7 +409,6 @@ sub retry_reopen {
 		# Exception: The revision being read has been discarded -
 		# you should call Xapian::Database::reopen()
 		if (ref($@) =~ /\bDatabaseModifiedError\b/) {
-			warn "# reopen try #$i on $@\n";
 			reopen($self);
 		} else {
 			# let caller decide how to spew, because ExtMsg queries

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 11/12] lei q: update messages to reflect --save default
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (9 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 10/12] search: drop reopen retry message Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  7:41 ` [PATCH 12/12] lei q: improve --limit behavior and progress Eric Wong
  2021-09-21  9:29 ` [PATCH 0/3] lei: a few more annoyances fixed Eric Wong
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

I wanted to try --dedupe=none for something, but it failed
since I forgot --no-save :x  So hint users towards --no-save
if necessary.
---
 lib/PublicInbox/LeiSavedSearch.pm | 4 ++--
 t/lei-q-save.t                    | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiSavedSearch.pm b/lib/PublicInbox/LeiSavedSearch.pm
index 89f5c359..637456e4 100644
--- a/lib/PublicInbox/LeiSavedSearch.pm
+++ b/lib/PublicInbox/LeiSavedSearch.pm
@@ -94,7 +94,7 @@ sub translate_dedupe ($$$) {
 	$dd //= 'content';
 	return 1 if $dd eq 'content'; # the default
 	return $self->{"-dedupe_$dd"} = 1 if ($dd eq 'oid' || $dd eq 'mid');
-	$lei->fail("--dedupe=$dd unsupported with --save");
+	$lei->fail("--dedupe=$dd requires --no-save");
 }
 
 sub up { # updating existing saved search via "lei up"
@@ -103,7 +103,7 @@ sub up { # updating existing saved search via "lei up"
 	my $self = bless { ale => $lei->ale }, $cls;
 	my $dir = $dst;
 	output2lssdir($self, $lei, \$dir, \$f) or
-		return $lei->fail("--save was not used with $dst cwd=".
+		return $lei->fail("--no-save was used with $dst cwd=".
 					$lei->rel2abs('.'));
 	$self->{-cfg} = $lei->cfg_dump($f) // return $lei->fail;
 	$self->{-ovf} = "$dir/over.sqlite3";
diff --git a/t/lei-q-save.t b/t/lei-q-save.t
index 9c17a011..5940018c 100644
--- a/t/lei-q-save.t
+++ b/t/lei-q-save.t
@@ -69,11 +69,11 @@ test_lei(sub {
 	ok(-s "$home/mbcl2" > $size, 'size increased after up');
 
 	ok(!lei(qw(up -q), $home), 'up fails on unknown dir');
-	like($lei_err, qr/--save was not used/, 'error noted --save');
+	like($lei_err, qr/--no-save was used/, 'error noted --no-save');
 
 	lei_ok(qw(q --no-save d:last.week.. -q -o), "$home/no-save");
 	ok(!lei(qw(up -q), "$home/no-save"), 'up fails on --no-save');
-	like($lei_err, qr/--save was not used/, 'error noted --save');
+	like($lei_err, qr/--no-save was used/, 'error noted --no-save');
 
 	lei_ok qw(ls-search); my @d = split(/\n/, $lei_out);
 	lei_ok qw(ls-search -z); my @z = split(/\0/, $lei_out);
@@ -131,7 +131,7 @@ test_lei(sub {
 	unlike($lei_out, qr/mbrd-aug/,
 		'forget-search completion cleared after forget');
 	ok(!lei('up', "$home/mbrd-aug"), 'lei up fails after forget');
-	like($lei_err, qr/--save was not used/, 'error noted --save');
+	like($lei_err, qr/--no-save was used/, 'error noted --no-save');
 
 	# dedupe=mid
 	my $o = "$home/dd-mid";

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 12/12] lei q: improve --limit behavior and progress
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (10 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 11/12] lei q: update messages to reflect --save default Eric Wong
@ 2021-09-21  7:41 ` Eric Wong
  2021-09-21  9:29 ` [PATCH 0/3] lei: a few more annoyances fixed Eric Wong
  12 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  7:41 UTC (permalink / raw)
  To: meta

Avoid slurping gigantic (e.g. 100000) result sets into a single
response if a giant limit is specified, and instead use 10000
as a window for the mset with a given offset.  We'll also warn
and hint towards about the --limit= switch when the estimated
result set is larger than the default limit.
---
 Documentation/lei-q.pod           |  6 ++++--
 lib/PublicInbox/LeiLcat.pm        |  3 +--
 lib/PublicInbox/LeiQuery.pm       |  9 +++++++--
 lib/PublicInbox/LeiSavedSearch.pm | 18 +++++++++++-------
 lib/PublicInbox/LeiUp.pm          | 25 +++++++++----------------
 lib/PublicInbox/LeiXSearch.pm     | 18 +++++++++++++++---
 6 files changed, 47 insertions(+), 32 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 2823ced8..e1e3666d 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -10,7 +10,7 @@ lei q [OPTIONS] (--stdin|-)
 
 =head1 DESCRIPTION
 
-Search for messages across the lei store and externals.
+Search for messages across the lei/store and externals.
 
 =for comment
 TODO: Give common prefixes, or at least a description/reference.
@@ -192,7 +192,9 @@ Default: fcntl,dotlock
 
 =item -n NUMBER
 
-Limit the number of matches.
+Fuzzy limit the number of matches per-local external and lei/store.
+Messages added by the L<--threads> switch do not count towards this
+limit, and there is no limit on remote externals.
 
 Default: 10000
 
diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index c13e2153..d553b187 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -144,9 +144,8 @@ sub lei_lcat {
 	$lei->ale->refresh_externals($lxs, $lei);
 	$lei->_lei_store(1);
 	my $opt = $lei->{opt};
-	my %mset_opt = map { $_ => $opt->{$_} } qw(threads limit offset);
+	my %mset_opt;
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
-	$mset_opt{limit} //= 10000;
 	$opt->{sort} //= 'relevance';
 	$mset_opt{relevance} = 1;
 	$lei->{mset_opt} = \%mset_opt;
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index d5f132f1..cb5ac8fb 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -41,6 +41,12 @@ sub _start_query { # used by "lei q" and "lei up"
 
 	# descending docid order is cheapest, MUA controls sorting order
 	$self->{mset_opt}->{relevance} //= -2 if $l2m || $opt->{threads};
+
+	my $tot = $self->{mset_opt}->{total} //= $self->{opt}->{limit} // 10000;
+	$self->{mset_opt}->{limit} = $tot > 10000 ? 10000 : $tot;
+	$self->{mset_opt}->{offset} //= 0;
+	$self->{mset_opt}->{threads} //= $opt->{threads};
+
 	if ($self->{net}) {
 		require PublicInbox::LeiAuth;
 		$self->{auth} = PublicInbox::LeiAuth->new
@@ -118,9 +124,8 @@ sub lei_q {
 	my $lxs = lxs_prepare($self) or return;
 	$self->ale->refresh_externals($lxs, $self);
 	my $opt = $self->{opt};
-	my %mset_opt = map { $_ => $opt->{$_} } qw(threads limit offset);
+	my %mset_opt;
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
-	$mset_opt{limit} //= 10000;
 	if (defined(my $sort = $opt->{'sort'})) {
 		if ($sort eq 'relevance') {
 			$mset_opt{relevance} = 1;
diff --git a/lib/PublicInbox/LeiSavedSearch.pm b/lib/PublicInbox/LeiSavedSearch.pm
index 637456e4..3e10f780 100644
--- a/lib/PublicInbox/LeiSavedSearch.pm
+++ b/lib/PublicInbox/LeiSavedSearch.pm
@@ -29,6 +29,8 @@ sub BOOL_FIELDS () {
 	qw(external local remote import-remote import-before threads)
 }
 
+sub SINGLE_FIELDS () { qw(limit dedupe output) }
+
 sub lss_dir_for ($$;$) {
 	my ($lei, $dstref, $on_fs) = @_;
 	my $pfx;
@@ -89,9 +91,9 @@ sub list {
 	} @$out
 }
 
-sub translate_dedupe ($$$) {
-	my ($self, $lei, $dd) = @_;
-	$dd //= 'content';
+sub translate_dedupe ($$) {
+	my ($self, $lei) = @_;
+	my $dd = $lei->{opt}->{dedupe} // 'content';
 	return 1 if $dd eq 'content'; # the default
 	return $self->{"-dedupe_$dd"} = 1 if ($dd eq 'oid' || $dd eq 'mid');
 	$lei->fail("--dedupe=$dd requires --no-save");
@@ -128,8 +130,7 @@ sub new { # new saved search "lei q --save"
 	File::Path::make_path($dir); # raises on error
 	$self->{-cfg} = {};
 	my $f = $self->{'-f'} = "$dir/lei.saved-search";
-	my $dd = $lei->{opt}->{dedupe};
-	translate_dedupe($self, $lei, $dd) or return;
+	translate_dedupe($self, $lei) or return;
 	open my $fh, '>', $f or return $lei->fail("open $f: $!");
 	my $sq_dst = PublicInbox::Config::squote_maybe($dst);
 	my $q = $lei->{mset_opt}->{q_raw} // die 'BUG: {q_raw} missing';
@@ -139,15 +140,14 @@ sub new { # new saved search "lei q --save"
 		$q = "\tq = ".cquote_val($q);
 	}
 	$dst = "$lei->{ovv}->{fmt}:$dst" if $dst !~ m!\Aimaps?://!i;
+	$lei->{opt}->{output} = $dst;
 	print $fh <<EOM;
 ; to refresh with new results, run: lei up $sq_dst
 ; `maxuid' and `lastresult' lines are maintained by "lei up" for optimization
 [lei]
 $q
 [lei "q"]
-	output = $dst
 EOM
-	print $fh "\tdedupe = $dd\n" if $dd;
 	for my $k (ARRAY_FIELDS) {
 		my $ary = $lei->{opt}->{$k} // next;
 		for my $x (@$ary) {
@@ -158,6 +158,10 @@ EOM
 		my $val = $lei->{opt}->{$k} // next;
 		print $fh "\t$k = ".($val ? 1 : 0)."\n";
 	}
+	for my $k (SINGLE_FIELDS) {
+		my $val = $lei->{opt}->{$k} // next;
+		print $fh "\t$k = $val\n";
+	}
 	close($fh) or return $lei->fail("close $f: $!");
 	$self->{lock_path} = "$self->{-f}.flock";
 	$self->{-ovf} = "$dir/over.sqlite3";
diff --git a/lib/PublicInbox/LeiUp.pm b/lib/PublicInbox/LeiUp.pm
index abb05d46..89cf0112 100644
--- a/lib/PublicInbox/LeiUp.pm
+++ b/lib/PublicInbox/LeiUp.pm
@@ -18,7 +18,6 @@ sub up1 ($$) {
 	my $lss = PublicInbox::LeiSavedSearch->up($lei, $out) or return;
 	my $f = $lss->{'-f'};
 	my $mset_opt = $lei->{mset_opt} = { relevance => -2 };
-	$mset_opt->{limit} = $lei->{opt}->{limit} // 10000;
 	my $q = $mset_opt->{q_raw} = $lss->{-cfg}->{'lei.q'} //
 				return $lei->fail("lei.q unset in $f");
 	my $lse = $lei->{lse} // die 'BUG: {lse} missing';
@@ -27,24 +26,18 @@ sub up1 ($$) {
 	} else {
 		$lse->query_approxidate($lse->git, $mset_opt->{qstr} = $q);
 	}
-	my $o = $lei->{opt}->{output} = $lss->{-cfg}->{'lei.q.output'} //
-		return $lei->fail("lei.q.output unset in $f");
-	ref($o) and return $lei->fail("multiple values of lei.q.output in $f");
-	if (defined(my $dd = $lss->{-cfg}->{'lei.q.dedupe'})) {
-		$lss->translate_dedupe($lei, $dd) or return;
-		$lei->{opt}->{dedupe} = $dd;
-	}
-	for my $k (qw(only include exclude)) {
+	# n.b. only a few CLI args are accepted for "up", so //= usually sets
+	for my $k ($lss->ARRAY_FIELDS) {
 		my $v = $lss->{-cfg}->get_all("lei.q.$k") // next;
-		$lei->{opt}->{$k} = $v;
+		$lei->{opt}->{$k} //= $v;
 	}
-	for my $k (qw(external local remote
-			import-remote import-before threads)) {
-		my $c = "lei.q.$k";
-		my $v = $lss->{-cfg}->{$c} // next;
-		ref($v) and return $lei->fail("multiple values of $c in $f");
-		$lei->{opt}->{$k} = $v;
+	for my $k ($lss->BOOL_FIELDS, $lss->SINGLE_FIELDS) {
+		my $v = $lss->{-cfg}->get_1('lei.q', $k) // next;
+		$lei->{opt}->{$k} //= $v;
 	}
+	my $o = $lei->{opt}->{output} // '';
+	return $lei->fail("lei.q.output unset in $f") if $o eq '';
+	$lss->translate_dedupe($lei) or return;
 	$lei->{lss} = $lss; # for LeiOverview->new and query_remote_mboxrd
 	my $lxs = $lei->lxs_prepare or return;
 	$lei->ale->refresh_externals($lxs, $lei);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 2227c2ac..584ffde9 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -110,10 +110,20 @@ sub recent {
 
 sub over {}
 
+sub _check_mset_limit ($$$) {
+	my ($lei, $desc, $mset) = @_;
+	return if defined($lei->{opt}->{limit}); # user requested limit
+	my $est = $mset->get_matches_estimated;
+	my $tot = $lei->{mset_opt}->{total};
+	$est > $tot and $lei->qerr(<<"");
+# $desc estimated matches ($est) exceeds default --limit=$tot
+
+}
+
 sub _mset_more ($$) {
 	my ($mset, $mo) = @_;
 	my $size = $mset->size;
-	$size >= $mo->{limit} && (($mo->{offset} += $size) < $mo->{limit});
+	$size >= $mo->{limit} && (($mo->{offset} += $size) < $mo->{total});
 }
 
 # $startq will EOF when do_augment is done augmenting and allow
@@ -182,7 +192,7 @@ sub query_one_mset { # for --threads and l2m w/o sort
 	my $first_ids;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
-		mset_progress($lei, $dir, $mset->size,
+		mset_progress($lei, $dir, $mo->{offset} + $mset->size,
 				$mset->get_matches_estimated);
 		wait_startq($lei); # wait for keyword updates
 		my $ids = $srch->mset_to_artnums($mset, $mo);
@@ -222,6 +232,7 @@ sub query_one_mset { # for --threads and l2m w/o sort
 			}
 		}
 	} while (_mset_more($mset, $mo));
+	_check_mset_limit($lei, $dir, $mset);
 	if ($lss && scalar(@$first_ids)) {
 		undef $stop_at;
 		my $max = $first_ids->[0];
@@ -244,7 +255,7 @@ sub query_combined_mset { # non-parallel for non-"--threads" users
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
-		mset_progress($lei, 'xsearch', $mset->size,
+		mset_progress($lei, 'xsearch', $mo->{offset} + $mset->size,
 				$mset->get_matches_estimated);
 		wait_startq($lei); # wait for keyword updates
 		for my $mitem ($mset->items) {
@@ -252,6 +263,7 @@ sub query_combined_mset { # non-parallel for non-"--threads" users
 			$each_smsg->($smsg, $mitem);
 		}
 	} while (_mset_more($mset, $mo));
+	_check_mset_limit($lei, 'xsearch', $mset);
 	undef $each_smsg; # may commit
 	$lei->{ovv}->ovv_atexit_child($lei);
 }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 0/3] lei: a few more annoyances fixed
  2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
                   ` (11 preceding siblings ...)
  2021-09-21  7:41 ` [PATCH 12/12] lei q: improve --limit behavior and progress Eric Wong
@ 2021-09-21  9:29 ` Eric Wong
  2021-09-21  9:29   ` [PATCH 1/3] t/lei-up: use '-q' to silence non-redirected test Eric Wong
                     ` (2 more replies)
  12 siblings, 3 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  9:29 UTC (permalink / raw)
  To: meta

The more I use it, the more I find wrong...

Eric Wong (3):
  t/lei-up: use '-q' to silence non-redirected test
  script/lei: handle SIGTSTP and SIGCONT
  lei: umask(077) before opening errors.log

 lib/PublicInbox/LEI.pm | 20 +++++++++++++-------
 script/lei             |  9 ++++-----
 t/lei-up.t             |  2 +-
 3 files changed, 18 insertions(+), 13 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] t/lei-up: use '-q' to silence non-redirected test
  2021-09-21  9:29 ` [PATCH 0/3] lei: a few more annoyances fixed Eric Wong
@ 2021-09-21  9:29   ` Eric Wong
  2021-09-21  9:29   ` [PATCH 2/3] script/lei: handle SIGTSTP and SIGCONT Eric Wong
  2021-09-21  9:29   ` [PATCH 3/3] lei: umask(077) before opening errors.log Eric Wong
  2 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  9:29 UTC (permalink / raw)
  To: meta

We could redirect, too, but just use -q since we don't care
for the output with run_mode => 0.
---
 t/lei-up.t | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lei-up.t b/t/lei-up.t
index 8937cfb1..fc369156 100644
--- a/t/lei-up.t
+++ b/t/lei-up.t
@@ -35,7 +35,7 @@ test_lei(sub {
 	$uc = do { local $/; <$fh> };
 	is($uc, $exp, 'uncompressed both match');
 
-	lei_ok [ 'up', "$ENV{HOME}/b", "--mua=touch $ENV{HOME}/c" ],
+	lei_ok [ qw(up -q), "$ENV{HOME}/b", "--mua=touch $ENV{HOME}/c" ],
 		undef, { run_mode => 0 };
 	ok(-f "$ENV{HOME}/c", '--mua works with single output');
 });

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/3] script/lei: handle SIGTSTP and SIGCONT
  2021-09-21  9:29 ` [PATCH 0/3] lei: a few more annoyances fixed Eric Wong
  2021-09-21  9:29   ` [PATCH 1/3] t/lei-up: use '-q' to silence non-redirected test Eric Wong
@ 2021-09-21  9:29   ` Eric Wong
  2021-09-21  9:29   ` [PATCH 3/3] lei: umask(077) before opening errors.log Eric Wong
  2 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  9:29 UTC (permalink / raw)
  To: meta

Sometimes it's useful to pause an expensive query or
refresh-mail-sync to do something else.  While lei-daemon and
lei/store can't be paused since they're shared across clients,
per-invocation WQ workers can be paused safely using the
unblockable SIGSTOP.

While we're at it, drop the ETOOMANYREFS hint since it
hasn't been a problem since we drastically reduced FD passing
early in development.
---
 lib/PublicInbox/LEI.pm | 18 ++++++++++++------
 script/lei             |  9 ++++-----
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f94bfa45..2df1f326 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1118,23 +1118,28 @@ sub dclose {
 sub event_step {
 	my ($self) = @_;
 	local %ENV = %{$self->{env}};
-	my $sock = $self->{sock};
 	local $current_lei = $self;
 	eval {
-		while (my @fds = $recv_cmd->($sock, my $buf, 4096)) {
+		my $buf;
+		while (my @fds = $recv_cmd->($self->{sock}, $buf, 4096)) {
 			if (scalar(@fds) == 1 && !defined($fds[0])) {
 				return if $! == EAGAIN;
 				next if $! == EINTR;
 				last if $! == ECONNRESET;
 				die "recvmsg: $!";
 			}
-			for my $fd (@fds) {
-				open my $rfh, '+<&=', $fd;
+			for (@fds) { open my $rfh, '+<&=', $_ }
+		}
+		if ($buf eq '') {
+			_drop_wq($self); # EOF, client disconnected
+			dclose($self);
+		} elsif ($buf =~ /\A(STOP|CONT)\z/) {
+			for my $wq (grep(defined, @$self{@WQ_KEYS})) {
+				$wq->wq_kill($buf) or $wq->wq_kill_old($buf);
 			}
+		} else {
 			die "unrecognized client signal: $buf";
 		}
-		_drop_wq($self); # EOF, client disconnected
-		dclose($self);
 	};
 	if (my $err = $@) {
 		eval { $self->fail($err) };
@@ -1146,6 +1151,7 @@ sub event_step_init {
 	my ($self) = @_;
 	my $sock = $self->{sock} or return;
 	$self->{-event_init_done} //= do { # persist til $ops done
+		$sock->blocking(0);
 		$self->SUPER::new($sock, EPOLLIN|EPOLLET);
 		$sock;
 	};
diff --git a/script/lei b/script/lei
index 591013e3..399296ba 100755
--- a/script/lei
+++ b/script/lei
@@ -106,11 +106,10 @@ open my $dh, '<', '.' or die "open(.) $!";
 my $buf = join("\0", scalar(@ARGV), @ARGV);
 while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 $buf .= "\0\0";
-my $n = $send_cmd->($sock, [0, 1, 2, fileno($dh)], $buf, MSG_EOR);
-if (!$n) {
-	die "sendmsg: $! (check RLIMIT_NOFILE)\n" if $!{ETOOMANYREFS};
-	die "sendmsg: $!\n";
-}
+$send_cmd->($sock, [0, 1, 2, fileno($dh)], $buf, MSG_EOR) or die "sendmsg: $!";
+$SIG{TSTP} = sub { $send_cmd->($sock, [], 'STOP', MSG_EOR); kill 'STOP', $$ };
+$SIG{CONT} = sub { $send_cmd->($sock, [], 'CONT', MSG_EOR) };
+
 my $x_it_code = 0;
 while (1) {
 	my (@fds) = $recv_cmd->($sock, my $buf, 4096 * 33);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/3] lei: umask(077) before opening errors.log
  2021-09-21  9:29 ` [PATCH 0/3] lei: a few more annoyances fixed Eric Wong
  2021-09-21  9:29   ` [PATCH 1/3] t/lei-up: use '-q' to silence non-redirected test Eric Wong
  2021-09-21  9:29   ` [PATCH 2/3] script/lei: handle SIGTSTP and SIGCONT Eric Wong
@ 2021-09-21  9:29   ` Eric Wong
  2 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2021-09-21  9:29 UTC (permalink / raw)
  To: meta

There's a chance some sensitive information (e.g. folder names)
can end up in errors.log, though $XDG_RUNTIME_DIR or
/tmp/lei-$UID/ will have 0700 permissions, anyways.
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2df1f326..29293e6c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1225,6 +1225,7 @@ sub lazy_start {
 	$errors_log = "$sock_dir/errors.log";
 	my $addr = pack_sockaddr_un($path);
 	my $lk = bless { lock_path => $errors_log }, 'PublicInbox::Lock';
+	umask(077) // die("umask(077): $!");
 	$lk->lock_acquire;
 	socket($listener, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
 	if ($errno == ECONNREFUSED || $errno == ENOENT) {
@@ -1236,7 +1237,6 @@ sub lazy_start {
 		$! = $errno; # allow interpolation to stringify in die
 		die "connect($path): $!";
 	}
-	umask(077) // die("umask(077): $!");
 	bind($listener, $addr) or die "bind($path): $!";
 	$lk->lock_release;
 	undef $lk;

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-09-21  9:29 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-21  7:41 [PATCH 00/12] lei: fix various annoyances Eric Wong
2021-09-21  7:41 ` [PATCH 01/12] lei inspect: convert to WQ worker Eric Wong
2021-09-21  7:41 ` [PATCH 02/12] lei inspect: support NNTP URLs Eric Wong
2021-09-21  7:41 ` [PATCH 03/12] lei_mail_sync: account for non-unique cases Eric Wong
2021-09-21  7:41 ` [PATCH 04/12] lei: simplify internal arg2folder usage Eric Wong
2021-09-21  7:41 ` [PATCH 05/12] lei lcat: use single queue for ordering Eric Wong
2021-09-21  7:41 ` [PATCH 06/12] doc: lei-security: section for WIP auth methods Eric Wong
2021-09-21  7:41 ` [PATCH 07/12] lei lcat: support NNTP URLs Eric Wong
2021-09-21  7:41 ` [PATCH 08/12] lei: various completion improvements Eric Wong
2021-09-21  7:41 ` [PATCH 09/12] lei q: show progress on >1s preparation phase Eric Wong
2021-09-21  7:41 ` [PATCH 10/12] search: drop reopen retry message Eric Wong
2021-09-21  7:41 ` [PATCH 11/12] lei q: update messages to reflect --save default Eric Wong
2021-09-21  7:41 ` [PATCH 12/12] lei q: improve --limit behavior and progress Eric Wong
2021-09-21  9:29 ` [PATCH 0/3] lei: a few more annoyances fixed Eric Wong
2021-09-21  9:29   ` [PATCH 1/3] t/lei-up: use '-q' to silence non-redirected test Eric Wong
2021-09-21  9:29   ` [PATCH 2/3] script/lei: handle SIGTSTP and SIGCONT Eric Wong
2021-09-21  9:29   ` [PATCH 3/3] lei: umask(077) before opening errors.log Eric Wong

Code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).