* [PATCH 0/4] extindex: more fixes and usability things
From: Eric Wong @ 2020-12-26 10:16 UTC (permalink / raw)
To: meta
--watch seems nice, and "--watch --all" (or just "--all")
without having to specify the pathname of the extindex
is nice, too.
It still needs to write extindex.all.topdir if none is
configured, and a section 1 manpage is still missing...
Eric Wong (4):
extindex: various --watch signal handling fixes
extindex: enable autoflush on STDOUT/STDERR
extindex: add undocumented --no-scan switch
extindex: allow using --all without EXTINDEX_DIR
lib/PublicInbox/ExtSearchIdx.pm | 18 ++++++++++++++----
script/public-inbox-extindex | 21 +++++++++++++++------
2 files changed, 29 insertions(+), 10 deletions(-)
* [PATCH 1/4] extindex: various --watch signal handling fixes
From: Eric Wong @ 2020-12-26 10:16 UTC (permalink / raw)
To: meta
We need to clobber the SIGUSR1 resync queue on SIGHUP to
invalidate old inbox objects. Furthermore, the lengthy
initial scan needs to ignore signals intended for the
event loop to avoid unexpected behavior. Finally, add
some progress output to keep users at the terminal
informed.
---
lib/PublicInbox/ExtSearchIdx.pm | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 53ff2ca1..778154a5 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -1008,6 +1008,7 @@ sub eidx_reload { # -extindex --watch SIGHUP handler
if ($self->{cfg}) {
my $pr = $self->{-watch_sync}->{-opt}->{-progress};
$pr->('reloading ...') if $pr;
+ delete $self->{-resync_queue};
@{$self->{ibx_list}} = ();
%{$self->{ibx_map}} = ();
delete $self->{-watch_sync}->{id2pos};
@@ -1043,6 +1044,10 @@ sub event_step { # PublicInbox::DS::requeue callback
sub eidx_watch { # public-inbox-extindex --watch main loop
my ($self, $opt) = @_;
+ local %SIG = %SIG;
+ for my $sig (qw(HUP USR1 TSTP QUIT INT TERM)) {
+ $SIG{$sig} = sub { warn "SIG$sig ignored while scanning\n" };
+ }
require PublicInbox::InboxIdle;
require PublicInbox::DS;
require PublicInbox::Syscall;
@@ -1052,6 +1057,8 @@ sub eidx_watch { # public-inbox-extindex --watch main loop
$idler->watch_inbox($_) for @{$self->{ibx_list}};
}
$_->subscribe_unlock(__PACKAGE__, $self) for @{$self->{ibx_list}};
+ my $pr = $opt->{-progress};
+ $pr->("performing initial scan ...\n") if $pr;
my $sync = eidx_sync($self, $opt); # initial sync
return if $sync->{quit};
my $oldset = PublicInbox::Sigfd::block_signals();
@@ -1067,7 +1074,7 @@ sub eidx_watch { # public-inbox-extindex --watch main loop
$sig->{QUIT} = $sig->{INT} = $sig->{TERM} = $quit;
my $sigfd = PublicInbox::Sigfd->new($sig,
$PublicInbox::Syscall::SFD_NONBLOCK);
- local %SIG = (%SIG, %$sig) if !$sigfd;
+ %SIG = (%SIG, %$sig) if !$sigfd;
local $self->{-watch_sync} = $sync; # for ->on_inbox_unlock
if (!$sigfd) {
# wake up every second to accept signals if we don't
@@ -1076,6 +1083,7 @@ sub eidx_watch { # public-inbox-extindex --watch main loop
PublicInbox::DS->SetLoopTimeout(1000);
}
PublicInbox::DS->SetPostLoopCallback(sub { !$sync->{quit} });
+ $pr->("initial scan complete, entering event loop\n") if $pr;
PublicInbox::DS->EventLoop; # calls InboxIdle->event_step
done($self);
}
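The handler scoping in this patch can be sketched in isolation. This is a minimal illustration, not the ExtSearchIdx.pm code: scan_then_loop and @log are hypothetical names, and signals are simulated by calling the handlers directly so the result does not depend on delivery timing. It shows why a single "local %SIG = %SIG" at the top of the sub is enough: later plain assignments to %SIG inside the sub are still undone on return.

```perl
use strict;
use warnings;

my @log; # records which handler saw each (simulated) signal

# Hypothetical stand-in for eidx_watch: run $scan with temporary handlers,
# then switch to the handler intended for the event loop.
sub scan_then_loop {
    my ($scan) = @_;
    local %SIG = %SIG; # every change below is undone when this sub returns
    for my $sig (qw(HUP USR1)) {
        $SIG{$sig} = sub { push @log, "SIG$sig ignored while scanning" };
    }
    $scan->();
    # scan finished: a plain assignment suffices, %SIG is already localized
    $SIG{HUP} = sub { push @log, 'reload requested' };
    $SIG{HUP}->('HUP'); # simulated delivery inside the "event loop"
}

$SIG{HUP} = 'DEFAULT';
scan_then_loop(sub { $SIG{HUP}->('HUP') }); # simulated delivery mid-scan
my $restored = $SIG{HUP}; # the caller's setting is back in place
```

With real signals the same structure applies; only the direct handler calls would be replaced by actual delivery from the kernel.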
* [PATCH 2/4] extindex: enable autoflush on STDOUT/STDERR
From: Eric Wong @ 2020-12-26 10:16 UTC (permalink / raw)
To: meta
With --watch, the output may be redirected to a pipe or socket
which Perl may decide to buffer. Ensure Perl doesn't buffer
these outputs since they can provide real-time status updates
in response to signals or FS activity.
---
script/public-inbox-extindex | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/script/public-inbox-extindex b/script/public-inbox-extindex
index 607baa3e..17986f60 100644
--- a/script/public-inbox-extindex
+++ b/script/public-inbox-extindex
@@ -33,7 +33,9 @@ GetOptions($opt, qw(verbose|v+ reindex rethread compact|c+ jobs|j=i
or die $help;
if ($opt->{help}) { print $help; exit 0 };
die "--jobs must be >= 0\n" if defined $opt->{jobs} && $opt->{jobs} < 0;
-
+require IO::Handle;
+STDOUT->autoflush(1);
+STDERR->autoflush(1);
# require lazily to speed up --help
my $eidx_dir = shift(@ARGV) // die "E: $help";
local $SIG{USR1} = 'IGNORE'; # to be overridden in eidx_sync
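To see the effect of autoflush on a pipe, here is a small sketch that is not taken from the script: the pipe and the message text are made up for illustration. With autoflush enabled on the write end, the line is readable from the other end immediately; without it, the read below could block until the buffer fills or the handle is closed.

```perl
use strict;
use warnings;
use IO::Handle;

pipe(my $r, my $w) or die "pipe: $!";
$w->autoflush(1); # same call the patch makes on STDOUT and STDERR

print $w "reloading ...\n"; # reaches the pipe right away, no explicit flush

my $line = <$r>; # returns without waiting for close($w)
```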
* [PATCH 3/4] extindex: add undocumented --no-scan switch
From: Eric Wong @ 2020-12-26 10:16 UTC (permalink / raw)
To: meta
This makes diagnosing --watch problems easier when there are
50K inboxes by avoiding the lengthy scan (which is the reason
--watch exists in the first place).
---
lib/PublicInbox/ExtSearchIdx.pm | 8 +++++---
script/public-inbox-extindex | 4 ++--
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 778154a5..07e64698 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -881,9 +881,11 @@ sub eidx_sync { # main entry point
}
# don't use $_ here, it'll get clobbered by reindex_checkpoint
- for my $ibx (@{$self->{ibx_list}}) {
- last if $sync->{quit};
- sync_inbox($self, $sync, $ibx);
+ if ($opt->{scan} // 1) {
+ for my $ibx (@{$self->{ibx_list}}) {
+ last if $sync->{quit};
+ sync_inbox($self, $sync, $ibx);
+ }
}
$self->{oidx}->rethread_done($opt) unless $sync->{quit};
eidxq_process($self, $sync) unless $sync->{quit};
diff --git a/script/public-inbox-extindex b/script/public-inbox-extindex
index 17986f60..f4ffda4b 100644
--- a/script/public-inbox-extindex
+++ b/script/public-inbox-extindex
@@ -23,12 +23,12 @@ usage: public-inbox-extindex [options] EXTINDEX_DIR [INBOX_DIR]
BYTES may use `k', `m', and `g' suffixes (e.g. `10m' for 10 megabytes)
See public-inbox-extindex(1) man page for full documentation.
EOF
-my $opt = { quiet => -1, compact => 0, max_size => undef, fsync => 1 };
+my $opt = { quiet => -1, compact => 0, fsync => 1, scan => 1 };
GetOptions($opt, qw(verbose|v+ reindex rethread compact|c+ jobs|j=i
fsync|sync!
indexlevel|index-level|L=s max_size|max-size=s
batch_size|batch-size=s
- gc commit-interval=i watch
+ gc commit-interval=i watch scan!
all help|h))
or die $help;
if ($opt->{help}) { print $help; exit 0 };
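For reference, the "scan!" spec is a Getopt::Long negatable option, so "--no-scan" stores 0 and "--scan" stores 1. A minimal sketch of how that combines with the "// 1" default in eidx_sync follows; parse_scan is a hypothetical helper and GetOptionsFromArray is used only to keep the example self-contained.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: returns the effective "scan" setting for @argv.
sub parse_scan {
    my (@argv) = @_;
    my $opt = {};
    GetOptionsFromArray(\@argv, $opt, qw(watch scan!))
        or die "bad options\n";
    return $opt->{scan} // 1; # scan unless explicitly disabled
}

my $default  = parse_scan('--watch');
my $disabled = parse_scan('--watch', '--no-scan');
my $explicit = parse_scan('--scan');
```

The "// 1" fallback means callers that never set the key (anything other than the script) keep scanning as before.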
* [PATCH 4/4] extindex: allow using --all without EXTINDEX_DIR
From: Eric Wong @ 2020-12-26 10:16 UTC (permalink / raw)
To: meta
If "--all" is specified to index all inboxes, implicitly choose
the configured [extindex "all"] external index since "--all" is
incompatible with specifying inbox directories on the
command-line.
---
script/public-inbox-extindex | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/script/public-inbox-extindex b/script/public-inbox-extindex
index f4ffda4b..5f27988f 100644
--- a/script/public-inbox-extindex
+++ b/script/public-inbox-extindex
@@ -6,7 +6,7 @@ use strict;
use v5.10.1;
use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
my $help = <<EOF; # the following should fit w/o scrolling in 80x24 term:
-usage: public-inbox-extindex [options] EXTINDEX_DIR [INBOX_DIR]
+usage: public-inbox-extindex [options] [EXTINDEX_DIR] [INBOX_DIR...]
Create and update external (detached) search indices
@@ -36,11 +36,18 @@ die "--jobs must be >= 0\n" if defined $opt->{jobs} && $opt->{jobs} < 0;
require IO::Handle;
STDOUT->autoflush(1);
STDERR->autoflush(1);
-# require lazily to speed up --help
-my $eidx_dir = shift(@ARGV) // die "E: $help";
local $SIG{USR1} = 'IGNORE'; # to be overridden in eidx_sync
+# require lazily to speed up --help
require PublicInbox::Admin;
my $cfg = PublicInbox::Config->new;
+my $eidx_dir = shift(@ARGV);
+unless (defined $eidx_dir) {
+ if ($opt->{all} && $cfg->ALL) {
+ $eidx_dir = $cfg->ALL->{topdir};
+ } else {
+ die "E: $help";
+ }
+}
my @ibxs;
if ($opt->{gc}) {
die "E: inbox paths must not be specified with --gc\n" if @ARGV;
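The fallback logic can be sketched without PublicInbox::Config. pick_eidx_dir is a hypothetical helper, and $all_topdir stands in for whatever $cfg->ALL->{topdir} would return; the check on it is a simplification, since the script tests $cfg->ALL itself.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the fallback in the script.
# $all_topdir is assumed to be undef when no [extindex "all"] is configured.
sub pick_eidx_dir {
    my ($argv, $opt, $all_topdir) = @_;
    my $dir = shift @$argv;
    return $dir if defined $dir;          # explicit EXTINDEX_DIR wins
    return $all_topdir
        if $opt->{all} && defined $all_topdir; # implicit via --all
    die "E: EXTINDEX_DIR required\n";
}

my $explicit = pick_eidx_dir(['/path/to/eidx'], {}, undef);
my $implicit = pick_eidx_dir([], { all => 1 }, '/path/to/all');
my $err = eval { pick_eidx_dir([], {}, '/path/to/all'); 1 } ? '' : $@;
```

The paths above are placeholders; without --all, a configured "all" extindex is deliberately not picked up.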
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git