* [PATCH 0/4] extindex tweaks and small fixes
@ 2021-10-17 9:52 Eric Wong
2021-10-17 9:52 ` [PATCH 1/4] extindex: use localtime to display lock time Eric Wong
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Eric Wong @ 2021-10-17 9:52 UTC
To: meta
Probably nothing of note, but this series adds some extra
safety and eliminates some redundant work.
Eric Wong (4):
extindex: use localtime to display lock time
extindex: retry sync_inbox before reindex
extindex: guard against false mismatch unrefs
extindex: better locations for {quit} checks
lib/PublicInbox/ExtSearchIdx.pm | 31 ++++++++++++++++++++++++-------
1 file changed, 24 insertions(+), 7 deletions(-)
* [PATCH 1/4] extindex: use localtime to display lock time
From: Eric Wong @ 2021-10-17 9:52 UTC
To: meta
Since this is intended for use on the command-line,
include TZ offset in time and try to shorten the
message a bit so it wraps less on a terminal.
---
lib/PublicInbox/ExtSearchIdx.pm | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 69d048fb7342..67d720368922 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -719,11 +719,12 @@ sub eidxq_lock_acquire ($) {
return $locked if $locked eq $cur;
}
my ($pid, $time, $euid, $ident) = split(/-/, $cur, 4);
- my $t = strftime('%Y-%m-%d %k:%M:%S', gmtime($time));
+ my $t = strftime('%Y-%m-%d %k:%M %z', localtime($time));
+ local $self->{current_info} = 'eidxq';
if ($euid == $> && $ident eq host_ident) {
if (kill(0, $pid)) {
warn <<EOM; return;
-I: PID:$pid (re)indexing Xapian since $t, it will continue our work
+I: PID:$pid (re)indexing since $t, it will continue our work
EOM
}
if ($!{ESRCH}) {
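
To illustrate the formatting change in this patch, here is a minimal standalone sketch comparing the old UTC-style timestamp with the new local-time one that carries a TZ offset. The helper name `fmt_lock_time` is hypothetical and not part of public-inbox, and the sketch uses `%H` instead of the patch's `%k` only to avoid depending on a non-standard conversion.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper (not in the patch): format one epoch time both ways.
sub fmt_lock_time {
	my ($epoch) = @_;
	# old style: UTC with seconds, no indication of the time zone
	my $utc = strftime('%Y-%m-%d %H:%M:%S', gmtime($epoch));
	# new style: local time, seconds dropped, numeric TZ offset appended
	my $loc = strftime('%Y-%m-%d %H:%M %z', localtime($epoch));
	($utc, $loc);
}

my ($utc, $loc) = fmt_lock_time(time);
print "old style (UTC, no offset): $utc\n";
print "new style (local + offset): $loc\n";
```

The local-time form is unambiguous to someone reading it on a terminal because the offset is part of the string, and it is slightly shorter without the seconds.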
* [PATCH 2/4] extindex: retry sync_inbox before reindex
From: Eric Wong @ 2021-10-17 9:52 UTC
To: meta
Ensure the num highwater mark of the target inbox is stable
before using it. Otherwise we may end up repeating work
done to index a message.
---
lib/PublicInbox/ExtSearchIdx.pm | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 67d720368922..daff656d1ac5 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -859,14 +859,20 @@ sub _reindex_check_ibx ($$$) {
my $slice = 10000;
my $opt = { limit => $slice };
my ($beg, $end) = (1, $slice);
- my $err = sync_inbox($self, $sync, $ibx) and return;
- my $max = $ibx->mm->num_highwater;
+ my $ekey = $ibx->eidx_key;
+ my ($max, $max0);
+ do {
+ $max0 = $ibx->mm->num_highwater;
+ sync_inbox($self, $sync, $ibx) and return; # warned
+ $max = $ibx->mm->num_highwater;
+ return if $sync->{quit};
+ } while ($max > $max0 &&
+ warn("# $ekey moved $max0..$max, resyncing..\n"));
$end = $max if $end > $max;
# first, check if we missed any messages in target $ibx
my $msgs;
my $pr = $sync->{-opt}->{-progress};
- my $ekey = $ibx->eidx_key;
local $sync->{-regen_fmt} = "$ekey checking %u/$max\n";
${$sync->{nr}} = 0;
my $fast = $sync->{-opt}->{fast};
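
The retry loop in this patch can be sketched in isolation as follows. This is only an illustration under stated assumptions: `sync_until_stable` and the two callbacks are hypothetical stand-ins for `$ibx->mm->num_highwater` and `sync_inbox`, and the inbox growth is simulated with a plain array.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: $get_max returns the current highwater mark,
# $sync performs one sync pass and returns true on error.
sub sync_until_stable {
	my ($get_max, $sync) = @_;
	my ($max, $max0);
	do {
		$max0 = $get_max->();   # mark before syncing
		$sync->() and return;   # give up on error
		$max = $get_max->();    # mark after syncing
	} while ($max > $max0);         # inbox grew meanwhile, go again
	$max;
}

# Simulate an inbox that receives new messages during the first two syncs.
my @growth = (5, 3, 0);
my $hw = 100;
my $nsync = 0;
my $stable = sync_until_stable(sub { $hw }, sub {
	$hw += shift(@growth) // 0;
	$nsync++;
	0; # no error
});
print "stable at $stable after $nsync syncs\n";
```

The loop only exits once a full sync pass completes without the mark moving, so the value used afterwards covers everything that was synced.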
* [PATCH 3/4] extindex: guard against false mismatch unrefs
From: Eric Wong @ 2021-10-17 9:52 UTC
To: meta
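I'm not sure if this is a bug or not (or it could be
an old bug in the v2 indexing code). Before unreferencing
a doc flagged as a mismatch, re-read the message from the
inbox's over index; if the blob still matches, warn and
skip the unref.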
---
lib/PublicInbox/ExtSearchIdx.pm | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index daff656d1ac5..cb5256a2c562 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -921,6 +921,16 @@ ibx_id = ? AND xnum >= ? AND xnum <= ?
my ($xnum, $hex) = unpack('JH*', $k);
my $bin = pack('H*', $hex);
my $exp = $mismatch{$xnum};
+ if (defined $exp) {
+ my $smsg = $ibx->over->get_art($xnum) // next;
+ # $xnum may be expired by another process
+ if ($smsg->{blob} eq $hex) {
+ warn <<"";
+BUG: (non-fatal) $ekey #$xnum $smsg->{blob} still matches (old exp: $exp)
+
+ next;
+ } # else: continue to unref
+ }
my $m = defined($exp) ? "mismatch (!= $exp)" : 'stale';
warn("# $xnum:$hex (#@$docids): $m\n");
for my $i (@$docids) {
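
The decision this guard makes can be sketched on its own. This is a simplified model, not the actual code: `should_unref` is a hypothetical name, and a plain hash mapping article numbers to blob OIDs stands in for `$ibx->over->get_art`.

```perl
use strict;
use warnings;

# Hypothetical model: $over maps article number => current blob OID (hex),
# standing in for what the inbox's over index would report.
sub should_unref {
	my ($over, $xnum, $hex, $exp) = @_;
	return 1 unless defined $exp;         # stale, not a mismatch: unref as before
	my $cur = $over->{$xnum} // return 0; # article gone (e.g. expired): skip
	return 0 if $cur eq $hex;             # blob still matches: false mismatch
	1;                                    # genuine mismatch: continue to unref
}

my %over = (7 => 'deadbeef', 8 => 'cafef00d');
print should_unref(\%over, 7, 'deadbeef', 'abad1dea') ? "unref\n" : "keep\n";
print should_unref(\%over, 8, 'deadbeef', 'abad1dea') ? "unref\n" : "keep\n";
```

Re-checking against the inbox right before the unref means a mismatch recorded earlier in the pass is only acted on if it still holds.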
* [PATCH 4/4] extindex: better locations for {quit} checks
From: Eric Wong @ 2021-10-17 9:52 UTC
To: meta
Check for graceful termination at every message since it's
a fairly inexpensive check.
---
lib/PublicInbox/ExtSearchIdx.pm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index cb5256a2c562..f479cf9e1a3f 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -908,10 +908,9 @@ ibx_id = ? AND xnum >= ? AND xnum <= ?
for my $num (@$docids) {
$self->{oidx}->eidxq_add($num);
}
- return if $sync->{quit};
}
+ return if $sync->{quit};
}
- return if $sync->{quit};
next unless scalar keys %x3m;
$self->git->async_wait_all; # wait for reindex_unseen
@@ -936,6 +935,7 @@ BUG: (non-fatal) $ekey #$xnum $smsg->{blob} still matches (old exp: $exp)
for my $i (@$docids) {
_unref_doc($sync, $i, $ibx, $xnum, $bin);
}
+ return if $sync->{quit};
}
}
defined($hi) and ($hi < $max) and
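
A per-message quit check of the kind described here can be sketched as below. The names are hypothetical, and the sketch assumes the `{quit}` flag is set asynchronously (for example by a signal handler); here the callback sets it to simulate that.

```perl
use strict;
use warnings;

# Hypothetical loop: process messages one at a time, checking the quit
# flag after each one. The check is a single hash lookup, so doing it
# per message costs very little.
sub process_all {
	my ($sync, $msgs, $cb) = @_;
	my $done = 0;
	for my $m (@$msgs) {
		$cb->($m);
		$done++;
		return $done if $sync->{quit}; # stop promptly once asked to
	}
	$done;
}

my $sync = {};
my $n = process_all($sync, [ 1 .. 10 ], sub {
	my ($m) = @_;
	$sync->{quit} = 1 if $m == 3; # simulate a termination request
});
print "processed $n of 10 before quitting\n";
```

Checking at every message bounds the delay between a termination request and the actual exit to the time needed for one message.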
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git