* [PATCH 05/11] watch: avoid unnecessary spawning on spam removals
2020-08-31 4:41 7% [PATCH 00/11] watch: fix contention w/ Maildir & NNTP Eric Wong
@ 2020-08-31 4:41 6% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-08-31 4:41 UTC
To: meta; +Cc: Eric Wong
From: Eric Wong <e@yhbt.net>
This should further mitigate lock contention problems
when -watch is configured to watch a Maildir for spam
while performing a large NNTP import.
There is now a small risk a message won't get removed if
it's in the current (uncommitted) fast-import batch, but that's
unlikely given the batch size is now only 10 messages.
If that small window is hit, flipping the \Seen flag
(e.g. marking it unread, and then read again) will trigger
another removal attempt via IMAP or Maildir.
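
The decision described above can be sketched in isolation. This is a
simplified illustration and not the code from the patch:
`removal_action`, its arguments, and the hash-backed index are all
hypothetical stand-ins, and a scrubber returning undef stands in for
both "nothing to scrub" and a rejected message.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether a spam removal needs the
# (possibly expensive) importer, or can be skipped outright.
#   $active   - true if an importer already holds the lock
#   $has_over - true if a read-only index is available
#   $exists   - coderef: read-only "is this content indexed?" check
#   $msg      - the message as seen in the spam folder
#   $scrub    - optional coderef returning a scrubbed copy or undef
sub removal_action {
    my ($active, $has_over, $exists, $msg, $scrub) = @_;

    # no cheap check is possible (or needed): go through the importer
    return 'remove' if $active || !$has_over;

    # the message is indexed as-is, so it must be removed
    return 'remove' if $exists->($msg);

    # not indexed and no filter could have altered it: nothing to do
    return 'skip' unless $scrub;

    # a filter may have stored a scrubbed variant instead
    my $scrubbed = $scrub->($msg);
    return 'skip' if defined($scrubbed) && !$exists->($scrubbed);

    # scrubbed copy is indexed, or scrubbing was inconclusive
    'remove';
}

# toy index keyed by message content (assumption for illustration)
my %indexed = (a => 1);
my $exists = sub { $indexed{$_[0]} };
print removal_action(0, 1, $exists, 'b'), "\n";
```

Only the 'remove' outcome would need to take the lock and possibly
spawn a writer; the 'skip' outcome touches nothing but the read-only
index.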
---
lib/PublicInbox/Import.pm | 3 +++
lib/PublicInbox/V2Writable.pm | 3 +++
lib/PublicInbox/Watch.pm | 31 +++++++++++++++++++++++++------
3 files changed, 31 insertions(+), 6 deletions(-)
diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index 700b4026..ee5ca2ea 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -461,6 +461,9 @@ sub init_bare {
}
}
+# true if locked and active
+sub active { !!$_[0]->{out} }
+
sub done {
my ($self) = @_;
my $w = delete $self->{out} or return;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index f2288904..553dd839 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -655,6 +655,9 @@ sub checkpoint ($;$) {
# public
sub barrier { checkpoint($_[0], 1) };
+# true if locked and active
+sub active { !!$_[0]->{im} }
+
# public
sub done {
my ($self) = @_;
diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm
index 5f786139..0bb92d0a 100644
--- a/lib/PublicInbox/Watch.pm
+++ b/lib/PublicInbox/Watch.pm
@@ -134,15 +134,34 @@ sub _done_for_now {
sub remove_eml_i { # each_inbox callback
my ($ibx, $arg) = @_;
my ($self, $eml, $loc) = @$arg;
+
eval {
- my $im = _importer_for($self, $ibx);
- $im->remove($eml, 'spam');
- if (my $scrub = $ibx->filter($im)) {
- my $scrubbed = $scrub->scrub($eml, 1);
- if ($scrubbed && $scrubbed != REJECT) {
- $im->remove($scrubbed, 'spam');
+ # try to avoid taking a lock or unnecessary spawning
+ my $im = $self->{importers}->{"$ibx"};
+ my $scrubbed;
+ if ((!$im || !$im->active) && $ibx->over) {
+ if (content_exists($ibx, $eml)) {
+ # continue
+ } elsif (my $scrub = $ibx->filter($im)) {
+ $scrubbed = $scrub->scrub($eml, 1);
+ if ($scrubbed && $scrubbed != REJECT &&
+ !content_exists($ibx, $scrubbed)) {
+ return;
+ }
+ } else {
+ return;
}
}
+
+ $im //= _importer_for($self, $ibx); # may spawn fast-import
+ $im->remove($eml, 'spam');
+ $scrubbed //= do {
+ my $scrub = $ibx->filter($im);
+ $scrub ? $scrub->scrub($eml, 1) : undef;
+ };
+ if ($scrubbed && $scrubbed != REJECT) {
+ $im->remove($scrubbed, 'spam');
+ }
};
if ($@) {
warn "error removing spam at: $loc from $ibx->{name}: $@\n";
* [PATCH 00/11] watch: fix contention w/ Maildir & NNTP
@ 2020-08-31 4:41 7% Eric Wong
2020-08-31 4:41 6% ` [PATCH 05/11] watch: avoid unnecessary spawning on spam removals Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2020-08-31 4:41 UTC
To: meta
Here's a bunch of fixes to improve watch performance when
both Maildirs and NNTP are being watched (possibly on the same
inbox, or if `watchspam' is configured for spam removals).
Wakeups are reduced, and inbox.lock contention is minimized by
using read-only ->over to check for `watchspam' removals.
These affect IMAP, too; but I've been mainly using NNTP.
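
Avoiding the lock depends on knowing whether a writer is already
running. A minimal sketch of that kind of predicate follows; the
`ToyWriter` class and its `begin` method are hypothetical and not part
of public-inbox, and only the `!!` idiom and the delete-on-done
pattern mirror what the series uses.

```perl
use strict;
use warnings;

package ToyWriter; # hypothetical, for illustration only

sub new { bless {}, shift }

# pretend to take the lock and start a writer process
sub begin { $_[0]->{out} = 'handle'; $_[0] }

# true only while a handle is held; !! normalizes to a boolean
sub active { !!$_[0]->{out} }

# release the handle; returns early if already idle
sub done {
    my ($self) = @_;
    delete $self->{out} or return;
    1;
}

package main;

my $w = ToyWriter->new;
print "idle\n" unless $w->active;
$w->begin;
print "busy\n" if $w->active;
$w->done;
print "idle again\n" unless $w->active;
```

A caller can consult `active` before deciding whether a read-only
check is worthwhile, since an already-running writer costs nothing
extra to reuse.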
Eric Wong (11):
watch: limit batch size of NNTP and IMAP workers, too
watchmaildir: use v5.10.1, drop warnings
rename WatchMaildir => Watch
watch: log signal activities to STDERR
watch: avoid unnecessary spawning on spam removals
watch: block signals before fork on non-signalfd/kevent systems
watch: comments and tiny cleanups
ds: avoid excessive queueing when reaping PIDs
watch: use EOFpipe to reduce dwaitpid wakeups
ds: avoid unnecessary timer for waitpid
replace ParentPipe with EOFpipe
MANIFEST | 4 +-
lib/PublicInbox/DS.pm | 38 +++---
lib/PublicInbox/Daemon.pm | 6 +-
lib/PublicInbox/EOFpipe.pm | 24 ++++
lib/PublicInbox/Import.pm | 3 +
lib/PublicInbox/ParentPipe.pm | 23 ----
lib/PublicInbox/V2Writable.pm | 3 +
lib/PublicInbox/{WatchMaildir.pm => Watch.pm} | 111 +++++++++++++-----
script/public-inbox-watch | 34 ++++--
t/imapd.t | 2 +-
t/nntpd.t | 2 +-
t/watch_filter_rubylang.t | 4 +-
t/watch_imap.t | 4 +-
t/watch_maildir.t | 18 +--
t/watch_maildir_v2.t | 22 ++--
t/watch_multiple_headers.t | 4 +-
t/watch_nntp.t | 4 +-
17 files changed, 190 insertions(+), 116 deletions(-)
create mode 100644 lib/PublicInbox/EOFpipe.pm
delete mode 100644 lib/PublicInbox/ParentPipe.pm
rename lib/PublicInbox/{WatchMaildir.pm => Watch.pm} (92%)
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git