user/dev discussion of public-inbox itself
* [PATCH] filter/rubylang: do not set altid on spam training
@ 2018-04-19 22:42 Eric Wong
From: Eric Wong @ 2018-04-19 22:42 UTC
  To: meta

I suppose it's a bug or inconsistency that altid is write-only
and its deletions do not get reflected.  But for now, we
do not set it when training spam so there's no window where
an invalid NNTP article number shows up.

This should solve the problem where there are massive gaps
in messages left by spam training for ruby groups:

	https://public-inbox.org/meta/20180307093754.GA27748@dcvr/
---
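
A minimal sketch of the idea, not the code from this patch: the
package SketchFilter, its altid_seen list, and the message hash are
hypothetical stand-ins, assuming only that scrub() takes a third
$for_remove argument and skips the altid write when it is true.

```perl
#!/usr/bin/perl
# Hypothetical illustration only; SketchFilter is not part of public-inbox.
use strict;
use warnings;

package SketchFilter;

sub new {
	my ($class, %opt) = @_;
	# altid_seen stands in for the write-only altid store (assumption)
	return bless { altid => $opt{altid}, altid_seen => [] }, $class;
}

# same argument order as the patched scrub: ($self, $mime, $for_remove)
sub scrub {
	my ($self, $msg, $for_remove) = @_;
	if ($self->{altid} && !$for_remove) {
		# normal delivery: record the alternate ID
		push @{$self->{altid_seen}}, $msg->{mid};
	}
	return $msg; # the message is handed back in both cases
}

package main;

my $filter = SketchFilter->new(altid => 1);
my $msg = { mid => 'hypothetical@example.com' };

$filter->scrub($msg);    # delivery path: altid is recorded
$filter->scrub($msg, 1); # spam training path: altid is left alone

printf "altid entries: %d\n", scalar @{$filter->{altid_seen}};
```

Callers that omit the third argument keep the old behavior, since an
undefined $for_remove is false.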
 lib/PublicInbox/Filter/Base.pm     | 5 +++--
 lib/PublicInbox/Filter/RubyLang.pm | 4 ++--
 lib/PublicInbox/WatchMaildir.pm    | 3 ++-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/Filter/Base.pm b/lib/PublicInbox/Filter/Base.pm
index 5d07013..638b627 100644
--- a/lib/PublicInbox/Filter/Base.pm
+++ b/lib/PublicInbox/Filter/Base.pm
@@ -66,9 +66,10 @@ sub reject ($$) {
 sub err ($) { $_[0]->{err} }
 
 # by default, scrub is a no-op, see PublicInbox::Filter::Vger::scrub
-# for an example of the override
+# for an example of the override.  The $for_remove arg is set to
+# disable altid setting for spam removal.
 sub scrub {
-	my ($self, $mime) = @_;
+	my ($self, $mime, $for_remove) = @_;
 	$self->ACCEPT($mime);
 }
 
diff --git a/lib/PublicInbox/Filter/RubyLang.pm b/lib/PublicInbox/Filter/RubyLang.pm
index cb69e38..a43d67a 100644
--- a/lib/PublicInbox/Filter/RubyLang.pm
+++ b/lib/PublicInbox/Filter/RubyLang.pm
@@ -30,7 +30,7 @@ sub new {
 }
 
 sub scrub {
-	my ($self, $mime) = @_;
+	my ($self, $mime, $for_remove) = @_;
 	# no msg_iter here, that is only for read-only access
 	$mime->walk_parts(sub {
 		my ($part) = $_[0];
@@ -43,7 +43,7 @@ sub scrub {
 		}
 	});
 	my $altid = $self->{-altid};
-	if ($altid) {
+	if ($altid && !$for_remove) {
 		my $hdr = $mime->header_obj;
 		my $mids = mids($hdr);
 		return $self->REJECT('Message-ID missing') unless (@$mids);
diff --git a/lib/PublicInbox/WatchMaildir.pm b/lib/PublicInbox/WatchMaildir.pm
index 7ee29da..10dc618 100644
--- a/lib/PublicInbox/WatchMaildir.pm
+++ b/lib/PublicInbox/WatchMaildir.pm
@@ -126,7 +126,8 @@ sub _remove_spam {
 			my $im = _importer_for($self, $ibx);
 			$im->remove($mime, 'spam');
 			if (my $scrub = $ibx->filter) {
-				my $scrubbed = $scrub->scrub($mime) or return;
+				my $scrubbed = $scrub->scrub($mime, 1);
+				$scrubbed or return;
 				$scrubbed == REJECT() and return;
 				$im->remove($scrubbed, 'spam');
 			}
-- 
EW


Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git