From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 06/14] mda: hoist out List-ID handling and reuse in -learn
Date: Mon, 28 Oct 2019 10:45:20 +0000
Message-ID: <20191028104528.10140-7-e@80x24.org>
In-Reply-To: <20191028104528.10140-1-e@80x24.org>
public-inbox-learn can now inject false-positive ham into an inbox
by matching the List-ID header, the same way public-inbox-mda does.
---
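A minimal standalone sketch of the List-ID matching that this patch
hoists into PublicInbox::MDA, for readers who want to try the regexp
outside of public-inbox.  It assumes a plain header string instead of
an Email::Simple object, and the %inbox_by_list_id hash and the
inbox_for_list_id_value sub are hypothetical stand-ins for
$config->lookup_list_id and the new method:

```perl
use strict;
use warnings;

# Hypothetical stand-in for $config->lookup_list_id, which returns a
# PublicInbox::Inbox object in the real code; a name suffices here.
my %inbox_by_list_id = ('meta.public-inbox.org' => 'meta');

# Extract the identifier between angle brackets with the same regexp
# as the patch, then look it up.  Returns undef when nothing matches.
sub inbox_for_list_id_value {
	my ($list_id) = @_;
	if (defined $list_id && $list_id =~ /<[ \t]*(.+)?[ \t]*>/ &&
			defined $1) {
		return $inbox_by_list_id{$1};
	}
	return;
}

my $hdr = 'public-inbox meta <meta.public-inbox.org>';
my $name = inbox_for_list_id_value($hdr);
print defined $name ? "matched: $name\n" : "no match\n";
```

Since the capture group is greedy, whitespace after "<" is skipped
but whitespace before ">" ends up in the captured value, so a header
such as "<meta.public-inbox.org >" does not match in this sketch.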
 lib/PublicInbox/MDA.pm    | 15 +++++++++++++++
 script/public-inbox-learn |  8 +++++++-
 script/public-inbox-mda   |  5 +----
 3 files changed, 23 insertions(+), 5 deletions(-)
 mode change 100755 => 100644 script/public-inbox-learn

diff --git a/lib/PublicInbox/MDA.pm b/lib/PublicInbox/MDA.pm
index 9cafda13..ce2c870f 100644
--- a/lib/PublicInbox/MDA.pm
+++ b/lib/PublicInbox/MDA.pm
@@ -83,4 +83,19 @@ sub set_list_headers {
 	}
 }
 
+# TODO: deal with multiple List-ID headers?
+sub inbox_for_list_id ($$) {
+	my ($klass, $config, $simple) = @_;
+
+	# newer Email::Simple allows header_raw, as does Email::MIME:
+	my $list_id = $simple->can('header_raw') ?
+		$simple->header_raw('List-Id') :
+		$simple->header('List-Id');
+	my $ibx;
+	if (defined $list_id && $list_id =~ /<[ \t]*(.+)?[ \t]*>/) {
+		$ibx = $config->lookup_list_id($1);
+	}
+	$ibx;
+}
+
 1;
diff --git a/script/public-inbox-learn b/script/public-inbox-learn
old mode 100755
new mode 100644
index 56739f88..79f3ead5
--- a/script/public-inbox-learn
+++ b/script/public-inbox-learn
@@ -77,7 +77,7 @@ if ($train eq 'spam') {
 		$im->done;
 	});
 } else {
-	require PublicInbox::MDA if $train eq "ham";
+	require PublicInbox::MDA;
 
 	# get all recipients
 	my %dests; # address => <PublicInbox::Inbox|0(false)>
@@ -89,10 +89,16 @@ if ($train eq 'spam') {
 	}
 
 	# n.b. message may be cross-posted to multiple public-inboxes
+	my %seen;
 	while (my ($addr, $ibx) = each %dests) {
 		next unless ref($ibx); # $ibx may be 0
+		next if $seen{"$ibx"}++;
 		remove_or_add($ibx, $train, $addr);
 	}
+	my $ibx = PublicInbox::MDA->inbox_for_list_id($pi_config, $mime);
+	if ($ibx && !$seen{"$ibx"}) {
+		remove_or_add($ibx, $train, $ibx->{-primary_address});
+	}
 }
 
 if ($err) {
diff --git a/script/public-inbox-mda b/script/public-inbox-mda
index 584218b5..3ff318c9 100755
--- a/script/public-inbox-mda
+++ b/script/public-inbox-mda
@@ -43,10 +43,7 @@ if (defined $recipient) {
 	$dst = $config->lookup($recipient); # first check
 }
 if (!defined $dst) {
-	my $list_id = $simple->header('List-Id');
-	if (defined $list_id && $list_id =~ /<[ \t]*(.+)?[ \t]*>/) {
-		$dst = $config->lookup_list_id($1);
-	}
+	$dst = PublicInbox::MDA->inbox_for_list_id($config, $simple);
 	if (!defined $dst && !defined $recipient) {
 		die "ORIGINAL_RECIPIENT not defined in ENV\n";
 	}
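
The %seen bookkeeping in the -learn hunk can be tried in isolation
with the sketch below.  It assumes plain hashrefs in place of
PublicInbox::Inbox objects and made-up addresses, and it relies on a
reference stringifying to something like "HASH(0x...)", which is
unique per live reference as long as the class does not overload
stringification:

```perl
use strict;
use warnings;

# Hypothetical inboxes; hashrefs stand in for PublicInbox::Inbox.
my $meta = { name => 'meta' };
my $test = { name => 'test' };

# address => inbox, or 0 when the address matches no inbox
my %dests = (
	'meta@example.com' => $meta,
	'meta-alias@example.com' => $meta,
	'test@example.com' => $test,
	'nobody@example.com' => 0,
);

my (%seen, @trained);
for my $addr (sort keys %dests) {
	my $ibx = $dests{$addr};
	next unless ref($ibx); # $ibx may be 0
	next if $seen{"$ibx"}++; # same inbox reached via another address
	push @trained, $ibx->{name};
}

# An inbox found via List-ID is only trained if To/Cc did not
# already reach it.
my $by_list_id = $meta;
if ($by_list_id && !$seen{"$by_list_id"}) {
	push @trained, $by_list_id->{name};
}
print "trained: @trained\n";
```

Each inbox is trained once here, no matter how many of its addresses
appear in To/Cc or whether List-ID points at it as well.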
Thread overview: 17+ messages
2019-10-28 10:45 [PATCH 00/14] learn: sync w/ -mda changes and add manpage Eric Wong
2019-10-28 10:45 ` [PATCH 01/14] learn: support multiple To/Cc headers Eric Wong
2019-10-28 10:45 ` [PATCH 02/14] learn: only map recipient list on "ham" or "rm" Eric Wong
2019-10-28 10:45 ` [PATCH 03/14] learn: update usage statement Eric Wong
2019-10-28 10:45 ` [PATCH 04/14] learn: GIT_COMMITTER_<NAME|EMAIL> may be "" or "0" Eric Wong
2019-10-28 10:45 ` [PATCH 05/14] learn: hoist out remove_or_add subroutine Eric Wong
2019-10-28 10:45 ` Eric Wong [this message]
2019-10-28 10:45 ` [PATCH 07/14] filter/base: remove MAX_MID_SIZE constant Eric Wong
2019-10-28 10:45 ` [PATCH 08/14] mda: hoist out mda_filter_adjust Eric Wong
2019-10-28 10:45 ` [PATCH 09/14] mda: skip MIME parsing if spam Eric Wong
2019-10-28 10:45 ` [PATCH 10/14] inboxwritable: add assert_usable_dir sub Eric Wong
2019-10-28 10:45 ` [PATCH 11/14] mda: prepare for multiple destinations Eric Wong
2019-10-28 10:45 ` [PATCH 12/14] mda: support multiple List-ID matches Eric Wong
2019-10-28 18:05 ` Eric W. Biederman
2019-10-30 21:32 ` Eric Wong
2019-10-28 10:45 ` [PATCH 13/14] learn: allow running without spamc Eric Wong
2019-10-28 10:45 ` [PATCH 14/14] doc: add public-inbox-learn(1) manpage Eric Wong