* [PATCH 05/14] learn: hoist out remove_or_add subroutine
2019-10-28 10:45 7% [PATCH 00/14] learn: sync w/ -mda changes and add manpage Eric Wong
@ 2019-10-28 10:45 5% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2019-10-28 10:45 UTC (permalink / raw)
To: meta
We'll be reusing it for List-ID processing in the next commit.
---
script/public-inbox-learn | 56 ++++++++++++++++++++++-----------------
1 file changed, 31 insertions(+), 25 deletions(-)
diff --git a/script/public-inbox-learn b/script/public-inbox-learn
index 299f75a0..56739f88 100755
--- a/script/public-inbox-learn
+++ b/script/public-inbox-learn
@@ -39,6 +39,34 @@ my $mime = PublicInbox::MIME->new(eval {
$data
});
+sub remove_or_add ($$$) {
+ my ($ibx, $train, $addr) = @_;
+
+ # We do not touch GIT_COMMITTER_* env here so we can track
+ # who trained the message.
+ $ibx->{name} = $ENV{GIT_COMMITTER_NAME} // $ibx->{name};
+ $ibx->{-primary_address} = $ENV{GIT_COMMITTER_EMAIL} // $addr;
+ $ibx = PublicInbox::InboxWritable->new($ibx);
+ my $im = $ibx->importer(0);
+
+ if ($train eq "rm") {
+ # This needs to be idempotent, as my inotify trainer
+ # may train for each cross-posted message, and this
+ # script already learns for every list in
+ # ~/.public-inbox/config
+ $im->remove($mime, $train);
+ } elsif ($train eq "ham") {
+ # no checking for spam here, we assume the message has
+ # been reviewed by a human at this point:
+ PublicInbox::MDA->set_list_headers($mime, $ibx);
+
+ # Ham messages are trained when they're marked into
+ # a SEEN state, so this is idempotent:
+ $im->add($mime);
+ }
+ $im->done;
+}
+
# spam is removed from all known inboxes since it is often Bcc:-ed
if ($train eq 'spam') {
$pi_config->each_inbox(sub {
@@ -61,31 +89,9 @@ if ($train eq 'spam') {
}
# n.b. message may be cross-posted to multiple public-inboxes
- while (my ($addr, $dst) = each %dests) {
- next unless ref($dst);
- # We do not touch GIT_COMMITTER_* env here so we can track
- # who trained the message.
- $dst->{name} = $ENV{GIT_COMMITTER_NAME} // $dst->{name};
- $dst->{-primary_address} = $ENV{GIT_COMMITTER_EMAIL} // $addr;
- $dst = PublicInbox::InboxWritable->new($dst);
- my $im = $dst->importer(0);
-
- if ($train eq "rm") {
- # This needs to be idempotent, as my inotify trainer
- # may train for each cross-posted message, and this
- # script already learns for every list in
- # ~/.public-inbox/config
- $im->remove($mime, $train);
- } elsif ($train eq "ham") {
- # no checking for spam here, we assume the message has
- # been reviewed by a human at this point:
- PublicInbox::MDA->set_list_headers($mime, $dst);
-
- # Ham messages are trained when they're marked into
- # a SEEN state, so this is idempotent:
- $im->add($mime);
- }
- $im->done;
+ while (my ($addr, $ibx) = each %dests) {
+ next unless ref($ibx); # $ibx may be 0
+ remove_or_add($ibx, $train, $addr);
}
}
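For readers following the refactor, here is a minimal, self-contained sketch of the dispatch shape that remove_or_add takes after the hoist. It is not the code from the patch: MockImporter, remove_or_add_sketch and committer_name are hypothetical names used only for illustration, the importer is a stand-in that just records calls, and the message is passed as an argument rather than taken from the file-scoped $mime. The last helper shows why the patch uses defined-or (//) for the GIT_COMMITTER_* fallbacks: values such as "" or "0" are kept instead of being replaced.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the object returned by $ibx->importer(0);
# it only records which methods were called, in order.
{
    package MockImporter;
    sub new {
        my ($class) = @_;
        return bless({ log => [] }, $class);
    }
    sub add {
        my ($self, $msg) = @_;
        push @{$self->{log}}, "add $msg";
        return 1;
    }
    sub remove {
        my ($self, $msg, $why) = @_;
        push @{$self->{log}}, "remove $msg ($why)";
        return 1;
    }
    sub done {
        my ($self) = @_;
        push @{$self->{log}}, 'done';
        return 1;
    }
}

# Same branch structure as remove_or_add in the patch: "rm" removes,
# "ham" adds, anything else is a no-op, and done() is always called.
sub remove_or_add_sketch {
    my ($im, $train, $msg) = @_;
    if ($train eq 'rm') {
        $im->remove($msg, $train);
    } elsif ($train eq 'ham') {
        $im->add($msg);
    }
    $im->done;
    return $im->{log};
}

# Defined-or (//) falls back only when the value is undef, so ""
# and "0" survive; logical-or (||) would discard both.
sub committer_name {
    my ($env, $fallback) = @_;
    return $env->{GIT_COMMITTER_NAME} // $fallback;
}
```

With the body in one subroutine, a second caller (such as the List-ID handling mentioned in the commit message) can invoke it once per matching inbox without duplicating the branches.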
* [PATCH 00/14] learn: sync w/ -mda changes and add manpage
@ 2019-10-28 10:45 7% Eric Wong
2019-10-28 10:45 5% ` [PATCH 05/14] learn: hoist out remove_or_add subroutine Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2019-10-28 10:45 UTC (permalink / raw)
To: meta
What started as adding a manpage for public-inbox-learn
ended up being a bunch of fixes and improvements to catch
up with -mda changes.

-mda also learned to deal with multiple List-ID headers in the
meantime.
Eric Wong (14):
learn: support multiple To/Cc headers
learn: only map recipient list on "ham" or "rm"
learn: update usage statement
learn: GIT_COMMITTER_<NAME|EMAIL> may be "" or "0"
learn: hoist out remove_or_add subroutine
mda: hoist out List-ID handling and reuse in -learn
filter/base: remove MAX_MID_SIZE constant
mda: hoist out mda_filter_adjust
mda: skip MIME parsing if spam
inboxwritable: add assert_usable_dir sub
mda: prepare for multiple destinations
mda: support multiple List-ID matches
learn: allow running without spamc
doc: add public-inbox-learn(1) manpage
Documentation/include.mk | 1 +
Documentation/public-inbox-learn.pod | 86 +++++++++++++++++++++
MANIFEST | 1 +
lib/PublicInbox/Filter/Base.pm | 1 -
lib/PublicInbox/InboxWritable.pm | 9 ++-
lib/PublicInbox/MDA.pm | 22 ++++++
lib/PublicInbox/V2Writable.pm | 5 +-
script/public-inbox-learn | 84 +++++++++++---------
script/public-inbox-mda | 110 ++++++++++++++++-----------
t/import.t | 8 ++
t/mda.t | 19 +++++
t/v2writable.t | 12 +++
12 files changed, 275 insertions(+), 83 deletions(-)
create mode 100644 Documentation/public-inbox-learn.pod
mode change 100755 => 100644 script/public-inbox-learn
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git