user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 05/14] learn: hoist out remove_or_add subroutine
Date: Mon, 28 Oct 2019 10:45:19 +0000
Message-ID: <20191028104528.10140-6-e@80x24.org>
In-Reply-To: <20191028104528.10140-1-e@80x24.org>

We'll be reusing it for List-ID processing in the next commit.
---
 script/public-inbox-learn | 56 ++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/script/public-inbox-learn b/script/public-inbox-learn
index 299f75a0..56739f88 100755
--- a/script/public-inbox-learn
+++ b/script/public-inbox-learn
@@ -39,6 +39,34 @@ my $mime = PublicInbox::MIME->new(eval {
 	$data
 });
 
+sub remove_or_add ($$$) {
+	my ($ibx, $train, $addr) = @_;
+
+	# We do not touch GIT_COMMITTER_* env here so we can track
+	# who trained the message.
+	$ibx->{name} = $ENV{GIT_COMMITTER_NAME} // $ibx->{name};
+	$ibx->{-primary_address} = $ENV{GIT_COMMITTER_EMAIL} // $addr;
+	$ibx = PublicInbox::InboxWritable->new($ibx);
+	my $im = $ibx->importer(0);
+
+	if ($train eq "rm") {
+		# This needs to be idempotent, as my inotify trainer
+		# may train for each cross-posted message, and this
+		# script already learns for every list in
+		# ~/.public-inbox/config
+		$im->remove($mime, $train);
+	} elsif ($train eq "ham") {
+		# no checking for spam here, we assume the message has
+		# been reviewed by a human at this point:
+		PublicInbox::MDA->set_list_headers($mime, $ibx);
+
+		# Ham messages are trained when they're marked into
+		# a SEEN state, so this is idempotent:
+		$im->add($mime);
+	}
+	$im->done;
+}
+
 # spam is removed from all known inboxes since it is often Bcc:-ed
 if ($train eq 'spam') {
 	$pi_config->each_inbox(sub {
@@ -61,31 +89,9 @@ if ($train eq 'spam') {
 	}
 
 	# n.b. message may be cross-posted to multiple public-inboxes
-	while (my ($addr, $dst) = each %dests) {
-		next unless ref($dst);
-		# We do not touch GIT_COMMITTER_* env here so we can track
-		# who trained the message.
-		$dst->{name} = $ENV{GIT_COMMITTER_NAME} // $dst->{name};
-		$dst->{-primary_address} = $ENV{GIT_COMMITTER_EMAIL} // $addr;
-		$dst = PublicInbox::InboxWritable->new($dst);
-		my $im = $dst->importer(0);
-
-		if ($train eq "rm") {
-			# This needs to be idempotent, as my inotify trainer
-			# may train for each cross-posted message, and this
-			# script already learns for every list in
-			# ~/.public-inbox/config
-			$im->remove($mime, $train);
-		} elsif ($train eq "ham") {
-			# no checking for spam here, we assume the message has
-			# been reviewed by a human at this point:
-			PublicInbox::MDA->set_list_headers($mime, $dst);
-
-			# Ham messages are trained when they're marked into
-			# a SEEN state, so this is idempotent:
-			$im->add($mime);
-		}
-		$im->done;
+	while (my ($addr, $ibx) = each %dests) {
+		next unless ref($ibx); # $ibx may be 0
+		remove_or_add($ibx, $train, $addr);
 	}
 }
 

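Reader's note: the refactor above moves the loop body into a sub with
a ($$$) prototype and leaves only the ref() guard in the loop. The
stand-alone sketch below illustrates that shape and why idempotent
add/remove makes repeated training harmless. It is an illustration
under stated assumptions, not the code from this patch: MockImporter,
train_all, and the example addresses are hypothetical, and the message
is passed explicitly as a Message-ID string, whereas the real sub reads
$mime from file scope.

```perl
#!/usr/bin/perl
# Hypothetical sketch; MockImporter is NOT part of public-inbox.
use strict;
use warnings;

package MockImporter {
	sub new {
		my ($class) = @_;
		return bless({ msgs => {} }, $class);
	}
	# adding the same message twice keeps a single copy
	sub add { my ($self, $mid) = @_; $self->{msgs}->{$mid} = 1 }
	# removing a message that is absent is a no-op
	sub remove { my ($self, $mid) = @_; delete $self->{msgs}->{$mid} }
	sub count { my ($self) = @_; scalar keys %{$self->{msgs}} }
}

# hoisted helper: a single place handles both "rm" and "ham"
sub remove_or_add ($$$) {
	my ($im, $train, $mid) = @_;
	if ($train eq 'rm') {
		$im->remove($mid);
	} elsif ($train eq 'ham') {
		$im->add($mid);
	}
}

# a value may be 0 when an address did not map to an inbox
my %dests = (
	'a@example.com' => MockImporter->new,
	'b@example.com' => 0,
);

sub train_all {
	my ($train) = @_;
	while (my ($addr, $im) = each %dests) {
		next unless ref($im); # skip the 0 placeholders
		remove_or_add($im, $train, '<mid@example.com>');
	}
}

train_all('ham') for 1..2; # training twice stores one message
print $dests{'a@example.com'}->count, "\n"; # prints 1
train_all('rm') for 1..2;  # removing twice does not die
print $dests{'a@example.com'}->count, "\n"; # prints 0
```

Keeping the guard in the caller means the helper can assume it always
receives a usable object, which is what allows another caller to reuse
it without repeating the loop.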
Thread overview: 17+ messages
2019-10-28 10:45 [PATCH 00/14] learn: sync w/ -mda changes and add manpage Eric Wong
2019-10-28 10:45 ` [PATCH 01/14] learn: support multiple To/Cc headers Eric Wong
2019-10-28 10:45 ` [PATCH 02/14] learn: only map recipient list on "ham" or "rm" Eric Wong
2019-10-28 10:45 ` [PATCH 03/14] learn: update usage statement Eric Wong
2019-10-28 10:45 ` [PATCH 04/14] learn: GIT_COMMITTER_<NAME|EMAIL> may be "" or "0" Eric Wong
2019-10-28 10:45 ` Eric Wong [this message]
2019-10-28 10:45 ` [PATCH 06/14] mda: hoist out List-ID handling and reuse in -learn Eric Wong
2019-10-28 10:45 ` [PATCH 07/14] filter/base: remove MAX_MID_SIZE constant Eric Wong
2019-10-28 10:45 ` [PATCH 08/14] mda: hoist out mda_filter_adjust Eric Wong
2019-10-28 10:45 ` [PATCH 09/14] mda: skip MIME parsing if spam Eric Wong
2019-10-28 10:45 ` [PATCH 10/14] inboxwritable: add assert_usable_dir sub Eric Wong
2019-10-28 10:45 ` [PATCH 11/14] mda: prepare for multiple destinations Eric Wong
2019-10-28 10:45 ` [PATCH 12/14] mda: support multiple List-ID matches Eric Wong
2019-10-28 18:05   ` Eric W. Biederman
2019-10-30 21:32     ` Eric Wong
2019-10-28 10:45 ` [PATCH 13/14] learn: allow running without spamc Eric Wong
2019-10-28 10:45 ` [PATCH 14/14] doc: add public-inbox-learn(1) manpage Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
