$ git show v1.4.0:Documentation/dc-dlvr-spam-flow.txt	# shows this blob on the CLI

dc-dlvr spam/ham training system flow
-------------------------------------

An overview of the Maildir + inotify-based spam training system Eric
uses on his mail server.  This idea may be implemented for kqueue-based
systems, too.

The idea is to use inotify (via incron) to watch for new files appearing
in Maildirs.  We only want to train seen messages as ham, and old (but
not necessarily seen) messages as spam.  The overall goal is to allow a
user to train their filters without leaving their favorite mail user
agent.
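
The seen/old rule above can be sketched in Python.  This is only an
illustration, not the report-spam script itself: it assumes the standard
Maildir file name convention, where flags follow ":2," and "S" means
seen.  The function names and the one-hour min_spam_age threshold are
hypothetical; the text above does not say how old a message must be.

```python
import os


def maildir_flags(path):
    """Return the set of Maildir flag letters found after ':2,'."""
    name = os.path.basename(path)
    _, sep, info = name.rpartition(":2,")
    # No ':2,' suffix (e.g. a file still in new/) means no flags at all.
    return set(info) if sep else set()


def is_seen(path):
    """True if the file name carries the 'S' (seen) flag."""
    return "S" in maildir_flags(path)


def should_train(kind, path, age_seconds, min_spam_age=3600):
    """Decide whether a message qualifies for training.

    kind         -- "ham" or "spam"
    age_seconds  -- how old the message is, supplied by the caller
    min_spam_age -- hypothetical threshold for "old", in seconds
    """
    if kind == "ham":
        # Ham is only trained once the user has actually read it.
        return is_seen(path)
    if kind == "spam":
        # Spam only needs to be old; it may never have been opened.
        return age_seconds >= min_spam_age
    raise ValueError("kind must be 'ham' or 'spam'")
```

A caller would compute age_seconds itself, for example from the file's
mtime, before asking should_train() for a decision.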

Every message written to Maildir involves a rename, so we only
have incron watch for IN_MOVED_TO events.
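
To illustrate why a rename is always involved, here is a minimal Python
sketch of a Maildir delivery: the message is written under tmp/ and then
renamed into new/.  The deliver() helper and its unique-name scheme are
assumptions for illustration and are not how dc-dlvr is implemented.

```python
import os
import socket
import time


def deliver(maildir, message, unique=None):
    """Write message (bytes) into maildir/new via maildir/tmp."""
    if unique is None:
        # Simplified unique name: time, pid and host name.
        unique = "%d.%d.%s" % (time.time(), os.getpid(), socket.gethostname())
    tmp = os.path.join(maildir, "tmp", unique)
    new = os.path.join(maildir, "new", unique)
    # "x" mode refuses to overwrite an existing file in tmp/.
    with open(tmp, "xb") as fh:
        fh.write(message)
        fh.flush()
        os.fsync(fh.fileno())
    # The rename is the step a watcher on new/ sees as IN_MOVED_TO.
    os.rename(tmp, new)
    return new
```

Because the file only appears in new/ through the rename, a watcher
never sees a partially written message.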

The generic flow is as follows, all for a single Unix user account:

    incron -> report-spam +-> sendmail -> MTA -> dc-dlvr -> spamc -> spamd
                          |
                          V
                         ...
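
The "report-spam -> sendmail" hop in the diagram can be sketched as
piping the message to a sendmail-compatible binary.  This Python
fragment is a sketch under stated assumptions, not the report-spam
script: the function names are hypothetical, the recipient addresses
are left to the caller, and "-oi" is the usual sendmail option to stop
a lone "." line from ending the input.

```python
import subprocess


def training_command(recipients, sendmail=("sendmail", "-oi")):
    """Build the argv used to hand a message to the MTA."""
    for rcpt in recipients:
        # Refuse anything that could be mistaken for an option.
        if rcpt.startswith("-"):
            raise ValueError("bad recipient: %r" % rcpt)
    return list(sendmail) + list(recipients)


def forward_for_training(message, recipients, sendmail=("sendmail", "-oi")):
    """Pipe message (bytes) to sendmail for the given recipients."""
    cmd = training_command(recipients, sendmail)
    # check=True raises CalledProcessError if sendmail exits non-zero.
    subprocess.run(cmd, input=message, check=True)
```

Passing more than one recipient gives a single hand-off to the MTA
that fans out to several training addresses.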

For public-inbox, Eric uses a separate Unix account ("pi") to add a
layer of protection from fat-fingering something.  So his report-spam
script delivers to a second recipient for training, the "pi" user:

                         ...
                          |
                          +-> sendmail -> MTA -> dc-dlvr
                                                    |
                                                    V
                                            ~pi/.dc-dlvr.pre
                                                    |
                                                    V
                                           public-inbox-learn

public-inbox-learn will then internally handle the "spamc -> spamd"
delivery path as well as removing the message from the git tree.

* incron - run commands based on filesystem events: http://incron.aiken.cz/

* sendmail / MTA - we use and recommend postfix, which includes a
                   sendmail-compatible wrapper: http://www.postfix.org/

* spamc / spamd - SpamAssassin: http://spamassassin.apache.org/

* report-spam / dc-dlvr - distributed with public-inbox in the scripts/
  directory: git clone https://public-inbox.org/public-inbox.git
