public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2020-02-06	treewide: run update-copyrights from gnulib for 2019
	I didn't wait until September to do it, this year!
2020-01-11	make Filesys::Notify::Simple optional
	It's only used by us in public-inbox-watch, and maybe not for long. It's in most installations because Plack pulls it in, but Plack is no longer required.
2020-01-11	spawn (and thus popen_rd) die on failure
	Most spawn and popen_rd callers die on failure to spawn, anyways, and some are missing checks entirely. This saves us a bunch of verbose error-checking code in callers. This also makes popen_rd more consistent, since it already dies on pipe creation failures.
2020-01-06	treewide: "require" + "use" cleanup and docs
	There's a bunch of leftover "require" and "use" statements we no longer need and can get rid of, along with some excessive imports via "use". IO::Handle usage isn't always obvious, so add comments describing why a package loads it. Along the same lines, document the tmpdir support as the reason we depend on File::Temp 0.19, even though every Perl 5.10.1+ user has it. While we're at it, favor "use" over "require", since it gives us extra compile-time checking.
2020-01-01	filter/base: export REJECT as a constant
	And update callers to use it, as it makes the code a bit cleaner. Probably irrelevant, but it should be faster, too, as "perl -I lib -w -MO=Deparse $FILE" shows REJECT() calls are constant-folded.
2019-11-24	check for File::Temp 0.19 for ->newdir method
	This is distributed with Perl 5.10.1 and onwards, so it should not be an installation burden for any users. I'm planning to move away from tempdir() entirely and use File::Temp->newdir to remove dependencies on END{} blocks.
2019-10-22	watchmaildir: remove redundant _path_to_mime
	InboxWritable::maildir_path_load exists and we may support it for use with standalone scripts.
2019-10-15	mda, watch: wire up List-ID header support
	This also adds watchheader tests for -watch, which we never had before :x
2019-09-09	run update-copyrights from gnulib for 2019

2019-07-06	watch: allow multiple spam watch directories
	Given most folks have multiple mail accounts, there's no reason we can't support multiple Maildirs.
2019-07-06	watch: remove some indirectly-used imports
	We can drop some unnecessary imports now that we switched to InboxWritable.
2019-06-27	watchmaildir: show the current path on spamcheck failures
	Knowing which message failed a spam check is tough when I have many Maildirs and don't have a search indexing tool set up for spam mail.
2019-01-05	filter/rubylang: fix SQLite DB lifetime problems
	Clearly the AltId stuff was never tested for v2. Ensure this tricky filter (which reuses Msgmap to avoid introducing new serial numbers) doesn't trigger SQLite deadlocks due to opening a DB for writing multiple times. I went through several iterations of this change before going with this one, which is the least intrusive I could find.
2019-01-05	watchmaildir: normalize Maildir pathnames consistently
	Remove redundant slashes while we're at it.
2019-01-05	watchmaildir: get rid of unused spamdir field
	Unused since commit 6c2caa791bd5fbf5c4edb1a4a2c1807e527348a7 ("watchmaildir: support v2 repositories")
2019-01-05	watchmaildir: support multiple inboxes in the same Maildir
	Not sure what I was smoking when I originally wrote this code. cf. https://public-inbox.org/meta/874li887mp.fsf@vuxu.org/
2019-01-02	use PublicInbox::Config::each_inbox where appropriate
	No need to reach into PublicInbox::Config internals and iterate through the hashref by hand
2018-07-29	mda: allow configuring globally without spamc support
	This reuses some of the configuration from -watch, but remains independent since some configurations will use -watch for some inboxes and -mda for others. The default remains "spamc" for -mda users so nothing changes without explicit configuration. Per-inbox configurations may also be supported in the future.
2018-04-19	filter/rubylang: do not set altid on spam training
	I suppose it's a bug or inconsistency that altid is write-only and their deletions do not get reflected. But for now, we do not set it when training spam so there's no window where an invalid NNTP article number shows up. This should solve the problem where there are massive gaps in messages solved by spam training for ruby groups: https://public-inbox.org/meta/20180307093754.GA27748@dcvr/
2018-03-20	InboxWritable: add mbox/maildir parsing + import logic
	This will make it easier to as well as supporting future Filter API users. It allows simplifying our ad-hoc import_vger_from_mbox script.
2018-03-20	import: discard all the same headers as MDA
	Reduce the places where we have duplicate logic for discarding unwanted headers.
2018-03-20	introduce InboxWritable class
	This code will be shared with future mass-import tools.
2018-03-19	v2writable: allow disabling parallelization
	While parallel processes improve import speed for initial imports, they are probably not necessary for daily mail imports via WatchMaildir and certainly not for public-inbox-init. Disabling them saves some memory for daily use and even helps improve readability of some subroutines by showing which methods they call remotely.
2018-03-19	watchmaildir: support v2 repositories
	Unfortunately this gives up some minor performance tweaks we made to avoid reforking import processes.
2018-03-19	import: force Message-ID generation for v1 here
	This allows us to share code for generating Message-IDs between v1 and v2 repos. For v1, this introduces a slight incompatibility in message removal iff the original message lacked a Message-ID AND the training request came from a message which did not pass through the public-inbox: The workaround for this would be to reuse the bad message from the archive itself.
2018-03-19	watchmaildir: use content_digest to generate Message-Id
	This can probably be moved to Import for code reuse.
2018-03-19	import: implement barrier operation for v1 repos
	This will allow WatchMaildir to use ->barrier operations instead of reaching inside for nchg. This also ensures dumb HTTP clients can see changes to V2 repos immediately.
2018-02-28	use PublicInbox::MIME consistently
	It works around some bugs in older Email::MIME which we'll find useful.
2018-02-08	watch_maildir: allow '-' in mail filename
	Hostnames can contain '-' and this allows public-inbox-watch(1) to work on machines which generate Maildir files with '-' in them.
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-11-16	watch: use "spam" in commit message for removals
	This makes it easy to identify the reason for message removals.
2017-06-26	watch: avoid potential race condition while quitting
	We must not trigger future activity when initializing a -watch shutdown.
2017-06-26	watch: commit changes to fast-import sooner
	We should make changes visible sooner, even during lengthy scans.
2017-06-26	watch: use "self-inotify-tempfile trick" for quit
	This should be more reliable and safer as it'll ensure existing fast-import instances are shut down properly.
2017-06-26	watch: improve fairness during full rescans
	We need to ensure new messages are being processed fairly during full rescans, so have the ->scan subroutine yield and reschedule itself. Additionally, having a long-running task inside the signal handler is dangerous and subject to reentrancy bugs. Due to the limitations of the Filesys::Notify::Simple interface, we cannot rely on multiplexing I/O interfaces (select, IO::Poll, Danga::Socket, etc...) for this. Forking a separate process was considered, but it is more expensive for a mostly-idle process. So, we use a variant of the "self-pipe trick" via inotify (or whatever Filesys::Notify::Simple gives us). Instead of writing to our own pipe, we write to a file in our own temporary directory watched by Filesys::Notify::Simple to trigger events in signal handlers.
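For comparison, the classic "self-pipe trick" that this commit adapts can be sketched as follows. This is a minimal Python illustration, not the project's Perl code: it uses a real pipe and select(), whereas the commit substitutes a file in a temporary directory watched by Filesys::Notify::Simple. The names `_on_signal` and `wait_for_wakeup`, and the choice of SIGUSR1, are assumptions made for the example.

```python
import os
import select
import signal

# Non-blocking pipe: the signal handler writes to one end,
# the main loop waits on the other.
_rfd, _wfd = os.pipe()
os.set_blocking(_rfd, False)
os.set_blocking(_wfd, False)


def _on_signal(signum, frame):
    # Keep the handler tiny: just wake the main loop and return.
    try:
        os.write(_wfd, b"\0")
    except BlockingIOError:
        pass  # pipe is full, so a wakeup is already pending


def wait_for_wakeup(timeout):
    """Return True if a signal arrived within `timeout` seconds."""
    ready, _, _ = select.select([_rfd], [], [], timeout)
    if not ready:
        return False
    try:
        os.read(_rfd, 4096)  # drain pending wakeup bytes
    except BlockingIOError:
        pass
    return True


# SIGUSR1 is an arbitrary choice for this sketch (Unix only).
signal.signal(signal.SIGUSR1, _on_signal)
```

The long-running work then happens in the main loop after `wait_for_wakeup` returns, rather than inside the signal handler, which is the reentrancy concern the commit message raises.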
2017-06-26	watch: ensure HUP causes the scanner to be reloaded
	Otherwise the old watcher may run indefinitely
2017-06-23	watchmaildir: deal with rejected (100) messages
	The RubyLang filter is strict about what messages it rejects, so the spam learning path will not auto-train or remove messages missing X-Mail-Count headers.
2017-06-22	add filter for RubyLang lists
	Unfortunately, it appears we have to reject this and instead add support for filtering at View time (which may not be worth it), due to DKIM signatures in messages from ruby-lang.org.
2017-05-09	watchmaildir: show $@ in warning message
	It should be helpful to know what error happened.
2017-04-04	watchmaildir: do not reject lowercase flags on Maildir files
	Dovecot uses 'a'..'z' (lowercase) to designate keywords in Maildir flags. This was preventing certain messages from being marked as spam. https://wiki2.dovecot.org/MailboxFormat/Maildir
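The flag handling described here can be illustrated with a small sketch. It is written in Python rather than the project's Perl, and the helper names `maildir_flags` and `should_import` are hypothetical. It assumes the standard Maildir ":2," info suffix, treats lowercase letters as valid (Dovecot keywords), and skips trashed (T) and draft (D) messages, as the 2016-06-24 entry in this log describes.

```python
import re

# Maildir "info" suffix: ":2," followed by flag letters.
# Uppercase letters are standard flags (D=draft, S=seen, T=trashed, ...);
# Dovecot additionally uses lowercase 'a'..'z' for keywords.
_INFO_RE = re.compile(r":2,([A-Za-z]*)\Z")


def maildir_flags(path):
    """Return the flag letters of a Maildir filename, or None if the
    name carries no ":2," info suffix (e.g. a file still in new/)."""
    m = _INFO_RE.search(path)
    return m.group(1) if m else None


def should_import(path):
    """Skip trashed and draft messages; lowercase keywords are ignored."""
    flags = maildir_flags(path) or ""
    return "T" not in flags and "D" not in flags
```

A pattern restricted to `[A-Z]` would fail to match a name such as `...:2,Sab`, which is the failure mode this commit addresses.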
2017-01-26	watchmaildir: allow arguments for filters
	We'll want to allow some degree of configuration for various mailing lists.
2017-01-19	watchmaildir: limit live importer processes
	We don't want to be triggering OOM or swapping on weaker systems when we have dozens of inboxes as potential targets.
2017-01-10	introduce PublicInbox::MIME wrapper class
	This should fix problems with multipart messages where text/plain parts lack a header. cf. git clone --mirror https://github.com/rjbs/Email-MIME.git refs/pull/28/head In the future, we may still introduce a streaming interface to reduce memory usage on large emails.
2017-01-02	watch: watchspam affects all configured inboxes
	If a message is spam in one mailbox, it is spam in all others a particular user/group will care about.
2016-09-01	watch: use "publicinboxwatch" namespace
	We'll keep supporting "publicinboxlearn" indefinitely, but "publicinboxwatch" is probably more appropriate at the moment. Noticed while writing documentation.
2016-08-12	watch: respect altid for incremental watch changes
	We need to pass the Inbox object to SearchIdx to get altid mappings properly for incremental imports. TODO: use the Inbox object in more places where it makes sense to do so.
2016-06-26	watch_maildir: warn on spam check failures
	It would be nice to know about spamcheck failures.
2016-06-24	watch_maildir: ignore Trash and Drafts, support Dovecot
	Trashed messages and drafts are probably not intended for importing, so do not import them. Dovecot uses extra flags via lowercase letters, so we must support those (as that's the server I use).
2016-06-24	watch_maildir: implement optional spam checking
	Mailing lists I watch and mirror may not have the best spam filtering, and an extra layer should not hurt.
2016-06-24	watch_maildir: rename _check_spam => _remove_spam
	We do not actually do spam checking, here; but will do spam checking before adding a message in the future.