path: root/script/public-inbox-watch
Date Commit message
2023-11-11 mda|learn|watch: support dropUniqueUnsubscribe config
List-Unsubscribe headers with unique identifiers (such as those generated by our examples/unsubscribe.milter) should not end up in public archives. Add a new config knob to strip List-Unsubscribe headers if they have the `List-Unsubscribe-Post: List-Unsubscribe=One-Click' header. Unfortunately, this breaks DKIM signatures if the signature covers either of these List-Unsubscribe* headers. However, breaking DKIM is the lesser evil compared to any archive reader being able to stop archival by an independent archivist. As much as I would like this to be the default, it probably affects few users at the moment since very few mailing lists use unique identifiers in List-Unsubscribe (but that number has grown, recently).
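The header-stripping rule described above can be sketched as follows. This is a minimal illustration in Python, not the project's Perl implementation: the function name `drop_unique_unsubscribe` is hypothetical, and removing the List-Unsubscribe-Post header together with List-Unsubscribe is an assumption of this sketch.

```python
from email import message_from_string
from email.message import Message

# Marker value taken from the commit message above.
ONE_CLICK = "List-Unsubscribe=One-Click"


def drop_unique_unsubscribe(msg: Message) -> bool:
    """Strip List-Unsubscribe* headers if the one-click marker is present.

    Returns True when headers were removed, False when the message
    was left untouched. (Hypothetical helper, for illustration only.)
    """
    post_values = msg.get_all("List-Unsubscribe-Post", [])
    if not any(v.strip().lower() == ONE_CLICK.lower() for v in post_values):
        return False
    # Deleting by name removes every occurrence of the header.
    del msg["List-Unsubscribe"]
    del msg["List-Unsubscribe-Post"]  # assumption: drop the marker, too
    return True


if __name__ == "__main__":
    raw = (
        "From: list@example.com\n"
        "List-Unsubscribe: <https://example.com/unsub/abc123>\n"
        "List-Unsubscribe-Post: List-Unsubscribe=One-Click\n"
        "Subject: hello\n"
        "\n"
        "body\n"
    )
    m = message_from_string(raw)
    print(drop_unique_unsubscribe(m), m.keys())
```

Note that, as the commit message says, any DKIM signature covering the removed headers will no longer verify.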
2023-09-08 watch: reset HUP + USR1 signal handlers in children
Child processes handling IMAP/NNTP should not handle config reloads or forced rescans; those are exclusively for the parent. We'll leave a note that QUIT/TERM/INT can safely use the same callback for both parent and children, as I nearly made the mistake of resetting those to their default values in the child.
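The split between parent-only and shared handlers might look like the following Python sketch. The function names are hypothetical, and resetting to the default disposition (rather than ignoring the signal) is an assumption here; the actual Perl code may differ.

```python
import signal

# Signals whose handlers only make sense in the parent (reload, rescan).
PARENT_ONLY = (signal.SIGHUP, signal.SIGUSR1)
# Signals whose callback is safe to share between parent and children.
SHARED = (signal.SIGQUIT, signal.SIGTERM, signal.SIGINT)


def install_parent_handlers(on_reload, on_rescan, on_quit):
    """Install the full handler set in the parent process."""
    signal.signal(signal.SIGHUP, on_reload)
    signal.signal(signal.SIGUSR1, on_rescan)
    for sig in SHARED:
        signal.signal(sig, on_quit)


def reset_parent_only_handlers():
    """Call in a freshly forked child.

    HUP and USR1 go back to their default disposition (an assumption of
    this sketch); QUIT/TERM/INT deliberately keep the inherited callback.
    """
    for sig in PARENT_ONLY:
        signal.signal(sig, signal.SIG_DFL)
```

A child would call `reset_parent_only_handlers()` immediately after `os.fork()` returns 0, before entering its own loop.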
2023-09-08 watch: set %SIG for non-signalfd/kqueue
We need to ensure there isn't a window where we lose $SIG{CHLD} handling. This is the second part in getting t/imapd.t to pass the reload-after-setting-imap.pollInterval test. That said, I'm not entirely happy with the way -watch jumps in and out of the event loop. It's historical baggage from the pre-event_loop days.
2023-09-05 watch: ensure children can use signal handlers
Blindly using the signal set inherited from the parent process is wrong, since the parent (or grandparent) could've blocked all signals. Ensure children can process signals in the event loop when sig handlers have to use standard Perl facilities.
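As a rough illustration of the problem, a child can clear whatever signal mask it inherited before relying on ordinary handlers. The sketch below uses Python's `signal.pthread_sigmask` (Unix only); the helper name is hypothetical and clearing the entire mask is a simplifying assumption.

```python
import signal


def unblock_all_signals():
    """Clear the inherited signal mask in the calling thread.

    Returns the previous mask so a caller can inspect or restore it.
    (Hypothetical helper; a real program may want to unblock only the
    signals it actually handles.)
    """
    return signal.pthread_sigmask(signal.SIG_SETMASK, set())


if __name__ == "__main__":
    # Simulate a parent that blocked signals before forking.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGHUP, signal.SIGUSR1})
    previous = unblock_all_signals()
    print("previously blocked:", sorted(int(s) for s in previous))
```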
2023-03-26 watch: do not recreate signalfd on SIGHUP
The normal method by which PublicInbox::DS::event_loop sets up signals once needs some coercing to work with -watch. Otherwise, we'll end up wasting FDs every time somebody reloads -watch via SIGHUP.
2022-10-24 treewide: replace /^I: / prefix with /^# /
This is likely more familiar to readers of TAP (Test Anything Protocol) output, as well as shell and Perl scripters who also use `#' for comments. AFAIK, nobody is parsing our stderr, and I'm not sure how standardized the `I:' prefix is (nor the `W:' and `E:' prefixes). It's already the prevailing style in Lei* code, too, so things have been moving in that direction for a bit.
2021-10-01 ds: simplify signalfd use
Since signalfd is often combined with our event loop, give it a convenient API and reduce the code duplication required to use it. EventLoop is replaced with ::event_loop to allow consistent parameter passing and avoid needlessly passing the package name on stack. We also avoid exporting SFD_NONBLOCK since it's the only flag we support. There's no sense in having the memory overhead of a constant function when it's in cold code.
2021-01-14 daemon+watch: fix localization of %SIG for non-signalfd users
It turns out "local" did not take effect in the way we used it: http://nntp.perl.org/group/perl.perl5.porters/258784 <CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com> Fortunately, none of the old use cases seem affected, unlike the previous lei change to ensure consistent SIGPIPE handling.
2021-01-12 ds: block signals when reaping
This lets us call dwaitpid long before a process exits and not have to wait around for it. This is advantageous for lei where we can run dwaitpid on the pager as soon as we spawn it, instead of waiting for a client socket to go away on DESTROY.
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2021-01-01 syscall: SFD_NONBLOCK can be a constant, again
Since Perl exposes O_NONBLOCK as a constant, we can safely make SFD_NONBLOCK a constant, too. This is not the case for SFD_CLOEXEC, since O_CLOEXEC is not exposed by Perl despite being used internally in the interpreter.
2020-09-14 sigfd: fix typos and scoping on systems w/o epoll+kqueue
Unfortunately, I'm not sure how easy it is to catch these at compile time. Prototypes do not seem to check these at compile time when crossing packages (not even with exported subroutines).
2020-09-02 watch: add --help/-h support
And avoid unnecessary POD markup in the man page.
2020-09-01 watch: log signal activities to STDERR
Sometimes it may not be apparent when/if a signal is processed; this hopefully improves the situation. We'll also change the process title when we're quitting to better inform users.
2020-09-01 rename WatchMaildir => Watch
This is no longer limited to Maildirs now that IMAP and NNTP support exists, so give it a shorter name.
2020-08-07 syscall: support sparc64 (and maybe other big-endian systems)
Thanks to the GCC compile farm project, we can wire up syscalls for sparc64 and set system-specific SFD_* constants properly. I've FINALLY figured out how to use POSIX::SigSet to generate a usable buffer for the syscall perlfunc. This is required for endian-neutral behavior and relevant to sparc64, at least. There's no need for signalfd-related stuff to be constants, either. signalfd initialization is never a hot path and a stub subroutine for constants uses several KB of memory in the interpreter. We'll drop the needless SEEK_CUR import while we're importing O_NONBLOCK, too.
2020-06-28 watch: enable autoflush for STDOUT and STDERR
In case output is redirected to a pipe, ensure stdout and stderr are always unbuffered, as -watch may go long periods without any output to fill up buffers.
2020-06-28 watch: wire up IMAP IDLE reapers to DS
We can avoid synchronous `waitpid(-1, 0)' and save a process when simultaneously watching Maildirs. One DS bug is fixed: ->Reset needs to clear the DS $in_loop flag in forked children so dwaitpid() fails and allows git processes to be reaped synchronously. TestCommon also calls DS->Reset when spawning new processes, since t/imapd.t uses DS->EventLoop while waiting on -watch to write.
2020-06-28 watch: use signalfd for Maildir watching
We can get rid of the janky wannabe self-using-a-directory-instead-of-pipe thing we needed to work around Filesys::Notify::Simple being blocking. For existing Maildir users, this should be more robust and immune to missed wakeups for signalfd and kqueue-enabled systems; as well as being immune to BOFHs clearing $TMPDIR and preventing notifications from firing. The IMAP IDLE code still uses normal Perl signals, so it's still vulnerable to missed wakeups. That will be addressed in future commits.
2020-06-28 watch: remove Filesys::Notify::Simple dependency
Since we already use inotify and EVFILT_VNODE (kqueue) in -imapd, we might as well use them directly in -watch, too. This will allow public-inbox-watch to use PublicInbox::DS for timers to watch newsgroups/mailboxes and have saner signal handling in future commits.
2020-02-06 treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2019-09-09 run update-copyrights from gnulib for 2019
2018-02-07 update copyrights for 2018
Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-06-26 watch: use "self-inotify-tempfile trick" for quit
This should be more reliable and safer as it'll ensure existing fast-import instances are shut down properly.
2017-06-26 watch: improve fairness during full rescans
We need to ensure new messages are being processed fairly during full rescans, so have the ->scan subroutine yield and reschedule itself. Additionally, having a long-running task inside the signal handler is dangerous and subject to reentrancy bugs. Due to the limitations of the Filesys::Notify::Simple interface, we cannot rely on multiplexing I/O interfaces (select, IO::Poll, Danga::Socket, etc...) for this. Forking a separate process was considered, but it is more expensive for a mostly-idle process. So, we use a variant of the "self-pipe trick" via inotify (or whatever Filesys::Notify::Simple gives us). Instead of writing to our own pipe, we write to a file in our own temporary directory watched by Filesys::Notify::Simple to trigger events in signal handlers.
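For readers unfamiliar with the technique being adapted, here is a minimal Python sketch of the classic self-pipe trick: the signal handler does nothing but write a byte, and the real work happens in the main loop once the pipe becomes readable. This shows the plain pipe-based form, not the inotify/tempfile variant the commit describes, and all function names are hypothetical.

```python
import os
import select
import signal


def make_self_pipe():
    """Create a non-blocking pipe; returns (read_fd, write_fd)."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    os.set_blocking(w, False)
    return r, w


def install_wakeup(sig, write_fd):
    """Make `sig` write one byte to the pipe and do nothing else."""
    def handler(signum, frame):
        try:
            os.write(write_fd, b"\0")
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending
    signal.signal(sig, handler)


def wait_for_wakeup(read_fd, timeout):
    """Wait up to `timeout` seconds; True if a signal woke us up."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return False
    try:
        while os.read(read_fd, 4096):  # drain all pending wakeup bytes
            pass
    except BlockingIOError:
        pass  # pipe is empty again
    return True


if __name__ == "__main__":
    r, w = make_self_pipe()
    install_wakeup(signal.SIGUSR2, w)
    os.kill(os.getpid(), signal.SIGUSR2)
    if wait_for_wakeup(r, 5):
        print("woken by signal; a long-running scan would start here")
```

Keeping the handler this small avoids the reentrancy hazards mentioned above, since the long-running scan runs from the main loop rather than from inside the handler.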
2017-06-26 watch: ensure HUP causes the scanner to be reloaded
Otherwise the old watcher may run indefinitely.
2016-08-12 public-inbox-watch: support reloading config with SIGHUP
This can be useful for adding new lists, as restarting is expensive (but still non-lossy).
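A common way to structure reload-on-SIGHUP is to have the handler only set a flag and let the main loop perform the reload. The Python sketch below illustrates that pattern under stated assumptions: the `Watcher` class, its `tick` method, and the `load_config` callable are all hypothetical and do not mirror the project's Perl code.

```python
import signal


class Watcher:
    """Toy long-running process whose config can be reloaded via SIGHUP."""

    def __init__(self, load_config):
        self._load_config = load_config  # callable returning a config
        self.config = load_config()
        self._reload_pending = False

    def on_hup(self, signum, frame):
        # Defer the real work: only record that a reload was requested.
        self._reload_pending = True

    def tick(self):
        """One main-loop iteration; returns True if config was reloaded."""
        if not self._reload_pending:
            return False
        self._reload_pending = False
        self.config = self._load_config()
        return True


if __name__ == "__main__":
    lists = ["alpha"]
    w = Watcher(lambda: list(lists))
    signal.signal(signal.SIGHUP, w.on_hup)
    lists.append("beta")  # e.g. an admin adds a new list to the config
    w.on_hup(signal.SIGHUP, None)  # stands in for `kill -HUP <pid>`
    print(w.tick(), w.config)
```

Because nothing is torn down between iterations, the process keeps its existing state while picking up newly added lists, which is the benefit the commit message points to over a full restart.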
2016-06-17 watch: introduce watch directive
This will allow users to run importers off existing mail accounts where they may not have access to run -mda. Currently, we only support Maildirs, but IMAP ought to be doable.