public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2024-04-28	xap_helper: reopen logs in daemons
	When read-only daemons reopen log files via SIGUSR1, be sure to propagate it to Xapian helper processes to ensure old log files can be closed and archived.
2024-04-28	daemon: share and allow configuring Xapian helpers
	Xapian helper processes are disabled by default once again. However, they can be enabled via the new `-X INTEGER' parameter. One big positive is that spawning the Xapian helpers from the top-level daemon means they can be shared freely across all workers for improved load balancing and memory reduction.
2024-04-03	treewide: avoid getpid for more ownership checks
	There are still some places where on_destroy isn't suitable. This gets rid of getpid() calls in most of those cases to reduce syscall costs and clean up syscall trace output.
2024-04-03	treewide: avoid getpid() for OnDestroy checks
	getpid() isn't cached by glibc nowadays and system calls are more expensive due to CPU vulnerability mitigations. To ensure we switch to the new semantics properly, introduce a new `on_destroy' function to simplify callers. Furthermore, OnDestroy correctness is often tied to the process which creates it, so make the new API default to guarding against running in subprocesses. For cases which require running in all children, a new PublicInbox::OnDestroy::all call is provided.
2024-02-08	daemon: quiet Email::Address::XS warnings properly
	Setting $SIG{__WARN__} at the top-level no longer has any effect since we localize $SIG{__WARN__} when entering ->event_step on a per-listener basis. Fixes: 60d262483a4d (daemon: use per-listener SIG{__WARN__} callbacks, 2022-08-08)
2023-11-26	drop redundant calls to DS->Reset
	Reset gets called on END{} anyways to work around DBI lifetime problems, so there's no need to call it near exit. We can't replace calls to POSIX::_exit with `exit' to force END{} to run just yet, as there are still some lingering destruction ordering problems on newer DBI and/or Perls.
2023-10-18	ds: introduce and use do_fork helper
	This ensures we handle RNG reseeding and resetting the event loop properly in child processes after forking.
2023-10-06	finalize DragonFlyBSD support
	require_bsd and require_mods(':fcntl_lock') are now supported in TestCommon to make it easier to maintain than a big list of regexps. getsockopt for SO_ACCEPTFILTER seems to always succeed, even if the retrieved struct is all zeroes.
2023-10-04	move all non-test @post_loop_do into named subs
	Compared to Danga::Socket, our @post_loop_do API is designed to make it easier to avoid anonymous subs (and their potential for leaks in buggy old versions of Perl).
2023-10-04	ds: don't pass FD map to post_loop_do callback
	It's not used by any post_loop_do callbacks anymore, and the underlying FD map is a global `our' variable accessible from anywhere, anyways.
2023-10-04	ds: hoist out close_non_busy
	It's shared by both lei and public-facing daemons in using the ->busy callback.
2023-10-03	daemon: enable SO_ACCEPTFILTER on NetBSD
	NetBSD 5.0+ has accept filter support from FreeBSD, and I think we can assume all NetBSD is 5.0+ (released in 2009) nowadays if we're already depending on Perl 5.12 from 2010.
2023-09-11	favor poll(2) for most daemons
	public-inbox-watch, lei-daemon, the master process of public-inbox-(netd|httpd|imapd|nntpd|pop3d), and the (mostly) Perl implementation of XapHelper do not have many FDs to watch so epoll|kqueue end up being overkill. Of course, *BSDs already have separate kqueue FDs emulating signalfd and/or inotify, even. In other words, only the worker processes of public-inbox-(netd|httpd|imapd|nntpd|pop3d) are expected to see C10K (or C100K) types of traffic where epoll|kqueue shine. Perhaps lei could benefit from epoll/kqueue on some virtual users IMAP/JMAP system one day; as could -watch with many IMAP IDLE folders; but we'll probably add a knob if/when it comes to that.
2023-09-11	daemon: depend on DS event_loop in master process, too
	The awaitpid API turns out to be quite handy for managing long-lived worker processes. This allows us to ensure all our uses of signalfd (and kevent emulation) are non-blocking.
2023-08-28	Fix some typos/grammar/errors in docs and comments

2023-04-14	listener: support multi-accept like nginx
	While accepting a single connection at a time is likely best for multi-worker and/or load-balanced deployments, accepting multiple connections at once should be less bad on overloaded single-worker systems. We can't automatically pick the best value here since worker counts are dynamic via SIGTTIN/SIGTTOU. Process managers (e.g. systemd) can also spawn multiple instances sharing a single listener with no knowledge sharing between listeners.
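The multi-accept idea can be sketched as follows. This is an illustrative Python sketch rather than the project's Perl code; the `accept_batch` helper and its `multi_accept` parameter are hypothetical names, and the limit of 2 is arbitrary.

```python
import select
import socket

def accept_batch(listener, multi_accept=4):
    """Accept up to `multi_accept` pending connections in one wakeup."""
    accepted = []
    for _ in range(multi_accept):
        try:
            conn, _addr = listener.accept()
        except BlockingIOError:
            break  # accept queue drained; go back to the event loop
        accepted.append(conn)
    return accepted

# Demo: a non-blocking loopback listener with three queued clients.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(16)
listener.setblocking(False)
clients = [socket.create_connection(listener.getsockname()) for _ in range(3)]

select.select([listener], [], [], 5)   # wait until the listener is readable
first = accept_batch(listener, multi_accept=2)
select.select([listener], [], [], 5)
rest = accept_batch(listener, multi_accept=2)

for s in clients + first + rest:
    s.close()
listener.close()
```

A limit of 1 gives the single-accept behavior that suits multi-worker deployments; larger values let a lone overloaded worker drain its queue in fewer wakeups.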
2023-03-25	sigfd: pass signal name rather than number to callback
	This is consistent with normal Perl %SIG handlers, and allows -cindex signal handlers to be implemented consistently across platforms.
2023-03-25	ds: @post_loop_do replaces SetPostLoopCallback
	This allows us to avoid repeatedly using memory-intensive anonymous subs in CodeSearchIdx where the callback is assigned frequently. Anonymous subs are known to leak memory in old Perls (e.g. 5.16.3 in enterprise distros) and are still expensive in newer Perls. So favor the (\&subroutine, @args) form which allows us to eliminate anonymous subs going forward. Only CodeSearchIdx takes advantage of the new API at the moment, since it's the biggest repeat user of post-loop callback changes. Getting rid of the subroutine and relying on a global `our' variable also has two advantages: 1) Perl warnings can detect typos at compile-time, whereas the (now gone) method could only detect errors at run-time. 2) `our' variable assignment can be `local'-ized to a scope.
2023-01-18	eofpipe: drop {arg} support for now
	The only user of EOFpipe has no args, so avoid wasting a hash slot on it. If we need it again in the future, EOFpipe will allow an array of args, instead.
2023-01-03	daemon: don't bother checking for existing FD flags
	FD_CLOEXEC is the only currently defined FD flag, and that has been the case for decades at this point. I highly doubt any default FD flag will ever be forced on us by the kernel, init system, or Perl. So save ourselves a syscall and just call F_SETFD with the assumption FD_CLOEXEC is the only FD flag that we'd ever care for.
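The syscall saving can be illustrated with a minimal Python sketch (not the daemon's Perl code; `set_cloexec` is a hypothetical helper). It calls F_SETFD directly instead of doing the usual F_GETFD read-modify-write, on the assumption that FD_CLOEXEC is the only FD flag in play.

```python
import fcntl
import os

def set_cloexec(fd, on=True):
    # Single fcntl(2) call: no F_GETFD first, assuming FD_CLOEXEC is the
    # only file descriptor flag that could be set.
    fcntl.fcntl(fd, fcntl.F_SETFD, fcntl.FD_CLOEXEC if on else 0)

r, w = os.pipe()
os.set_inheritable(r, True)   # Python sets close-on-exec by default; clear it
set_cloexec(r)                # set it again with one syscall
flags = fcntl.fcntl(r, fcntl.F_GETFD)   # read back only to show the result
os.close(r)
os.close(w)
```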
2022-08-09	daemon: cleanup internal data structures
	This avoids dangling {''} entries in $xnetd and %tls_opt hashes. Furthermore, we can safely undef %tls_opt once it's associated with each $xnetd object.
2022-08-09	daemon: use per-listener SIG{__WARN__} callbacks
	This allows "-l $ADDRESS?err=/path/to/err.log" to isolate normal warn() (and carp()) messages for a particular listen address to track down errors more easily.
2022-08-09	daemon: use default address + well-known ports for scheme
	This ensures the "bound $URL" diagnostic message at startup always shows the URL scheme handled if not relying on socket inheritance. This also avoids duplicate/unused data structures when binding sockets ourselves, as bound socket names can expand from short names to longer names (e.g. "0:119" => "0.0.0.0:119").
2022-08-06	daemon: dedupe PublicInbox::Config objects by pathname
	This means all Inbox, Git, Over, Msgmap, Search objects also get deduplicated if they belong to the same config file, reducing memory and FD usage. This helps save memory and improve cache hit rates in -netd setups where NNTP, IMAP, HTTP, and POP3 servers run in the same process. InboxIdle was the only bit which needed adjustment, but there may be other bugs lurking despite all tests passing.
2022-08-04	daemon: handle per-listener options on inherited, well-known ports
	We must not clobber already-parsed per-listener options when handling inherited sockets which are well-known. Unfortunately, this isn't easy to test in a non-intrusive way for regular users.
2022-08-03	daemon: reload TLS certs and keys on SIGHUP
	This allows new TLS certificates to be loaded for new clients without having to time out or drop existing clients with established connections made with the old certs. This should benefit users with admins who expire certificates frequently (as encouraged by Let's Encrypt).
2022-08-02	daemon: share FDs for identical log paths
	We rely on the %logs hash for SIGUSR1 log reopening. Without this sharing, some FDs would be hidden inside their respective {HTTP,IMAP,POP3}D object and not reopened on SIGUSR1.
2022-08-02	daemon: allow listening on well-known ports based on protocol
	This allows admins to use "-l nntp://0.0.0.0/" to bind on port 119 without specifying ":119" on the CLI.
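Resolving a default port from the URL scheme can be sketched like this. It is an illustrative Python sketch, not the daemon's actual parsing code; `WELL_KNOWN_PORTS` and `listen_address` are hypothetical names, and the port numbers are the standard IANA assignments.

```python
from urllib.parse import urlsplit

# Standard well-known ports for the schemes mentioned in this log.
WELL_KNOWN_PORTS = {
    "http": 80, "https": 443,
    "nntp": 119, "nntps": 563,
    "imap": 143, "imaps": 993,
    "pop3": 110, "pop3s": 995,
}

def listen_address(spec):
    """Return (scheme, host, port), defaulting the port from the scheme."""
    u = urlsplit(spec)
    port = u.port if u.port is not None else WELL_KNOWN_PORTS[u.scheme]
    return u.scheme, u.hostname, port

default_port = listen_address("nntp://0.0.0.0/")
explicit_port = listen_address("pop3://0.0.0.0:1110/")
```

An explicit port in the address still wins, so only the bare "scheme://host/" form falls back to the table.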
2022-08-02	daemon: add diagnostics about inherited/bound listeners
	These are helpful for diagnosing configuration problems, as well as a bug (to be fixed in the following commit).
2022-08-02	daemon: require absolute cert/key paths with --daemonize
	This is preparation for supporting loading new certs on SIGHUP.
2022-08-02	daemon: support per-listener env, .psgi, out, err
	This allows memory savings by allowing multiple, completely unrelated PSGI apps to run within the same process as IMAP, NNTP, and POP3.
2022-08-02	httpd: make internals slightly more generic
	This brings the HTTP server closer to the IMAP/NNTP/POP3 implementations and eliminates package-wide globals in PublicInbox::HTTPD. The end goal is to be able to host completely different PSGI applications on different listen ports.
2022-07-20	netd: setup TLS bits for well-known STARTTLS ports
	Unfortunately, I can't think of an easy way to test this in our test suite since binding these ports is privileged and they are often in use, anyways.
2022-07-20	public-inbox-pop3d - a mostly read-only POP3 server
	Old account expiry has not been implemented, but it seems to work well with both mpop(1) and getmail(1). The strictness of mpop was particularly helpful in ironing out bugs in our implementation of (dreaded) message sequence numbers. "EXPIRE 0" (RFC 2449) can theoretically save numerous "DELE" commands, but that's untested by real-world clients. mpop supports PIPELINING which is effective in hiding latency, and the core networking functionality is already well-tested from our NNTP and IMAP implementations. Configuration requires "publicinbox.pop3state" to point to a directory writable by the otherwise read-only daemon. See public-inbox-pop3d(1) manpage for more usage details.
2022-07-20	netd: load modules for well-known ports
	When inheriting well-known ports from systemd (or similar), we can auto-load the proper *D.pm file based on the port number without requiring command-line args. load_mod also gets fixed to use its argument, instead of implicit $1 since that won't work for our well-known ports.
2022-05-08	daemon: fix uninitialized variable
	And also replace an unnecessary substitution (s///) op with a match (m//). Fixes: 93a7b219d58aad86 ("public-inbox-netd: a multi-protocol superserver")
2022-05-05	public-inbox-netd: a multi-protocol superserver
	Since we'll be adding POP3 support as our 4th network protocol, asking admins to run yet another daemon on top of existing -httpd, -nntpd, -imapd is a maintenance burden and a waste of memory. The goal of public-inbox-netd is to be able to replace all existing read-only daemons with a single process to save memory and reduce administrative overhead; hopefully encouraging more users to self-host their own mirrors. It's barely-tested at the moment. Eventually, multiple PI_CONFIG and HOME directories will be supported, as will per-listener .psgi config files.
2021-10-16	imapd+nntpd: drop timer-based expiration
	It's needlessly complex and O(n), so it doesn't scale well to a high number of clients nor is it easy to scale with the data structures available to us in pure Perl. In any case, I see no evidence of either -imapd or -nntpd experiencing high connection loads on public-facing sites. -httpd has never had its own timer-based expiration, either. Fwiw, public-inbox.org itself has been running a public-facing HTTP/HTTPS server with no userspace idle client expiration for the past 8 years or so with no ill effect. Clients can come and go as they wish, and SO_KEEPALIVE takes care of truly broken connections if they're gone for ~2 hours. Internet connections drop all the time, so it should be harmless to drop connections w/o warning since both NNTP and IMAP protocols have well-defined semantics for determining if a message was truncated (as does HTTP/1.1+).
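Leaving dead-peer detection to the kernel, as described above, amounts to enabling SO_KEEPALIVE on each socket. A minimal Python sketch (illustrative only, not the project's Perl code) follows; the idle time before probes start is a system setting and is left at its default here.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Ask the kernel to probe idle connections instead of running a
# userspace expiration timer over every client.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
sock.close()
```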
2021-10-13	daemon: set $SIG{__WARN__} properly
	Eml->warn_ignore_cb itself returns a callback, so creating a reference to it was wrong when assigning it to $SIG{__WARN__}. Fixes: 176cd51f9aa81b74 ("daemon: quiet down Eml-related warnings")
2021-10-12	daemon: quiet down Eml-related warnings
	Email::Address::XS is quite noisy and there's nothing we can really do about messages we're serving from read-only daemons.
2021-10-12	daemon: use v5.10.1, disable local warnings
	We're moving towards relying on "perl -w" for warnings and v5.12 for strict.
2021-10-08	git: fatalize async callback errors by default
	This should help us catch BUG: errors (and then some) in -extindex and other read-write code paths. Only read-only daemons should warn on async callback failures, since those aren't capable of causing data loss.
2021-10-01	ds: simplify signalfd use
	Since signalfd is often combined with our event loop, give it a convenient API and reduce the code duplication required to use it. EventLoop is replaced with ::event_loop to allow consistent parameter passing and avoid needlessly passing the package name on the stack. We also avoid exporting SFD_NONBLOCK since it's the only flag we support. There's no sense in having the memory overhead of a constant function when it's in cold code.
2021-10-01	daemon: make SO_ACCEPTFILTER a shared variable
	Constant subroutines use more memory and there's no need to optimize it for inlining since it's only used at startup.
2021-05-23	treewide: favor open(..., '+<&=', $fd)
	Cut down on unnecessary imports of IO::Handle and method lookup + dispatch overhead.
2021-01-24	treewide: reseed RNG in child processes
	This prevents name conflicts leading to retries and slowdowns in temporary file name generation. No actual data corruption resulted because all temporary files are opened with O_EXCL anyways. This may increase security for IMAP, NNTP, and HTTPS sessions using TLS, but it's all public data anyways.
2021-01-14	daemon+watch: fix localization of %SIG for non-signalfd users
	It turns out "local" did not take effect in the way we used it: http://nntp.perl.org/group/perl.perl5.porters/258784 <CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com> Fortunately, none of the old use cases seem affected, unlike the previous lei change to ensure consistent SIGPIPE handling.
2021-01-12	ds: block signals when reaping
	This lets us call dwaitpid long before a process exits and not have to wait around for it. This is advantageous for lei where we can run dwaitpid on the pager as soon as we spawn it, instead of waiting for a client socket to go away on DESTROY.
2021-01-01	update copyrights for 2021
	Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2021-01-01	syscall: SFD_NONBLOCK can be a constant, again
	Since Perl exposes O_NONBLOCK as a constant, we can safely make SFD_NONBLOCK a constant, too. This is not the case for SFD_CLOEXEC, since O_CLOEXEC is not exposed by Perl despite being used internally in the interpreter.