-rw-r--r--  Documentation/technical/ds.txt | 112
-rw-r--r--  MANIFEST                       |   1
-rw-r--r--  lib/PublicInbox/DS.pm          |  16
3 files changed, 121 insertions, 8 deletions
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
new file mode 100644
index 00000000..cbd06cfb
--- /dev/null
+++ b/Documentation/technical/ds.txt
@@ -0,0 +1,112 @@
+PublicInbox::DS - event loop and async I/O base class
+
+Our PublicInbox::DS event loop which powers public-inbox-nntpd
+and public-inbox-httpd diverges significantly from the
+unmaintained Danga::Socket package we forked from. In fact,
+it's probably different from most other event loops out there.
+
+Most notably:
+
+* There is one and only one callback: ->event_step. Unlike other
+  event loops, there are no separate callbacks for read, write,
+  error or hangup events. In fact, we never care which kevent
+  filter or poll/epoll event flag (e.g. POLLIN/POLLOUT/POLLHUP)
+  triggers a call.
+
+  The lack of read/write callback distinction is driven by the
+  fact TLS libraries (e.g. OpenSSL via IO::Socket::SSL) may
+  declare SSL_WANT_READ on SSL_write(), and SSL_WANT_WRITE on
+  SSL_read(). So we end up having to let each user object decide
+  whether it wants to make read or write calls depending on its
+  internal state, completely independent of the event loop.
+
+  Error and hangup (POLLERR and POLLHUP) callbacks are redundant and
+  only triggered in rare cases. They're redundant because the
+  result of every read and write call in ->event_step must be
+  checked, anyways. At best, callbacks for POLLHUP and POLLERR can
+  save one syscall per socket lifetime and are not worth the extra
+  code they impose.
+
+  Reducing the user-supplied code down to a single callback allows
+  subclasses to keep their logic self-contained. The combination
+  of this change and one-shot wakeups (see below) for bidirectional
+  data flows makes asynchronous code easier to reason about.
+
+Other divergences:
+
+* ->write buffering uses temporary files whereas Danga::Socket used
+  the heap. The rationale for this is the kernel already provides
+  ample (and configurable) space for socket buffers. Modern kernels
+  also cache FS operations aggressively, so systems with ample RAM
+  are unlikely to notice degradation, while small systems are less
+  likely to suffer unpredictable heap fragmentation, swap and OOM
+  penalties.
+
+  In the future, we may introduce sendfile and mmap+SSL_write to
+  reduce data copies, and use FALLOC_FL_PUNCH_HOLE on Linux to
+  release space after the buffer is partially cleared.
+
+Augmented features:
+
+* obj->write(CODEREF) passes the object itself to the CODEREF.
+  Being able to enqueue subroutine calls is a powerful feature in
+  Danga::Socket for keeping linear logic in an asynchronous environment.
+  Unfortunately, each subroutine takes several kilobytes of memory.
+  One small change to Danga::Socket is to pass the receiver object
+  (aka "$self") to the CODEREF. $self can store any necessary
+  state it needs for a normal (named) subroutine. This allows us to
+  put the same sub into multiple queues without paying a large
+  memory penalty for each one.
+
+  This idea is also more easily ported to C or other languages which
+  lack anonymous subroutines (aka "closures").
+
+* ->requeue support. An optimization of the AddTimer(0, ...) idiom
+  for immediately dispatching code at the next event loop iteration.
+  public-inbox uses this for fairly generating large responses
+  iteratively (see PublicInbox::NNTP::long_response or the use of
+  ->getline callbacks for generating gigantic gzipped mboxes).
+
+New features:
+
+* One-shot wakeups allowed via EPOLLONESHOT or EV_DISPATCH. These
+  flags allow us to simplify code in ->event_step callbacks for
+  bidirectional sockets (NNTP and HTTP). Instead of merely reacting
+  to events, control is handed over at ->event_step in one-shot scenarios.
+  The event_step caller (NNTP || HTTP) then becomes proactive in declaring
+  which (if any) events it's interested in for the next loop iteration.
+
+* Edge-triggering available via EPOLLET or EV_CLEAR. These reduce wakeups
+  for unidirectional classes (e.g. PublicInbox::Listener sockets,
+  and pipes via PublicInbox::HTTPD::Async).
+
+* IO::Socket::SSL support (for NNTPS, STARTTLS+NNTP, HTTPS)
+
+* dwaitpid (waitpid wrapper) support for reaping dead children
+
+* reliable signal wakeups are supported via signalfd on Linux,
+  EVFILT_SIGNAL on *BSDs via IO::KQueue.
+
+Removed features:
+
+* Many fields removed or moved to subclasses, so the underlying
+  hash is smaller and suitable for FDs other than stream sockets.
+  Some fields we enforce (e.g. wbuf, wbuf_off) are autovivified
+  on an as-needed basis to save memory when they're not needed.
+
+* TCP_CORK support removed, instead we use MSG_MORE on non-TLS sockets
+  and we may use vectored I/O support via GnuTLS in the future
+  for TLS sockets.
+
+* per-FD PLCMap (post-loop callback) removed, we got ->requeue
+  support where no extra hash lookups or assignments are necessary.
+
+* read push backs removed. Some subclasses use a read buffer ({rbuf})
+  but they control it, not this event loop.
+
+* Profiling and debug logging removed. Perl and OS-specific tracers
+  and profilers are sufficient.
+
+* ->AddOtherFds support removed, everything watched is a subclass of
+  PublicInbox::DS, but we've slimmed down the fields to eliminate
+  the memory penalty for objects.
diff --git a/MANIFEST b/MANIFEST
--- a/MANIFEST
+++ b/MANIFEST
@@ -34,6 +34,7 @@ Documentation/public-inbox-watch.pod
 Documentation/public-inbox-xcpdb.pod
 Documentation/public-inbox.cgi.pod
 Documentation/standards.perl
+Documentation/technical/ds.txt
 Documentation/txt2pre
 HACKING
 INSTALL
diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 09dc3992..058b1358 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -3,15 +3,15 @@
 #
 # This license differs from the rest of public-inbox
 #
-# This is a fork of the (for now) unmaintained Danga::Socket 1.61.
-# Unused features will be removed, and updates will be made to take
-# advantage of newer kernels.
+# This is a fork of the unmaintained Danga::Socket (1.61) with
+# significant changes. See Documentation/technical/ds.txt in our
+# source for details.
 #
-# API changes to diverge from Danga::Socket will happen to better
-# accomodate new features and improve scalability. Do not expect
-# this to be a stable API like Danga::Socket.
-# Bugs encountered (and likely fixed) are reported to
-# bug-Danga-Socket@rt.cpan.org and visible at:
+# Do not expect this to be a stable API like Danga::Socket,
+# but it will evolve to suit our needs and to take advantage of
+# newer Linux and *BSD features.
+# Bugs encountered were reported to bug-Danga-Socket@rt.cpan.org,
+# fixed in Danga::Socket 1.62 and visible at:
 # https://rt.cpan.org/Public/Dist/Display.html?Name=Danga-Socket
 package PublicInbox::DS;
 use strict;
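
Note on the obj->write(CODEREF) item in ds.txt: the idea can be
sketched in a few lines of plain Perl. This is only an illustration
under stated assumptions, not PublicInbox::DS code. ToyWriter, flush,
send_footer and the {out} and {name} fields are hypothetical names
made up for this example, and the queue is drained synchronously
rather than from an event loop. It shows how a single named sub can
sit in several write queues because each call receives the object.

```perl
use strict;
use warnings;

package ToyWriter; # hypothetical stand-in, not a PublicInbox::DS subclass

sub new {
    my ($class, %fields) = @_;
    return bless { wbuf => [], out => '', %fields }, $class;
}

# Enqueue either a plain string or a CODE ref.
sub write {
    my ($self, $item) = @_;
    push @{$self->{wbuf}}, $item;
    return;
}

# Drain the queue in order. A CODE ref is called with the object
# itself, so it needs no closure to find per-object state.
sub flush {
    my ($self) = @_;
    while (defined(my $item = shift @{$self->{wbuf}})) {
        if (ref($item) eq 'CODE') {
            $item->($self);
        } else {
            $self->{out} .= $item;
        }
    }
    return;
}

package main;

# One named sub shared by all objects; its state comes from $self.
sub send_footer {
    my ($self) = @_;
    $self->{out} .= "-- $self->{name}\n";
    return;
}

my $first = ToyWriter->new(name => 'first');
my $second = ToyWriter->new(name => 'second');
for my $obj ($first, $second) {
    $obj->write("hello\n");
    $obj->write(\&send_footer); # same CODE ref in both queues
    $obj->flush;
}
print $first->{out}, $second->{out};
```

Both objects queue the same \&send_footer reference, yet each footer
carries its own name, because the per-object state lives in $self
rather than in a per-connection closure.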