about summary refs log tree commit homepage
path: root/lib/PublicInbox/HTTPD/Async.pm
DateCommit message (Collapse)
2023-10-25drop psgi_return, httpd/async and GetlineBody
Now that psgi_yield is used everywhere, the more complex psgi_return and it's helper bits can be removed. We'll also fix some outdated comments now that everything on psgi_return has switched to psgi_yield. GetlineResponse replaces GetlineBody and does a better job of isolating generic PSGI-only code.
2023-10-25httpd/async: require IO arg
Callers that want to requeue can call PublicInbox::DS::requeue directly and not go through the convoluted argument handling via PublicInbox::HTTPD::Async->new.
2023-10-08introduce ProcessIONBF for multiplexed non-blocking IO
This is required for reliable epoll/kevent/poll/select wakeup notifications, since we have no visibility into the buffer states used internally by Perl. We can safely use sysread here since we never use the :utf8 nor any :encoding Perl IO layers for readable pipes. I suspect this fixes occasional failures from t/solver_git.t when retrieving the WwwCoderepo summary.
2023-10-08rename ProcessPipe to ProcessIO
Since we deal with pipes (of either direction) and bidirectional stream sockets for this class, it's better to remove the `Pipe' from the name and replace it with `IO' to communicate that it works for any form of IO::Handle-like object tied to a process.
2023-01-06httpd/async: retry reads properly when parsing headers
While git-http-backend sends headers with one write syscall, upstream cgit still trickles them out line-by-line and we need to account for that and retry Qspawn {parse_hdr} callbacks.
2022-12-23httpd/async + qspawn: rename {fh} fields
Use more unique names within the project to minimize confusion since these packages interact quite a bit and using identical names leads to needless confusion.
2022-12-23httpd/async: remove useless undef
Assigning `undef' to a scalar doesn't free it's memory, we need to call `undef($var)' in the caller. It's also been pointless since we simplified ->async_pass in commit b7fbffd1f8c12556 (httpd/async: get rid of ephemeral main_cb, 2019-12-25)
2022-12-23httpd: avoid crash on cgit -> coderepo 404 fallback
A trickled cgit response can cause HTTPD::Async->event_step to fire an extra time after header parsing. We need to account for the lack of async_pass call populating ->{fh} and ->{http} in that case and avoid calling $self->{fh}->write when there's no {fh}.
2022-09-10httpd/async: describe which ->write subs it can call
I initially wanted to rename GzipFilter->write to GzipFilter->writev to reflect the multi-argument nature of the sub, and it wasn't worth the memory to maintain an alias.
2021-10-16httpd/async: switch to level-triggered epoll
We'll save ourselves some code here and let the kernel do more work, instead.
2021-02-07httpd/async: avoid unnecessary on-stack delete
While this doesn't fix a known problem, this was a risky construct in case somebody uses confess/longmess inside the user-supplied callback. cf. commit 0795b0906cc81f40 ("ds: guard against stack-not-refcounted quirk of Perl 5")
2021-02-05httpd/async: set O_NONBLOCK correctly
While Perl tie is nice for some things, getting IO::Handle->blocking to work transparently with it doesn't seem possible at the moment. Add some examples in t/spawn.t for future hackers. Fixes: 22e51bd9da476fa9 ("qspawn: switch to ProcessPipe via popen_rd")
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-06-28ds: remove fields.pm usage
Since the removal of pseudo-hash support in Perl 5.10, the "fields" module no longer provides the space or speed benefits it did in 5.8. It also does not allow for compile-time checks, only run-time checks. To me, the extra developer overhead in maintaining "use fields" args has become a hassle. None of our non-DS-related code uses fields.pm, nor do any of our current dependencies. In fact, Danga::Socket (which DS was originally forked from) and its subclasses are the only fields.pm users I've ever encountered in the wild. Removing fields may make our code more approachable to other Perl hackers. So stop using fields.pm and locked hashes, but continue to document what fields do for non-trivial classes.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2019-12-26httpd/async: get rid of ephemeral main_cb
Cheaper to use up two hash table slots than creating a new sub.
2019-12-26qspawn: replace anonymous $end callbacks w/ event_step
This will tie into the DS event loop if that's used, but event_step an be called directly without relying on the event loop from Apache or other HTTP servers (or PSGI tests).
2019-12-26httpd/async: support passing arg to callbacks
Another step towards removing anonymous subs to eliminate a possible source of memory leaks and high memory use.
2019-09-14httpd/async: improve naming and comments
Rename the {cleanup} field to {end}, since it's similar to END {} and is consistent with the variable in Qspawn.pm And document how certain subs get called, since we have many subs named "new" and "close".
2019-09-09run update-copyrights from gnulib for 2019
2019-06-29http: use bigger, but shorter-lived buffers for pipes
Linux pipes default to 65536 bytes in size, and we want to read external processes as fast as possible now that we don't use Danga::Socket or buffer to heap. However, drop the buffer ASAP if we need to wait on anything; since idle buffers can be idle for eons. This lets other execution contexts can reuse that memory right away.
2019-06-29httpd/async: switch to buffering-as-fast-as-possible
With DS buffering to a temporary file nowadays, applying backpressure to git-http-backend(1) hurts overall memory usage of the system. Instead, try to get git-http-backend(1) to finish as quickly as possible and use edge-triggered notifications to reduce wakeups on our end.
2019-06-29ds: handle deferred DS->close after timers
Our hacks in EvCleanup::next_tick and EvCleanup::asap were due to the fact "closed" sockets were deferred and could not wake up the event loop, causing certain actions to be delayed until an event fired. Instead, ensure we don't sleep if there are pending sockets to close. We can then remove most of the EvCleanup stuff While we're at it, split out immediate timer handling into a separate array so we don't need to deal with time calculations for the event loop.
2019-06-24ds: pass $self to code references
We can reduce the amount of short-lived anonymous subs we create by passing $self to code references.
2019-06-24http: don't pass extra args to PublicInbox::DS::close
YAGNI Followup-to: commit 30ab5cf82b9d47242640f748a0f9a088ca783e32 ("ds: reduce Errno imports and drop ->close reason")
2019-06-24ds: favor `delete' over assigning fields to `undef'
This is cleaner in most cases and may allow Perl to reuse memory from unused fields. We can do this now that we no longer support Perl 5.8; since Danga::Socket was written with struct-like pseudo-hash support in mind, and Perl 5.9+ dropped support for pseudo-hashes over a decade ago.
2019-06-24http|nntp: favor "$! == EFOO" over $!{EFOO} checks
Integer comparisions of "$!" are faster than hash lookups. See commit 6fa2b29fcd0477d126ebb7db7f97b334f74bbcbc ("ds: cleanup Errno imports and favor constant comparisons") for benchmarks.
2019-06-24httpd/async: remove EINTR check
This pipe is always non-blocking when run under public-inbox-httpd and it won't fail with EINTR in that case
2019-06-24ds: get rid of event_watch field
We don't need to keep track of that field since we always know what events we're interested in when using one-shot wakeups.
2019-06-24ds: set event flags directly at initialization
We can avoid the EPOLL_CTL_ADD && EPOLL_CTL_MOD sequence with a single EPOLL_CTL_ADD.
2019-06-24ds: lazy-initialize wbuf
We don't need write buffering unless we encounter slow clients requesting large responses. So don't waste a hash slot or (empty) arrayref for it.
2019-06-24ds: get rid of {closed} field
Merely checking the presence of the {sock} field is enough, and having multiple sources of truth increases confusion and the likelyhood of bugs.
2019-06-16ds: stop distinguishing event read and write callbacks
Having separate read/write callbacks in every class is too confusing to my easily-confused mind. Instead, give every class an "event_step" callback which is easier to wrap my head around. This will make future code to support IO::Socket::SSL-wrapped sockets easier-to-digest, since SSL_write() can require waiting on POLLIN events, and SSL_read() can require waiting on POLLOUT events.
2019-06-10ds: do not distinguish between POLLHUP and POLLERR
In my experience, both are worthless as any normal read/write call path will be wanting to check errors and deal with them appropriately; so we can just call event_read, for now. Eventually, there'll probably be only one callback for dealing with all in/out/err/hup events to simplify logic, especially w.r.t TLS socket negotiation.
2019-06-10ds: simplify write buffer accounting
Keeping track of write_buf_size was redundant and pointless when we can simply check the number of elements in the buffer array. Multiple sources of truth leads to confusion; confusion leads to bugs. Finally, rename the prefixes to 'wbuf' to ensure we loudly (instead of silently) break any external dependencies being ported over from Danga::Socket, as further changes are pending.
2019-05-04bundle Danga::Socket and Sys::Syscall
These modules are unmaintained upstream at the moment, but I'll be able to help with the intended maintainer once/if CPAN ownership is transferred. OTOH, we've been waiting for that transfer for several years, now... Changes I intend to make: * EPOLLEXCLUSIVE for Linux * remove unused fields wasting memory * kqueue bugfixes e.g. https://rt.cpan.org/Ticket/Display.html?id=116615 * accept4 support And some lower priority experiments: * switch to EV_ONESHOT / EPOLLONESHOT (incompatible changes) * nginx-style buffering to tmpfile instead of string array * sendfile off tmpfile buffers * io_uring maybe?
2019-01-26solver: rewrite to use Qspawn->psgi_qx and pi-httpd.async
The psgi_qx routine in the now-abandoned "repobrowse" branch allows us to break down blob-solving at each process execution point. It reuses the Qspawn facility for git-http-backend(1), allowing us to limit parallel subprocesses independently of Perl worker count. This is actually a 2-3% slower a fully-synchronous execution; but it is fair to other clients as it won't monopolize the server for hundreds of milliseconds (or even seconds) at a time.
2019-01-22httpd/async: stop running command if client disconnects
If an HTTP client disconnects while we're piping the output of a process to them, break the pipe of the process to reclaim resources as soon as possible.
2019-01-22qspawn|httpd/async: improve and fix out-of-date comments
2019-01-22httpd/async: remove needless sysread wrapper
We don't appear to be using it anywhere
2018-02-07update copyrights for 2018
Using update-copyrights from gnulib While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-01-04httpd/async: remove weaken usage
We do not need to use weaken() here, so avoid it to simplify our interactions with Perl; as weaken requires additional storage and (it seems) time complexity.
2016-12-25httpd/async: improve variable naming
We only refer to PublicInbox::HTTP objects here, so '$io' was a bad name.
2016-07-09httpd/async: reinstate D::S timer usage for cleanup
EvCleanup::asap events are not guaranteed to run after Danga::Socket closes sockets at the event loop. Thus we must use slower Danga::Socket timers which are guaranteed to run at the end of the event loop.
2016-07-09httpd/async: do not attempt future writes on closed sockets
Danga::Socket::close does not clear the write_buf_size field, so it's conceivable we could attempt to queue up data and callbacks we can never flush out.
2016-06-18daemon: be less misleading about graceful shutdown
We do not need to count the httpd.async object against our running client count, that is tied to the socket of the actual client. This prevents misleading sysadmins about connected clients during shutdown.
2016-05-30http: yield body->getline running time
We cannot let a client monopolize the single-threaded server even if it can drain the socket buffer faster than we can emit data. While we're at it, acknowledge the this behavior (which happens naturally) in httpd/async. The same idea is present in NNTP for the long_response code. This is the HTTP followup to: commit 0d0fde0bff97 ("nntp: introduce long response API for streaming") commit 79d8bfedcdd2 ("nntp: avoid signals for long responses")
2016-05-27httpd/async: do not needlessly weaken
The restart_read callback has no chance of circular reference, and weakening $self before we create it can cause $self to be undefined inside the callback (seen during stress testing). Fixes: 395406118cb2 ("httpd/async: prevent circular reference")
2016-05-27httpd/async: prevent circular reference
We must avoid circular references which can cause leaks in long-running processes. This callback is dangerous since it may never be called to properly terminate everything.
2016-05-24standardize timer-related event-loop code
Standardize the code we have in place to avoid creating too many timer objects. We do not need exact timers for things that don't need to be run ASAP, so we can play things fast and loose to avoid wasting power with unnecessary wakeups. We only need two classes of timers: * asap - run this on the next loop tick, after operating on @Danga::Socket::ToClose to close remaining sockets * later - run at some point in the future. It could be as soon as immediately (like "asap"), and as late as 60s into the future. In the future, we support an "emergency" switch to fire "later" timers immediately.