public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2019-09-17	http: remove unnecessary delete
	Only removing $http->{env} is needed to prevent circular references. $env->{'psgix.io'} does not need to be deleted since $env will no longer have any references to it when ->close returns.
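The reference cycle described above can be reproduced in isolation. This is a minimal sketch rather than the project's code: the Conn class and the $destroyed counter are hypothetical, and only the {env} and 'psgix.io' key names come from the commit message.

```perl
use strict;
use warnings;

my $destroyed = 0;    # hypothetical counter, bumped by Conn::DESTROY

{
    package Conn;     # hypothetical stand-in for the HTTP client object
    sub new { bless {}, shift }
    sub DESTROY { $destroyed++ }
}

# Case 1: $http -> {env} -> {'psgix.io'} -> $http forms a cycle, so the
# object is not freed when the lexical goes out of scope.
{
    my $http = Conn->new;
    $http->{env} = { 'psgix.io' => $http };
}
my $after_cycle = $destroyed;

# Case 2: deleting {env} alone breaks the cycle. The env hash is freed,
# which drops its reference back to $http; nothing else needs deleting.
{
    my $http = Conn->new;
    $http->{env} = { 'psgix.io' => $http };
    delete $http->{env};
}
my $after_delete = $destroyed;
```

In the first case DESTROY only runs at global destruction; in the second it runs as soon as the block ends.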
2019-09-17	http: drop unused `$env' variable after delete
	And explain why we need to do that delete in a comment.
2019-09-14	tmpfile: give temporary files meaningful names
	Although we always unlink temporary files, give them a meaningful name so that we can still make sense of the pre-unlink name when using lsof(8) or similar tools on Linux.
2019-09-09	run update-copyrights from gnulib for 2019

2019-07-10	http|nntp: avoid recursion inside ->write
	In HTTP.pm, we can use the same technique NNTP.pm uses with long_response with the $long_cb callback and avoid storing $pull in the per-client structure at all. We can also reuse the same logic to push the callback into wbuf from NNTP. This does NOT introduce a new circular reference, but documents it more clearly.
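The idea of pushing a callback into the write buffer instead of recursing can be sketched as follows. Everything here is an assumption made for illustration (the Client package, the {flushing} guard, the in-memory {out} string standing in for a socket, and the long_cb function); it is not the actual PublicInbox::DS code.

```perl
use strict;
use warnings;

package Client;    # hypothetical client object; {out} stands in for a socket

sub new { bless { out => '' }, shift }

# Queue a string or a code reference. When called from inside flush
# (i.e. from a queued callback), only queue; do not recurse.
sub write {
    my ($self, $item) = @_;
    push @{$self->{wbuf}}, $item;
    $self->flush unless $self->{flushing};
}

# Drain the queue iteratively. Callbacks receive $self and may queue
# more items, which this same loop picks up.
sub flush {
    my ($self) = @_;
    local $self->{flushing} = 1;
    my $wbuf = $self->{wbuf} or return;
    while (defined(my $item = shift @$wbuf)) {
        if (ref($item) eq 'CODE') {
            $item->($self);
        } else {
            $self->{out} .= $item;
        }
    }
    delete $self->{wbuf};
}

package main;

my @lines = ("a\n", "b\n", "c\n");    # hypothetical long response

# Emits one line, then re-queues itself until the lines run out.
sub long_cb {
    my ($self) = @_;
    my $line = shift @lines;
    return unless defined $line;
    $self->write($line);
    $self->write(\&long_cb);
}

my $client = Client->new;
$client->write(\&long_cb);
my $out = $client->{out};
```

The call stack never grows with the length of the response, since each step returns to the loop in flush before the next one runs.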
2019-07-08	http|nntp: "use PublicInbox::DS" instead of ->import
	Relying on "use" to import during BEGIN means we get to take advantage of prototype checking of function args during the rest of the compilation phase.
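A small sketch of why import timing matters for prototypes. The My::Util package and its add2 function are hypothetical; the BEGIN block stands in for what a "use" statement does at compile time, and the string eval is only there to capture the compile error.

```perl
use strict;
use warnings;

# Stand-in for "use My::Util": the sub and its ($$) prototype are
# installed into main:: during BEGIN, before later code is compiled.
BEGIN {
    package My::Util;    # hypothetical module
    sub add2 ($$) { $_[0] + $_[1] }
    no strict 'refs';
    *{"main::add2"} = \&add2;
}

# Compiled after the import, so the prototype is known and checked.
my $ok = add2(1, 2);

# A call with the wrong number of args is rejected when it is compiled;
# the string eval delays compilation so the error can be inspected.
my $bad = eval 'add2(1, 2, 3)';
my $err = $@;
```

Had the symbol been installed at run time via ->import, calls compiled earlier would not have been checked against the prototype.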
2019-06-29	httpd/async: switch to buffering-as-fast-as-possible
	With DS buffering to a temporary file nowadays, applying backpressure to git-http-backend(1) hurts overall memory usage of the system. Instead, try to get git-http-backend(1) to finish as quickly as possible and use edge-triggered notifications to reduce wakeups on our end.
2019-06-29	http: support HTTPS (kinda)
	It's barely any effort at all to support HTTPS now that we have NNTPS support and can share all the code for writing daemons. However, we still depend on Varnish to avoid hug-of-death situations, so supporting reverse-proxying will be required.
2019-06-29	ds: handle deferred DS->close after timers
	Our hacks in EvCleanup::next_tick and EvCleanup::asap were due to the fact "closed" sockets were deferred and could not wake up the event loop, causing certain actions to be delayed until an event fired. Instead, ensure we don't sleep if there are pending sockets to close. We can then remove most of the EvCleanup stuff. While we're at it, split out immediate timer handling into a separate array so we don't need to deal with time calculations for the event loop.
2019-06-29	http: use requeue instead of watch_in1
	Don't use epoll or kqueue to watch for anything unless we hit EAGAIN, since we don't know if a socket is SSL or not.
2019-06-29	ds: share lazy rbuf handling between HTTP and NNTP
	Doing this for HTTP cuts the memory usage of 10K idle-after-one-request HTTP clients from 92 MB to 47 MB. The savings over the equivalent NNTP change in commit 6f173864f5acac89769a67739b8c377510711d49 ("nntp: lazily allocate and stash rbuf") seem to be down to the size of HTTP requests and the fact that HTTP is a client-sends-first protocol whereas NNTP is server-sends-first.
2019-06-24	allow use of PerlIO layers for filesystem writes
	It may make sense to use PerlIO::mmap or PerlIO::scalar for DS write buffering with IO::Socket::SSL or similar (since we can't use MSG_MORE), so that means we need to go through buffering in userspace for the common case; while still being easily compatible with slow clients. And it also simplifies GitHTTPBackend slightly. Maybe it can make sense for HTTP input buffering, too...
2019-06-24	ds: hoist out do_read from NNTP and HTTP
	Both NNTP and HTTP have common needs and we can factor out some common code to make dealing with IO::Socket::SSL easier.
2019-06-24	http|nntp: be explicit about bytes::length on rbuf
	It should not matter because our rbuf is always from a socket without encoding layers, but this makes things easier to follow.
2019-06-24	ds: pass $self to code references
	We can reduce the amount of short-lived anonymous subs we create by passing $self to code references.
2019-06-24	http: don't pass extra args to PublicInbox::DS::close
	YAGNI. Followup-to: commit 30ab5cf82b9d47242640f748a0f9a088ca783e32 ("ds: reduce Errno imports and drop ->close reason")
2019-06-24	ds: favor `delete' over assigning fields to `undef'
	This is cleaner in most cases and may allow Perl to reuse memory from unused fields. We can do this now that we no longer support Perl 5.8; since Danga::Socket was written with struct-like pseudo-hash support in mind, and Perl 5.9+ dropped support for pseudo-hashes over a decade ago.
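The difference between the two styles is visible with exists and keys. A minimal sketch with made-up field values:

```perl
use strict;
use warnings;

# Two copies of a hypothetical per-client hash.
my %undef_style  = (sock => 'placeholder', rbuf => 'partial request');
my %delete_style = %undef_style;

$undef_style{rbuf} = undef;    # key stays in the hash with an undef value
delete $delete_style{rbuf};    # key is removed from the hash

my $undef_exists  = exists $undef_style{rbuf}  ? 1 : 0;
my $delete_exists = exists $delete_style{rbuf} ? 1 : 0;
my $undef_keys    = scalar keys %undef_style;
my $delete_keys   = scalar keys %delete_style;
```

With delete, a plain exists check is enough to tell whether a field is in use.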
2019-06-24	http|nntp: favor "$! == EFOO" over $!{EFOO} checks
	Integer comparisons of "$!" are faster than hash lookups. See commit 6fa2b29fcd0477d126ebb7db7f97b334f74bbcbc ("ds: cleanup Errno imports and favor constant comparisons") for benchmarks.
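Both forms of the check give the same answer; only the cost differs. The sketch below triggers EAGAIN with a non-blocking read on an empty socketpair, which is an assumption made for the demo and not how the daemon code is structured.

```perl
use strict;
use warnings;
use Errno qw(EAGAIN);
use IO::Handle;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

socketpair(my $r, my $w, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
$r->blocking(0);

# Nothing has been written to $w, so this read cannot complete.
my $n = sysread($r, my $buf, 4096);

# Numeric comparison against the imported constant
my $by_const = (!defined($n) && $! == EAGAIN) ? 1 : 0;

# Equivalent check through the %! hash
my $by_hash = (!defined($n) && $!{EAGAIN}) ? 1 : 0;
```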
2019-06-24	ds: get rid of event_watch field
	We don't need to keep track of that field since we always know what events we're interested in when using one-shot wakeups.
2019-06-24	ds: set event flags directly at initialization
	We can avoid the EPOLL_CTL_ADD && EPOLL_CTL_MOD sequence with a single EPOLL_CTL_ADD.
2019-06-24	ds: switch write buffering to use a tempfile
	Data which can't fit into a generously-sized socket buffer has no business being stored in heap.
2019-06-24	ds: share send(..., MSG_MORE) logic
	No sense in having similar Linux-specific functionality in both our NNTP.pm and HTTP.pm.
2019-06-24	http: favor DS->write(strref) when reasonable
	This can avoid large memory copies when strings can't be copy-on-write and saves us the trouble of creating new refs in the code.
2019-06-24	ds: lazy-initialize wbuf
	We don't need write buffering unless we encounter slow clients requesting large responses. So don't waste a hash slot or (empty) arrayref for it.
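Lazy initialization of this kind falls out of Perl's autovivification. A minimal sketch, assuming a plain hash for the client and an array of pending strings (the real buffer layout may differ):

```perl
use strict;
use warnings;

my $self = { sock => 'placeholder' };    # hypothetical idle client

# Idle clients carry no {wbuf} slot at all.
my $idle_has_wbuf = exists $self->{wbuf} ? 1 : 0;

# The first buffered write autovivifies the array.
push @{$self->{wbuf}}, "pending data";
my $pending = scalar @{$self->{wbuf}};

# Once drained, drop the slot again so the client returns to idle shape.
shift @{$self->{wbuf}};
delete $self->{wbuf} unless @{$self->{wbuf}};
my $drained_has_wbuf = exists $self->{wbuf} ? 1 : 0;
```

The element count of the array doubles as the only bookkeeping needed for buffered data.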
2019-06-24	ds: get rid of {closed} field
	Merely checking the presence of the {sock} field is enough, and having multiple sources of truth increases confusion and the likelihood of bugs.
2019-06-16	ds: stop distinguishing event read and write callbacks
	Having separate read/write callbacks in every class is too confusing to my easily-confused mind. Instead, give every class an "event_step" callback which is easier to wrap my head around. This will make future code to support IO::Socket::SSL-wrapped sockets easier-to-digest, since SSL_write() can require waiting on POLLIN events, and SSL_read() can require waiting on POLLOUT events.
2019-06-10	ds: do not distinguish between POLLHUP and POLLERR
	In my experience, both are worthless as any normal read/write call path will be wanting to check errors and deal with them appropriately; so we can just call event_read, for now. Eventually, there'll probably be only one callback for dealing with all in/out/err/hup events to simplify logic, especially w.r.t TLS socket negotiation.
2019-06-10	ds: simplify write buffer accounting
	Keeping track of write_buf_size was redundant and pointless when we can simply check the number of elements in the buffer array. Multiple sources of truth lead to confusion; confusion leads to bugs. Finally, rename the prefixes to 'wbuf' to ensure we loudly (instead of silently) break any external dependencies being ported over from Danga::Socket, as further changes are pending.
2019-06-04	http: require SERVER_PORT to be ASCII digit
	I'm not sure what middlewares care for SERVER_PORT; but allowing non-ASCII digits seems nonsensical, here.
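The distinction matters because \d in a Perl regex can match non-ASCII digits on Unicode strings. A minimal sketch; the port_ok helper is hypothetical and not the check used in HTTP.pm:

```perl
use strict;
use warnings;

# Hypothetical validator: ASCII digits only, no trailing newline.
sub port_ok {
    my ($port) = @_;
    defined($port) && $port =~ /\A[0-9]+\z/ ? 1 : 0;
}

my $ascii  = "8080";
my $arabic = "\x{0668}\x{0660}";    # ARABIC-INDIC DIGIT EIGHT and ZERO

# \d accepts the Unicode digits, the explicit class does not.
my $loose_match = ($arabic =~ /\A\d+\z/) ? 1 : 0;
my $ascii_ok    = port_ok($ascii);
my $arabic_ok   = port_ok($arabic);
my $newline_ok  = port_ok("80\n");
```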
2019-05-15	remove hard Devel::Peek dependency and lazy load for daemons
	It's only useful for a corner case in long-running daemons when an admin decides to compact or vacuum a Xapian or SQLite DB. As a result, other scripts should run slightly faster. For instance, this saves about 80ms (2.710s => 2.630s) in t/mda.t on my remote workstation. While we're at it, make sure EvCleanup is properly require'd in Daemon.pm and HTTP.pm and document our use of Devel::Peek.
2019-05-04	bundle Danga::Socket and Sys::Syscall
	These modules are unmaintained upstream at the moment, but I'll be able to help with the intended maintainer once/if CPAN ownership is transferred. OTOH, we've been waiting for that transfer for several years, now...
	Changes I intend to make:
	* EPOLLEXCLUSIVE for Linux
	* remove unused fields wasting memory
	* kqueue bugfixes e.g. https://rt.cpan.org/Ticket/Display.html?id=116615
	* accept4 support
	And some lower priority experiments:
	* switch to EV_ONESHOT / EPOLLONESHOT (incompatible changes)
	* nginx-style buffering to tmpfile instead of string array
	* sendfile off tmpfile buffers
	* io_uring maybe?
2019-02-13	ensure bytes::length is available to callers
	We were relying on Danga::Socket using the "bytes" pragma, previously. Nowadays, the "bytes" pragma is not recommended in general, but bytes::length remains acceptable for getting the byte-size of a scalar.
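A short sketch of the difference between length and bytes::length on a string holding a wide character. Loading the module with an empty import list makes bytes::length callable without enabling the pragma for the surrounding code.

```perl
use strict;
use warnings;
use bytes ();    # load the module only; the pragma stays off

my $smiley = "\x{263A}";    # one character, three bytes in UTF-8

my $chars  = length($smiley);           # counts characters
my $octets = bytes::length($smiley);    # counts bytes of the internal encoding
```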
2019-02-07	http: cleanup partial-write handling on readonly values
	Don't bother assigning to $_[1]; just let Danga::Socket do its thing since $_[1] should be out-of-scope soon.
2018-03-27	http: fix modification of read-only value
	This fails in the rare case we get a partial send() on "\r\n" when writing chunked HTTP responses out.
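The failure mode can be reproduced without any sockets. In this sketch the two helpers are hypothetical: one trims the caller's argument in place through @_ (which aliases the caller's values), the other works on a copy.

```perl
use strict;
use warnings;

# Hypothetical: after a partial write of $n bytes, keep the remainder
# by assigning through the @_ alias.
sub trim_in_place {
    my $n = $_[1];
    $_[0] = substr($_[0], $n);
}

# Hypothetical alternative: copy the argument and return the remainder.
sub trim_copy {
    my ($buf, $n) = @_;
    substr($buf, $n);
}

# Works when the caller passes a variable...
my $var = "\r\n";
trim_in_place($var, 1);

# ...but dies when the caller passes a literal such as "\r\n".
my $survived = eval { trim_in_place("\r\n", 1); 1 } ? 1 : 0;
my $err = $@;

# The copying version is safe with literals.
my $rest = trim_copy("\r\n", 1);
```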
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-01-04	http: remove weaken usage, reduce anonsub capture scope
	Avoiding weaken here is no more dangerous than the existing circular refs (e.g. psgix.io) we create and manage throughout the lifetime of the connection. So, trust ourselves to maintain the data structure properly and avoid triggering extra memory usage. While we're at it, avoid having anonymous subroutines capture more variables than necessary to simplify reference auditing.
2017-01-04	http: fix spelling error
	Oops. And we'll be fixing circular references from now...
2016-12-25	http: fix clobbering of $null_io
	Oops, this would be disastrous if we started handling bigger request bodies or slow clients. Fixes: c008654229a9 ("avoid IO::File for anonymous temporary files")
2016-11-26	avoid IO::File for anonymous temporary files
	We do not need to import IO::File into the main programs since Perl 5.8+ supports literal "undef" for generating anonymous temporary file handles.
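A minimal sketch of the built-in form: passing a literal undef as the filename to a three-argument open in '+>' mode yields a read/write handle on an anonymous temporary file. The data written here is made up.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# No IO::File needed: undef as the filename requests an anonymous tempfile.
open(my $fh, '+>', undef) or die "open: $!";

print $fh "buffered response\n" or die "print: $!";

# Rewind (which also flushes the buffered write) and read the data back.
seek($fh, 0, SEEK_SET) or die "seek: $!";
my $line = <$fh>;

close($fh) or die "close: $!";
```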
2016-08-05	http: do not allow bad getline+close responses to kill us
	PSGI applications (like our WWW :P) can fail unpredictably, but let's try to avoid bringing the entire process down when this happens.
2016-07-08	http: drop extra newline in error message
	We already add the extra newline when we call print.
2016-07-07	http: additional info for write failures
	There was a spurious test failure in t/httpd-corner.t which I have not been able to reproduce.
2016-07-07	inbox: cleanup and consolidate object weakening
	This fixes some layering violations and consolidates the cleanup into the inbox object itself. Keeping in mind weakening does not work at all without our PSGI server.
2016-06-25	http: cork chunked responses for small savings
	This only affects Linux users with MSG_MORE support. We can avoid extra TCP overhead for sub-optimal chunk sizes by using MSG_MORE even with chunk trailers under Linux. This breaks real-time apps which require <= 200ms latency for streaming small packets (e.g. implementing "tail -F"), but the public-inbox WWW code does not (and will never) do such things.
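For context, each chunk of an HTTP/1.1 chunked response is a hexadecimal length line, the payload, and a trailing CRLF. The sketch below only shows that framing; the chunk_parts helper is hypothetical, and the actual send() calls with MSG_MORE are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: split one chunk into the pieces a server might
# write separately. All but the final piece are candidates for MSG_MORE
# on Linux, so the kernel can coalesce them into fewer packets.
sub chunk_parts {
    my ($data) = @_;    # assumed to be a byte string
    return (sprintf("%x\r\n", length($data)), $data, "\r\n");
}

my @parts = chunk_parts("hello");
my $wire  = join('', @parts);

# Lengths are hexadecimal: a 255-byte payload gets an "ff" header.
my ($big_header) = chunk_parts("x" x 255);
```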
2016-06-24	http: always yield on getline/body
	We want to maximize fairness for large responses which may download the entire mbox.
2016-06-19	http: constrain getline/close responses by time
	This allows us to yield control to other clients gracefully if getline takes too long to generate a chunk. This is more expensive but should not cost a syscall on modern 64-bit systems.
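The time-constrained loop can be sketched roughly as below. The ArrayBody class, the pump function, and the budget values are assumptions for illustration; only the getline-style body and the idea of checking a clock between chunks come from the commit message.

```perl
use strict;
use warnings;
use Time::HiRes qw(clock_gettime CLOCK_MONOTONIC);

{
    package ArrayBody;    # hypothetical PSGI-style body with ->getline
    sub new { my ($class, @chunks) = @_; bless [ @chunks ], $class }
    sub getline { shift @{$_[0]} }
}

# Pull chunks until the body is exhausted or the time budget (seconds)
# is used up. Returns 1 when done, 0 when the caller should yield to
# other clients and call pump again later.
sub pump {
    my ($body, $out, $budget) = @_;
    my $t0 = clock_gettime(CLOCK_MONOTONIC);
    while (defined(my $chunk = $body->getline)) {
        push @$out, $chunk;
        return 0 if clock_gettime(CLOCK_MONOTONIC) - $t0 > $budget;
    }
    1;
}

my @out;
my $body = ArrayBody->new(qw(a b c));

# A negative budget is always exceeded, forcing a yield after one chunk.
my $done_first = pump($body, \@out, -1);
my $n_first    = scalar @out;

# A generous budget lets the rest go through in one pass.
my $done_rest = pump($body, \@out, 60);
```

A real server would reschedule the second call from its event loop instead of calling it back-to-back.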
2016-06-19	http: avoid recursion when hitting write count limit
	Use the EvCleanup::asap handler to reschedule our writes after yielding to other clients.
2016-05-30	http: yield body->getline running time
	We cannot let a client monopolize the single-threaded server even if it can drain the socket buffer faster than we can emit data. While we're at it, acknowledge this behavior (which happens naturally) in httpd/async. The same idea is present in NNTP for the long_response code. This is the HTTP followup to commit 0d0fde0bff97 ("nntp: introduce long response API for streaming") and commit 79d8bfedcdd2 ("nntp: avoid signals for long responses").
2016-05-28	http: clarify comments about layering violation
	It's a low priority, but acknowledge it.
2016-05-27	http: avoid circular reference for getline responses
	Lightly tested, this seems to work when mass-aborting responses. Will still need to automate the testing...