path: root/lib/PublicInbox/GzipFilter.pm
Date Commit message
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-12-09 treewide: replace {-inbox} with {ibx} for consistency
{ibx} is shorter and is the most prevalent abbreviation in indexing and IMAP code, and the `$ibx' local variable is already prevalent throughout. In general, the codebase favors removal of vowels in variable and field names to denote non-references (because references are "lighter" than non-references). So update WWW and Filter users to use the same code since it reduces confusion and may allow easier code sharing.
2020-08-01 www: rework async_* to use method table
Although the ->async_next method takes a PublicInbox::HTTP object as its receiver rather than $self, we may still retrieve it via UNIVERSAL->can and call it with the HTTP object.
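A minimal sketch of that lookup, assuming two hypothetical stand-in packages (`FakeHTTP` and `MyResponse` are not the real public-inbox classes, and storing the response under `{forward}` is an assumption made for illustration). Only `can` itself is standard Perl:

```perl
use strict;
use warnings;

package FakeHTTP; # hypothetical stand-in for a PublicInbox::HTTP object
sub new { bless { forward => $_[1] }, $_[0] }

package MyResponse; # hypothetical response class
sub new { bless { steps => 0 }, $_[0] }

# Found through MyResponse's method table, but written to receive the
# HTTP object as its first argument instead of a MyResponse instance.
sub async_next {
	my ($http) = @_;
	my $res = $http->{forward}; # reach the response via the HTTP object
	++$res->{steps};
}

package main;
my $res = MyResponse->new;
my $http = FakeHTTP->new($res);
my $cb = $res->can('async_next'); # UNIVERSAL::can returns a code ref
$cb->($http); # invoke it with the HTTP object, not $res
print "steps=$res->{steps}\n";
```

Because `can` only resolves the name to a code reference, subclasses may override `async_next` and the caller still never needs to know which class supplied it.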
2020-07-06 gzipfilter: check http->{forward} for client disconnects
We actually don't do anything with {env} or {'psgix.io'} on client aborts, so checking the truthiness of '{forward}' is necessary.
2020-07-06 daemon: warn on missing blobs
Since -edit and -purge should be rare and TOCTOU around them rarer still, missing {blobs} could be indicative of a real bug elsewhere. Warn on them. And I somehow ended up with 3 different field names for Inbox objects. Perhaps they'll be made consistent in the future.
2020-07-06 gzipfilter: drop HTTP connection on bugs or data corruption
While all the {async_next} callbacks needed eval guards anyways because of DS->write, {async_eml} callbacks did not. Ensure any bugs in our code or data corruption result in termination of the HTTP connection, so as not to leave clients hanging on a response which never comes or is mangled in some way.
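The guard described above might look roughly like this. `FakeConn` and `guarded_step` are hypothetical names invented for this sketch; the real callbacks and connection class are not shown in this log:

```perl
use strict;
use warnings;

package FakeConn; # hypothetical connection object, for illustration only
sub new { bless { open => 1 }, $_[0] }
sub close { $_[0]->{open} = 0 }

package main;

# Run one callback step. On any exception, log it and drop the
# connection instead of leaving the client waiting on a broken response.
sub guarded_step {
	my ($conn, $cb) = @_;
	eval { $cb->($conn); 1 } and return 1;
	warn "E: dropping connection: $@";
	$conn->close;
	0;
}

my $good = FakeConn->new;
guarded_step($good, sub { 1 });
my $bad = FakeConn->new;
guarded_step($bad, sub { die "corrupt blob\n" });
print "good open=$good->{open} bad open=$bad->{open}\n";
```

Ending the `eval` block with `1` lets the caller distinguish success from failure without depending on the callback's own return value.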
2020-07-06 www: update internal docs
We no longer favor getline+close for streaming PSGI responses when using public-inbox-httpd. We still support it for other PSGI servers, though.
2020-07-06 remove unused/redundant zlib-related imports
Z_FINISH is the default for Compress::Raw::Zlib::Deflate->flush, anyways, so there's no reason to import it. And none of C::R::Z is needed in WwwText now that gzf_maybe handles it all.
2020-07-06 www: start making gzipfilter the parent response class
Virtually all of our responses are going to be gzipped, anyways. This will allow us to utilize zlib as a buffering layer and share common code for async blob retrieval responses. To streamline this and allow GzipFilter to be a parent class, we'll replace the NoopFilter with a similar CompressNoop class which emulates the two Compress::Raw::Zlib::Deflate methods we use. This drops a bunch of redundant code and will hopefully make upcoming WwwStream changes easier to reason about.
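As an illustration of the idea, a pass-through class that emulates the two deflate methods could be sketched as below. `NoopDeflate` is a hypothetical name and this is not the actual CompressNoop source; it only mirrors the `deflate` and `flush` calling conventions of Compress::Raw::Zlib::Deflate:

```perl
use strict;
use warnings;

package NoopDeflate; # hypothetical name; mimics the two methods described above
use Compress::Raw::Zlib qw(Z_OK);

sub new { bless \(my $self), $_[0] }

# ($self, $input, $output): append the input to the output, uncompressed
sub deflate { $_[2] .= $_[1]; Z_OK }

# ($self, $output): nothing is buffered, so only ensure output is defined
sub flush { $_[1] //= ''; Z_OK }

package main;
my $z = NoopDeflate->new;
my $out = '';
$z->deflate("hello ", $out);
$z->deflate("world", $out);
$z->flush($out);
print "$out\n";
```

Since both methods return `Z_OK`, code written against the real deflate object can check the status the same way for either class.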
2020-07-06 qspawn: learn to gzip streaming responses
This will allow us to gzip responses generated by cgit and any other CGI programs or long-lived streaming responses we may spawn.
2020-07-06 {gzip,noop}filter: ->zmore returns undef, always
This simplifies callers, as witnessed by the change to WwwListing. It adds overhead to NoopFilter, but NoopFilter should see little use as nearly all HTTP clients request gzip.
2020-07-06 gzipfilter: replace Compress::Raw::Deflate usages
The new ->zmore and ->zflush APIs make it possible to replace existing verbose usages of Compress::Raw::Deflate and simplify buffering logic for streaming large gzipped data. One potentially user-visible change is that we now break the mbox.gz response on zlib failures, instead of silently continuing on to the next message. zlib only seems to fail on OOM, which should be rare; so it's ideal we drop the connection anyways.
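A rough sketch of what such a pair of helpers could look like, written as plain functions over a hash. `gz_new`, `zmore` and `zflush` here are illustrative reimplementations under assumed signatures, not the GzipFilter methods themselves; the Compress::Raw::Zlib and IO::Uncompress::Gunzip calls are the standard core-module APIs:

```perl
use strict;
use warnings;
use Compress::Raw::Zlib qw(Z_OK);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical state constructor. 15 + 16 selects gzip framing and
# AppendOutput makes every call accumulate into the same buffer.
sub gz_new {
	my ($gz, $err) = Compress::Raw::Zlib::Deflate->new(
		-WindowBits => 15 + 16, -AppendOutput => 1);
	$err == Z_OK or die "Deflate->new failed: $err";
	return { gz => $gz, zbuf => '' };
}

# Compress more input; die on failure so the caller aborts the response.
sub zmore {
	my ($st, $data) = @_;
	my $err = $st->{gz}->deflate($data, $st->{zbuf});
	$err == Z_OK or die "deflate failed: $err";
	undef; # always undef, so callers need not inspect the result
}

# Finish the stream and hand back everything buffered so far.
sub zflush {
	my ($st) = @_;
	my $err = $st->{gz}->flush($st->{zbuf}); # flush type defaults to Z_FINISH
	$err == Z_OK or die "flush failed: $err";
	delete $st->{zbuf};
}

my $st = gz_new();
zmore($st, "From: a\@example.com\n\n");
zmore($st, "body line\n") for 1..3;
my $gzipped = zflush($st);
gunzip(\$gzipped => \my $plain) or die "gunzip failed: $GunzipError";
print $plain;
```

Dying inside the helpers, rather than returning a status, is what lets a streaming caller stop mid-response instead of skipping ahead to the next message.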
2020-07-06 wwwlisting: use GzipFilter for HTML
The changes to GzipFilter here may be beneficial for building HTML and XML responses in other places, too.
2020-07-06 www*stream: gzip ->getline responses
Our most common endpoints deserve to be gzipped.
2020-07-06 wwwstream: oneshot: perform gzip without middleware
Plack::Middleware::Deflater forces us to use a memory-intensive closure. Instead, work towards building compressed strings in memory to reduce the overhead of buffering large HTML output.
2020-07-06 gzipfilter: minor cleanups
We currently don't use bytes::length in ->write, so there's no need to `use bytes'. Favor `//=' to describe the intent of the conditional assignment since the C::R::Z::Deflate object is always truthy. Also use the local $gz variable to avoid unnecessary {gz} hash lookups.
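For readers less familiar with the two operators, the difference can be shown with a small standalone sketch. The helper names `or_default` and `dor_default` are made up for this example:

```perl
use strict;
use warnings;

sub or_default  { my ($v) = @_; $v ||= 5; $v } # assigns when $v is false
sub dor_default { my ($v) = @_; $v //= 5; $v } # assigns only when $v is undef

printf "||= : 0 -> %s, undef -> %s\n", or_default(0), or_default(undef);
printf "//= : 0 -> %s, undef -> %s\n", dor_default(0), dor_default(undef);

# For a slot that holds either undef or an object reference (always
# true), both operators behave the same; //= states "assign if unset".
my %ctx;
my $gz = $ctx{gz} //= {}; # {} stands in for a deflate object here
print "local copy is the stored ref\n" if $gz == $ctx{gz};
```

Keeping the result in a local `$gz` also means later uses need no further `{gz}` hash lookups.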
2020-03-25 gzipfilter: lazy allocate the deflate context
zlib contexts are memory-intensive, particularly when used for compression. Since the gzip filter may be sitting in a limiter queue for a long period, delay the allocation until we actually have data to translate, and not a moment sooner.
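The lazy allocation described above can be sketched as follows. `LazyGzip` and its `translate` method are hypothetical names used only for illustration; the point is that the constructor stays cheap and the zlib context appears on the first write:

```perl
use strict;
use warnings;

package LazyGzip; # hypothetical name, for illustration only
use Compress::Raw::Zlib qw(Z_OK);

# 15 + 16 asks zlib for gzip framing; AppendOutput accumulates output
my %OPT = (-WindowBits => 15 + 16, -AppendOutput => 1);

sub new { bless { zbuf => '' }, $_[0] } # cheap: no zlib context yet

sub translate {
	my ($self, $data) = @_;
	# allocate the deflate context only when the first data arrives
	my $gz = $self->{gz} //= do {
		my ($ctx, $err) = Compress::Raw::Zlib::Deflate->new(%OPT);
		$err == Z_OK or die "Deflate->new failed: $err";
		$ctx;
	};
	my $err = $gz->deflate($data, $self->{zbuf});
	$err == Z_OK or die "deflate failed: $err";
	undef;
}

package main;
my $f = LazyGzip->new;
print defined($f->{gz}) ? "allocated\n" : "not allocated yet\n";
$f->translate("hello\n");
print defined($f->{gz}) ? "allocated\n" : "not allocated yet\n";
```

An object built this way can wait in a queue holding only an empty hash and a string, which is the saving the commit message is after.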
2020-03-25 qspawn: reinstate filter support, add gzip filter
We'll be supporting gzipped sqlite3(1) dumps for altid files in future commits. In the future (and if we survive), we may replace Plack::Middleware::Deflater with our own GzipFilter to work better with asynchronous responses without relying on memory-intensive anonymous subs.