While this isn't needed in the code paths used by WwwCoderepo
and RepoAtom (those classes provide their own ->zflush), it can
future-proof our code against future subclasses at a minor
performance cost.
|
|
Now that psgi_yield is used everywhere, the more complex
psgi_return and its helper bits can be removed. We'll also fix
some outdated comments now that everything on psgi_return has
switched to psgi_yield. GetlineResponse replaces GetlineBody
and does a better job of isolating generic PSGI-only code.
|
|
This is intended to replace psgi_return and HTTPD/Async
entirely, hopefully making our code less convoluted while
maintaining the ability to handle slow clients on
memory-constrained systems.
This was made possible by the philosophy shift in commit 21a539a2df0c
(httpd/async: switch to buffering-as-fast-as-possible, 2019-06-28).
We'll still support generic PSGI via the `pull' model with a
GetlineResponse class which is similar to the old GetlineBody.
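As a sketch of the generic `pull' model mentioned above (the ChunkBody class here is illustrative, not the actual GetlineResponse code): a PSGI server pulls a streaming body by calling ->getline until it returns undef, then calls ->close.

```perl
use strict;
use warnings;

# minimal pull-model PSGI body: the server calls ->getline until
# it returns undef, then ->close.  (Illustrative sketch only.)
package ChunkBody;
sub new { my ($cls, @chunks) = @_; bless { chunks => [@chunks] }, $cls }
sub getline { shift @{$_[0]->{chunks}} } # undef ends the response
sub close { $_[0]->{chunks} = [] }

package main;
my $body = ChunkBody->new("hello ", "world\n");
my @out;
while (defined(my $buf = $body->getline)) { push @out, $buf }
$body->close;
print join('', @out); # "hello world\n"
```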
|
|
The $oid arg for Git->cat_async is defined on async_abort using
the original request, so use an undefined $type to distinguish that
case in caller-supplied callbacks. async_abort isn't common, of
course, but sometimes git subprocesses can die unexpectedly.
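A hedged sketch of what a caller-supplied callback can check (the argument order shown follows my reading of PublicInbox::Git->cat_async and may differ across versions):

```perl
use strict;
use warnings;

my @log;
# assumed cat_async callback shape: ($bref, $oid, $type, $size, $arg)
my $cb = sub {
    my ($bref, $oid, $type, $size) = @_;
    if (!defined $type) { # async_abort: a git subprocess died
        push @log, "aborted $oid";
        return;
    }
    push @log, "got $oid ($type, $size bytes)";
};
# normal completion:
$cb->(\'blob data', 'deadbeef', 'blob', 9);
# async_abort replays the original request with an undefined $type:
$cb->(undef, 'deadbeef', undef, undef);
```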
|
|
carp is more useful since it shows the perspective of the caller
and can be made to show a full backtrace with
PERL5OPT=-MCarp=verbose
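A small self-contained demonstration of the difference: carp reports the warning at the caller's line rather than inside the warning function, and running with PERL5OPT=-MCarp=verbose upgrades it to a full backtrace.

```perl
use strict;
use warnings;
use Carp qw(carp);

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

sub risky { carp 'something looked off' } # blames the caller's line
risky(); # the warning points here, not inside risky()
print $warnings[0];
```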
|
|
This should be similar or identical to what's in cgit,
and ties into the rest of the www_coderepo stuff.
|
|
While we must name this function ->write for PSGI compatibility,
our own uses of it can make it operate more like writev(2)
or `print' in Perl.
|
|
This will let us drop some calls to zmore in subsequent commits.
|
|
Calling Compress::Raw::Zlib::deflate is fairly expensive.
Relying on the `.=' (concat) operator inside ->zadd is
faster, but the method dispatch overhead is noticeable compared
to the original code where we had bare `.=' littered throughout.
Fortunately, `print' and `say' with the PerlIO::scalar IO layer
appear to offer better performance without high method dispatch
overhead. This doesn't allow us to save as much memory as I
originally hoped, but does allow us to rely less on concat
operators in other places and just pass a list of args to
`print' and `say' as appropriate.
This does reduce scratchpad use, however, allowing for large
memory savings, and we still ->deflate every single $eml.
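The PerlIO::scalar technique above amounts to opening an in-memory filehandle on a scalar, so `print' appends a whole list of arguments without per-call method dispatch or explicit `.=' chains:

```perl
use strict;
use warnings;

my $buf = '';
open my $fh, '>>', \$buf or die "open: $!"; # PerlIO::scalar layer
print $fh 'hello', ' ', 'world'; # appends the list, no concat ops
print $fh "\n";
close $fh;
print $buf; # "hello world\n"
```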
|
|
This allows us to focus string concatenations in one place to
allow Perl internal scratchpad optimizations to reuse memory.
Calling Compress::Raw::Zlib::deflate repeatedly proves too
expensive in terms of CPU cycles.
|
|
This may help us identify hot spots and reduce pad space
as needed.
|
|
We can work towards delaying zlib context allocations in future
commits, too.
|
|
This seems like the least disruptive way to allow more use of
->zmore when streaming large messages to sockets.
|
|
This will make writev-like use easier for the next commit,
and also future changes where I'll rely more on zlib for
buffering.
|
|
->zflush must return a string to its caller, not undef.
Additionally, {http_out} may be deleted on ->write if ->close
recurses.
This should fix the following errors:
Use of uninitialized value $_[1] in string eq at PublicInbox/HTTP.pm line 211.
E: Can't call method "close" on an undefined value at GzipFilter.pm line 167.
Fixes: a6d50dc1098c01a1 (www: gzip_filter: gracefully handle socket ->write failures, 2022-08-03)
|
|
A few things I noticed while reviewing and evaluating
the PSGI code for JMAP support.
|
|
Socket ->write failures are expected and common for TCP traffic,
especially if it's facing unreliable remote connections. So
just bail out silently if our {gz} field was already clobbered
during the small bit of recursion we hit on ->write failures
from async responses.
This ought to fix some GzipFilter::zflush errors (via $forward
->close from PublicInbox::HTTP) I've been noticing on
deployments running -netd. I'm still unsure as to why I hadn't
seen them before, but it might've only been ignorance on my
part...
Link: https://public-inbox.org/meta/20220802065436.GA13935@dcvr/
|
|
By using the charset specified in the message, web browsers are
more likely to display the raw text properly for human readers.
Inspired by a patch by Thomas Weißschuh:
https://public-inbox.org/meta/20211024214337.161779-3-thomas@t-8ch.de/
Cc: Thomas Weißschuh <thomas@t-8ch.de>
|
|
This will let us modify the response header later to set
a proper charset for Content-Type when displaying raw
messages.
Cc: Thomas Weißschuh <thomas@t-8ch.de>
|
|
Large chunks of our codebase and 3rd-party dependencies do not
use ->{psgi.errors}, so trying to standardize on it was a
fruitless endeavor. Since warn() and carp() are standard
mechanisms within Perl, just use those instead and simplify a
bunch of existing code.
|
|
SQLite files may be replaced or removed by admins while
generating large thread or mailbox responses. Ensure we
don't hold onto DBI handles and associated file descriptors
past their cleanup.
|
|
While each git blob request is treated fairly w.r.t. other git
blob requests, responses triggering thousands of git blob
requests can still noticeably increase latency for
less-expensive responses.
Move large mbox results and the nasty all.mbox endpoint to
a low-priority queue which only fires once per event-loop
iteration. This reduces the response time of short HTTP
responses while many gigantic mboxes are being downloaded
simultaneously, but still maximizes use of available I/O
when there are no inexpensive HTTP responses happening.
This only affects PublicInbox::WWW users who use
public-inbox-httpd, not generic PSGI servers.
|
|
While both git and libgit2 take around 16 minutes to load 100K
alternates, there's already a proposed patch to make git faster:
<https://lore.kernel.org/git/20210624005806.12079-1-e@80x24.org/>
It's also easier to patch and install git locally since the
git.git build system defaults to prefix=$HOME and dealing with
dynamic linking with libgit2 is more difficult for end users
relying on Inline::C.
libgit2 remains in use for the non-ALL.git case, but maybe it's
not necessary (libgit2 is significantly slower than git in
Debian 10 due to SHA-1 collision checking).
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak
|
|
{ibx} is shorter and is the most prevalent abbreviation
in indexing and IMAP code, and the `$ibx' local variable
is already prevalent throughout.
In general, the codebase favors removal of vowels in variable
and field names to denote non-references (because references are
"lighter" than non-references).
So update WWW and Filter users to use the same code since
it reduces confusion and may allow easier code sharing.
|
|
Although the ->async_next method takes a PublicInbox::HTTP
object rather than $self as its receiver, we may still retrieve
it via UNIVERSAL->can and call it with the HTTP object.
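The lookup-then-call pattern can be sketched like this (FakeHTTP is a made-up stand-in for PublicInbox::HTTP; only the ->can retrieval is the point):

```perl
use strict;
use warnings;

package FakeHTTP; # illustrative stand-in, not PublicInbox::HTTP
sub new { bless { called => 0 }, shift }
sub async_next { my ($http) = @_; $http->{called}++ }

package main;
my $http = FakeHTTP->new;
# retrieve the method as a code ref via ->can, then invoke it
# with the HTTP object as its receiver:
my $cb = FakeHTTP->can('async_next') or die 'no async_next';
$cb->($http);
```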
|
|
We actually don't do anything with {env} or {'psgix.io'}
on client aborts, so checking the truthiness of {forward}
is necessary.
|
|
Since -edit and -purge should be rare, and TOCTOU around them
rarer still, missing {blobs} could be indicative of a real bug
elsewhere. Warn on them.
And I somehow ended up with 3 different field names for Inbox
objects. Perhaps they'll be made consistent in the future.
|
|
While all the {async_next} callbacks needed eval guards anyways
because of DS->write, {async_eml} callbacks did not.
Ensure any bugs in our code or data corruption result in
termination of the HTTP connection, so as not to leave clients
hanging on a response which never comes or is mangled in some
way.
|
|
We no longer favor getline+close for streaming PSGI responses
when using public-inbox-httpd. We still support it for other
PSGI servers, though.
|
|
Z_FINISH is the default for Compress::Raw::Zlib::Deflate->flush,
anyways, so there's no reason to import it. And none of C::R::Z
is needed in WwwText now that gzf_maybe handles it all.
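The default can be confirmed directly: Compress::Raw::Zlib::Deflate->flush finishes the stream without Z_FINISH ever being imported by the caller (option names below follow the Compress::Raw::Zlib documentation):

```perl
use strict;
use warnings;
use Compress::Raw::Zlib; # status constants like Z_OK export by default

my ($gz, $err) = Compress::Raw::Zlib::Deflate->new(
    -WindowBits => 31,   # 15 + 16: gzip container
    -AppendOutput => 1,  # append to $out instead of overwriting
);
die "deflate init failed: $err" unless $gz;
my $out = '';
$gz->deflate("hello world\n", $out) == Z_OK or die 'deflate failed';
# no Z_FINISH import needed: it's already the default flush type
$gz->flush($out) == Z_OK or die 'flush failed';
# $out now holds a complete gzip stream
```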
|
|
Virtually all of our responses are going to be gzipped, anyways.
This will allow us to utilize zlib as a buffering layer and
share common code for async blob retrieval responses.
To streamline this and allow GzipFilter to be a parent class,
we'll replace the NoopFilter with a similar CompressNoop class
which emulates the two Compress::Raw::Zlib::Deflate methods we
use.
This drops a bunch of redundant code and will hopefully make
upcoming WwwStream changes easier to reason about.
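A hedged sketch of the CompressNoop idea (class name and bodies are my guess, not the actual public-inbox code): emulate the two Compress::Raw::Zlib::Deflate methods we call, appending input verbatim so uncompressed responses need no branching.

```perl
use strict;
use warnings;

package CompressNoopSketch; # illustrative, not the real CompressNoop
use Compress::Raw::Zlib; # for Z_OK
sub new { bless {}, shift }
sub deflate { # same shape as Compress::Raw::Zlib::Deflate::deflate
    my $self = shift;
    $_[1] .= ref($_[0]) ? ${$_[0]} : $_[0]; # append input verbatim
    Z_OK;
}
sub flush { Z_OK } # nothing buffered, nothing to flush

package main;
my $noop = CompressNoopSketch->new;
my $out = '';
$noop->deflate('hello ', $out);
$noop->deflate(\'world', $out); # scalar-ref input works like C::R::Z
$noop->flush($out);
print $out; # "hello world"
```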
|
|
This will allow us to gzip responses generated by cgit
and any other CGI programs or long-lived streaming
responses we may spawn.
|
|
This simplifies callers, as witnessed by the change to
WwwListing. It adds overhead to NoopFilter, but NoopFilter
should see little use as nearly all HTTP clients request gzip.
|
|
The new ->zmore and ->zflush APIs make it possible to replace
existing verbose usages of Compress::Raw::Deflate and simplify
buffering logic for streaming large gzipped data.
One potentially user-visible change is that we now break the
mbox.gz response on zlib failures, instead of silently
continuing onto the next message. zlib only seems to fail on
OOM, which should be rare, so it's ideal we drop the connection
anyways.
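A sketch of the ->zmore/->zflush shape on top of Compress::Raw::Zlib (the method names match the commit message; the class and bodies here are guesses for illustration): zmore compresses immediately but keeps output buffered, and zflush always hands back a string.

```perl
use strict;
use warnings;

package ZBufSketch; # illustrative, not the real GzipFilter
use Compress::Raw::Zlib; # Z_OK and friends
sub new {
    my ($cls) = @_;
    my ($gz, $err) = Compress::Raw::Zlib::Deflate->new(
        -WindowBits => 31,   # gzip container
        -AppendOutput => 1,
    );
    die "deflate init failed: $err" unless $gz;
    bless { gz => $gz, zbuf => '' }, $cls;
}
sub zmore { # compress now, keep output buffered in {zbuf}
    my $self = shift;
    for my $s (@_) {
        $self->{gz}->deflate($s, $self->{zbuf}) == Z_OK
            or die 'deflate failed';
    }
}
sub zflush { # flush zlib, return everything buffered so far
    my ($self) = @_;
    my $zbuf = delete $self->{zbuf};
    $self->{gz}->flush($zbuf) == Z_OK or die 'flush failed';
    $self->{zbuf} = '';
    $zbuf; # always a string, never undef
}

package main;
my $z = ZBufSketch->new;
$z->zmore('hello ', 'world');
my $gzipped = $z->zflush; # a complete gzip stream
</imports>
```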
|
|
The changes to GzipFilter here may be beneficial for building
HTML and XML responses in other places, too.
|
|
Our most common endpoints deserve to be gzipped.
|
|
Plack::Middleware::Deflater forces us to use a memory-intensive
closure. Instead, work towards building compressed strings in
memory to reduce the overhead of buffering large HTML output.
|
|
We currently don't use bytes::length in ->write, so there's no
need to `use bytes'. Favor `//=' to describe the intent of the
conditional assignment since the C::R::Z::Deflate object is
always truthy. Also use the local $gz variable to avoid
unnecessary {gz} hash lookups.
|
|
zlib contexts are memory-intensive, particularly when used for
compression. Since the gzip filter may be sitting in a limiter
queue for a long period, delay the allocation until we actually
have data to translate, and not a moment sooner.
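The delay can be expressed with `//=' (the gz_lazy helper name and %filter hash below are made up for illustration): no zlib context exists while the filter sits in a queue, and the first byte of data triggers the allocation.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

my %filter; # stand-in for the filter object's hash
sub gz_lazy {
    $filter{gz} //= do { # allocate only on first use
        my ($gz, $err) = Compress::Raw::Zlib::Deflate->new(
            -WindowBits => 31,  # gzip container
            -AppendOutput => 1,
        );
        die "deflate init failed: $err" unless $gz;
        $gz;
    };
}

# while queued, no zlib memory is held:
die 'allocated too early' if defined $filter{gz};
my $out = '';
gz_lazy()->deflate('data', $out); # first data triggers the allocation
```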
|
|
We'll be supporting gzipped sqlite3(1) dumps
for altid files in future commits.
In the future (and if we survive), we may replace
Plack::Middleware::Deflater with our own GzipFilter to work
better with asynchronous responses without relying on
memory-intensive anonymous subs.
|