From: Eric Wong <e@yhbt.net>
To: meta@public-inbox.org
Subject: [PATCH 12/43] qspawn: learn to gzip streaming responses
Date: Sun, 5 Jul 2020 23:27:28 +0000
Message-ID: <20200705232759.3161-13-e@yhbt.net>
In-Reply-To: <20200705232759.3161-1-e@yhbt.net>

This will allow us to gzip responses generated by cgit
and any other CGI programs or long-lived streaming
responses we may spawn.
---
 lib/PublicInbox/GzipFilter.pm | 16 ++++++++++++++++
 lib/PublicInbox/Qspawn.pm     |  6 ++++--
 t/httpd-corner.psgi           |  7 +++++++
 t/httpd-corner.t              |  9 ++++++++-
 4 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/GzipFilter.pm b/lib/PublicInbox/GzipFilter.pm
index 0fbb4476a..0a6c56a5d 100644
--- a/lib/PublicInbox/GzipFilter.pm
+++ b/lib/PublicInbox/GzipFilter.pm
@@ -32,6 +32,22 @@ sub gzf_maybe ($$) {
 	bless { gz => $gz }, __PACKAGE__;
 }
 
+sub qsp_maybe ($$) {
+	my ($res_hdr, $env) = @_;
+	return if ($env->{HTTP_ACCEPT_ENCODING} // '') !~ /\bgzip\b/;
+	my $hdr = join("\n", @$res_hdr);
+	return if $hdr !~ m!^Content-Type\n
+				(?:(?:text/(?:html|plain))|
+				application/atom\+xml)\b!ixsm;
+	return if $hdr =~ m!^Content-Encoding\ngzip\n!smi;
+	return if $hdr =~ m!^Content-Length\n[0-9]+\n!smi;
+	return if $hdr =~ m!^Transfer-Encoding\n!smi;
+	# in case Plack::Middleware::Deflater is loaded:
+	return if $env->{'plack.skip-deflater'}++;
+	push @$res_hdr, @GZIP_HDRS;
+	bless {}, __PACKAGE__;
+}
+
 sub gzip_or_die () {
 	my ($gz, $err) = Compress::Raw::Zlib::Deflate->new(%OPT);
 	$err == Z_OK or die "Deflate->new failed: $err";
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index d395a10b3..88b6d390a 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -25,8 +25,8 @@
 
 package PublicInbox::Qspawn;
 use strict;
-use warnings;
 use PublicInbox::Spawn qw(popen_rd);
+use PublicInbox::GzipFilter;
 
 # n.b.: we get EAGAIN with public-inbox-httpd, and EINTR on other PSGI servers
 use Errno qw(EAGAIN EINTR);
@@ -255,7 +255,9 @@ sub psgi_return_init_cb {
 	my ($self) = @_;
 	my $r = rd_hdr($self) or return;
 	my $env = $self->{psgi_env};
-	my $filter = delete $env->{'qspawn.filter'};
+	my $filter = delete $env->{'qspawn.filter'} //
+		PublicInbox::GzipFilter::qsp_maybe($r->[1], $env);
+
 	my $wcb = delete $env->{'qspawn.wcb'};
 	my $async = delete $self->{async};
 	if (scalar(@$r) == 3) { # error
diff --git a/t/httpd-corner.psgi b/t/httpd-corner.psgi
index 446296200..cb41cfa05 100644
--- a/t/httpd-corner.psgi
+++ b/t/httpd-corner.psgi
@@ -94,6 +94,13 @@ my $app = sub {
 		return $qsp->psgi_return($env, undef, sub {
 			[ 200, [ qw(Content-Type application/octet-stream)]]
 		});
+	} elsif ($path eq '/psgi-return-compressible') {
+		require PublicInbox::Qspawn;
+		my $cmd = [qw(echo goodbye world)];
+		my $qsp = PublicInbox::Qspawn->new($cmd);
+		return $qsp->psgi_return($env, undef, sub {
+			[200, [qw(Content-Type text/plain)]]
+		});
 	} elsif ($path eq '/psgi-return-enoent') {
 		require PublicInbox::Qspawn;
 		my $cmd = [ 'this-better-not-exist-in-PATH'.rand ];
diff --git a/t/httpd-corner.t b/t/httpd-corner.t
index 681486550..514672a1b 100644
--- a/t/httpd-corner.t
+++ b/t/httpd-corner.t
@@ -340,11 +340,18 @@ SKIP: {
 	is($n, 30 * 1024 * 1024, 'got expected output from curl');
 	is($non_zero, 0, 'read all zeros');
 
-	require_mods(@zmods, 2);
+	require_mods(@zmods, 4);
 	my $buf = xqx([$curl, '-sS', "$base/psgi-return-gzip"]);
 	is($?, 0, 'curl succesful');
 	IO::Uncompress::Gunzip::gunzip(\$buf => \(my $out));
 	is($out, "hello world\n");
+	my $curl_rdr = { 2 => \(my $curl_err = '') };
+	$buf = xqx([$curl, qw(-sSv --compressed),
+			"$base/psgi-return-compressible"], undef, $curl_rdr);
+	is($?, 0, 'curl --compressed successful');
+	is($buf, "goodbye world\n", 'gzipped response as expected');
+	like($curl_err, qr/\bContent-Encoding: gzip\b/,
+		'curl got gzipped response');
 }
 
 {
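
For readers skimming the diff, the conditions under which qsp_maybe
opts in to compression can be sketched as a standalone predicate.
This is only an illustration under stated assumptions: may_gzip is a
hypothetical name, it is not part of public-inbox, and it walks the
PSGI name/value pairs rather than matching regexes against the joined
header list as the patch does. It also returns a plain boolean instead
of pushing the gzip headers and blessing a filter object.

```perl
use strict;
use warnings;

# Hypothetical helper (not in public-inbox): decide whether a PSGI
# response from a spawned process may be gzipped on the fly.
sub may_gzip {
	my ($res_hdr, $env) = @_;
	# the client must advertise gzip support
	return 0 if ($env->{HTTP_ACCEPT_ENCODING} // '') !~ /\bgzip\b/;

	# PSGI headers are a flat list: (name, value, name, value, ...)
	my %h;
	for (my $i = 0; $i < @$res_hdr; $i += 2) {
		$h{lc $res_hdr->[$i]} = $res_hdr->[$i + 1];
	}
	# only compress the text-like types listed in the patch
	return 0 if ($h{'content-type'} // '') !~
		m!\A(?:text/(?:html|plain)|application/atom\+xml)\b!i;
	# leave already-encoded or fixed-length bodies alone
	return 0 if ($h{'content-encoding'} // '') =~ /\Agzip\z/i;
	return 0 if ($h{'content-length'} // '') =~ /\A[0-9]+/;
	return 0 if exists $h{'transfer-encoding'};
	# in case Plack::Middleware::Deflater is loaded: claim the
	# response once so it is not compressed a second time
	return 0 if $env->{'plack.skip-deflater'}++;
	1;
}

# Usage: a text/plain CGI response for a gzip-capable client
my $env = { HTTP_ACCEPT_ENCODING => 'gzip, deflate' };
print may_gzip([qw(Content-Type text/plain)], $env)
	? "gzip\n" : "identity\n";
```

Because the 'plack.skip-deflater' key is post-incremented, a second
call with the same $env declines, which is what keeps one response from
being compressed twice.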