Date | Commit message |
|
Just let Plack::Util::run_app catch the error and generate
a 500 response for it.
|
|
Large chunks of our codebase and 3rd-party dependencies do not
use ->{psgi.errors}, so trying to standardize on it was a
fruitless endeavor. Since warn() and carp() are standard
mechanisms within Perl, just use those instead and simplify a
bunch of existing code.
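A minimal sketch of why this simplifies things (the sub and variable
names here are illustrative, not from the codebase): warn() and carp()
report through the global $SIG{__WARN__} hook, so a daemon can capture
or redirect diagnostics in one place instead of threading the
per-request $env->{'psgi.errors'} handle through every call site.

```perl
use strict;
use warnings;
use Carp qw(carp);

my @log;
local $SIG{__WARN__} = sub { push @log, $_[0] }; # one global capture point

sub do_work {                                    # illustrative name
    my ($arg) = @_;
    carp "unexpected arg: $arg" if $arg < 0;     # no $env plumbing needed
    return abs($arg);
}

do_work(-1);
print scalar(@log), "\n";                        # 1
```
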
|
|
The only place where we could return wide characters with -httpd
was the raw $INBOX_DIR/description text, which is now converted
to octets.
All daemon (HTTP/NNTP/IMAP) sockets are opened in binary mode,
so length() and bytes::length() are equivalent on reads. For
socket writes, any non-octet data would trigger wide character
warnings, and warnings are treated strictly in tests run with
test_httpd.
All gzipped buffers are also octets, as is PublicInbox::Eml->body,
and anything from PerlIO objects ("git cat-file --batch" output,
filesystems), so bytes::length was unnecessary in all those places.
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak
|
|
{ibx} is shorter and is the most prevalent abbreviation
in indexing and IMAP code, and the `$ibx' local variable
is already prevalent throughout.
In general, the codebase favors removal of vowels in variable
and field names to denote references (because references are
"lighter" than non-references).
So update WWW and Filter users to use the same code since
it reduces confusion and may allow easier code sharing.
|
|
The resulting OID ("oid_b") is a required arg and is now part
of $env->{PATH_INFO} instead, so it's never an optional query
parameter.
|
|
This means we need to filter out "" from query parameters.
While we're at it, update comments for the WWW endpoint.
|
|
git-cat-file(1) may return less than the $BIN_DETECT value for
some blobs, so ensure we repopulate the values in $ctx for
retries in that case, otherwise we'll lose `$ctx->{-res}' and
die when attempting to use `undef' as an array ref.
|
|
And use Exporter to make our life easier, since WwwAltId was
using a non-existent PublicInbox::WwwResponse namespace in error
paths which doesn't get noticed by `perl -c' or exercised by
tests on normal systems.
Fixes: 6512b1245ebc6fe3 ("www: add endpoint to retrieve altid dumps")
|
|
The ->getline API is only useful for limiting memory use when
streaming responses containing multiple emails or log messages.
However it's unnecessary complexity and overhead for callers
(PublicInbox::HTTP) when there's only a single message.
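A sketch of the trade-off (the class name is illustrative): a PSGI body
object with ->getline/->close lets the server pull data chunk-by-chunk,
which matters when a response spans many emails; for a single message,
a plain array-ref body avoids the object and the per-call dispatch.

```perl
use strict;
use warnings;

package SingleBlob;                       # illustrative name
sub new     { my ($cls, $buf) = @_; bless { buf => $buf }, $cls }
sub getline { delete $_[0]->{buf} }       # undef on second call => EOF
sub close   { }

package main;

# streaming style: server calls ->getline until undef, then ->close
my $stream = [ 200, [ 'Content-Type' => 'text/plain' ],
               SingleBlob->new("one message\n") ];

# simpler style for a single message: no object, no per-chunk dispatch
my $simple = [ 200, [ 'Content-Type' => 'text/plain' ],
               [ "one message\n" ] ];

print $stream->[2]->getline;              # one message
```
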
|
|
Instead, we add CRLF conversion to the only remaining place
which needs it, ViewVCS. This saves many redundant ops in
many places.
The only other place where this mattered was in
View::add_text_body, but we already started doing CRLF
conversions when we added diff parsing and link generation for
ViewVCS. Otherwise, all other places we used this was for
header viewing and Email::MIME doesn't preserve CRLF in headers.
|
|
I didn't wait until September to do it, this year!
|
|
We use the same idiom in many places for doing two-step
linkification and HTML escaping. Get rid of an outdated
comment in flush_quote while we're at it.
|
|
Get rid of the confusingly named {rv} and {tip} fields
and unify them into {obuf} for readability.
{obuf} usage may be expanded to more areas in the future. This
will eventually make it easier for us to experiment with
alternative buffering schemes.
|
|
This allows us to get rid of the requirement to capture
on-stack variables with an anonymous sub, as illustrated
with the update to viewvcs to take advantage of this.
v2: fix error handling for missing OIDs
|
|
No need to create a new sub for every HTML page we render
with our VCS viewer.
|
|
Callers can now supply an arg to parse_hdr, eliminating the
need for closures to capture local variables.
|
|
By passing a user-supplied arg to $qx_cb, we can eliminate the
callers' need to capture on-stack variables with a closure.
This saves several kilobytes of memory allocation at the expense
of some extra hash table lookups in user-supplied callbacks. It
also reduces the risk of memory leaks by eliminating a common
source of circular references.
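A sketch of the pattern described above (sub and variable names are
illustrative, modeled on the $qx_cb description, not the actual API):
passing the caller's state as an explicit arg lets the callback be a
shared named sub allocated once, instead of a fresh closure per request
that captures on-stack variables and can form reference cycles.

```perl
use strict;
use warnings;

sub qx_done {                         # shared named sub, allocated once
    my ($output, $ctx) = @_;
    $ctx->{result} = "$ctx->{name}: $output";
}

sub psgi_qx_like {                    # invokes $cb with the user arg
    my ($output, $cb, $arg) = @_;
    $cb->($output, $arg);             # no closure over caller's stack
}

my $ctx = { name => 'blob' };
psgi_qx_like('ok', \&qx_done, $ctx);
print $ctx->{result}, "\n";           # blob: ok
```
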
|
|
Exposing MAX_SIZE via "our" makes it possible to use it in
tests, and to make it configurable, later.
Additionally, hitting the size limit on big files is not an
Internal Server Error, just a memory limit... Some browsers
won't show our HTML response with the link to the raw file in
case of errors, either, so we'll return 200 to ensure users can
use the link to access the raw blob.
Finally, throw in some tests to the existing solver_git testcase,
since that was incomplete and was pointlessly loading Plack
modules without testing PSGI.
|
|
Although we always unlink temporary files, give them a
meaningful name so that we can still make sense of the
pre-unlink name when using lsof(8) or similar tools on Linux.
|
|
Streaming large blobs can take multiple iterations of the event
loop in our -httpd; so we must not let the File::Temp::Dir
result go out-of-scope when streaming large blobs created from
patches.
|
|
No need to scan the entire string, but prefer to match git
behavior. This might be faster if/when Perl can create
substrings efficiently using CoW.
Fix a 80-column violation while we're at it.
|
|
Eventually, we'll have special displays for various git objects
(commit, tree, tag). But for now, we'll just use git-show
to spew whatever comes from git.
|
|
Not entirely sure what is causing this, but it appears to
be causing infinite loops when attempting to display certain
blobs.
Fortunately, the fair scheduling of public-inbox-httpd prevented
this from becoming a real problem aside from increasing CPU
usage.
|
|
We were relying on Danga::Socket using the "bytes" pragma,
previously. Nowadays, the "bytes" pragma is not recommended in
general, but bytes::length remains acceptable for getting the
byte-size of a scalar.
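A small sketch of the distinction: on octet-only buffers (as on a
socket opened in binary mode) the two are equal, but on an upgraded
string holding wide characters, length() counts characters while
bytes::length() counts the encoded octets.

```perl
use strict;
use warnings;
use bytes ();                     # load bytes.pm for bytes::length()

my $s = "caf\x{e9}";              # U+00E9, a single character
utf8::upgrade($s);                # force internal UTF-8 representation
printf "%d %d\n", length($s), bytes::length($s);   # 4 5
```
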
|
|
We want to be able to take advantage of this in other modules.
|
|
It turns out there's no point in having multiple instances of
this or having to worry about destruction or destruction
ordering.
This will make it easier to reuse the one instance we have
across different modules.
|
|
Favor in-place utf8::decode since it's a bit faster without
method dispatch overhead; and don't care about validity just
yet.
HlMod->do_hl itself should return "utf8" strings, since other
parts of our code can use it, so it's not the job of ViewVCS to
post-process HlMod output.
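A sketch of the speed-up mentioned above: utf8::decode() flips a buffer
of octets to characters in place, with no Encode object or method
dispatch; its return value (which signals validity) is ignored here,
matching the "don't care about validity just yet" note.

```perl
use strict;
use warnings;

my $buf  = "caf\xc3\xa9";         # UTF-8 octets for "café"
my $olen = length($buf);          # 5 octets before decoding
utf8::decode($buf);               # in place: no copy, no Encode call
printf "%d %d\n", $olen, length($buf);   # 5 4
```
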
|
|
Only to be pedantic...
|
|
|
This will become critical for future changes to display
git commits, diffs, and trees.
Use "qspawn.wcb" instead of "qspawn.response" to enhance
readability.
|
|
Forking off git-cat-file here for streaming large blobs is
reasonably efficient, at least no worse than using
git-http-backend for serving clones. So let our limiter
framework deal with it.
git itself isn't great for large files, and AFAIK there's no
stable/widely-available mechanisms for reading smaller chunks
of giant blobs in git itself.
Tested with some giant GPU headers in the Linux kernel.
|
|
Proper ordering of destruction seems required to avoid segfaults
at shutdown.
|
|
We need to post-process "highlight" output to ensure it doesn't
contain odd bytes which cause "wide character" warnings or
require odd glyphs in source form.
|
|
And update 216dark.css to match a color scheme I'm used to;
which is fairly minimal and doesn't use all the classes
"highlight" provides.
|
|
SolverGit::ERR already writes the exception to the debug
log before calling {user_cb}, so there's no need for viewvcs
to append it.
|
|
The psgi_qx routine in the now-abandoned "repobrowse" branch
allows us to break down blob-solving at each process execution
point. It reuses the Qspawn facility for git-http-backend(1),
allowing us to limit parallel subprocesses independently of Perl
worker count.
This is actually 2-3% slower than a fully-synchronous
execution; but it is fair to other clients as it won't
monopolize the server for hundreds of milliseconds (or even
seconds) at a time.
|
|
We need to keep line-numbers from <a> tags synced to the actual
line numbers in the code when working in smaller viewports.
Maybe I only work on reasonable projects, but excessively
long lines seem to be less of a problem in code than they are
in emails.
|
|
As with our use of the trailing slash in $MESSAGE_ID/T/ and
'$MESSAGE_ID/t/' endpoints, this is for 'wget -r --mirror'
compatibility, as well as allowing sysadmins to quickly stand up
a static directory with "index.html" in it to reduce load.
|
|
Meaningful names in URLs are nice, and they can make
life easier when supporting syntax highlighting.
|
|
|