about summary refs log tree commit homepage
path: root/lib/PublicInbox/WwwAttach.pm
DateCommit message (Collapse)
2022-03-22www: loosen deep-linking prevention
Apparently some browsers can set a Referer: header which fails to match. I'm not certain why, but making "$schema://$HOST_PORT" matches case-insensitive seems more correct regardless. In case that doesn't work, we'll also allow bypassing deep-link prevention via a POST form button. Reported-by: Vlastimil Babka <vbabka@suse.cz> Link: https://public-inbox.org/meta/93ebfbd1-9924-481c-4edc-9b232d1e995c@suse.cz/
2021-08-28get rid of unnecessary bytes::length usage
The only place where we could return wide characters with -httpd was the raw $INBOX_DIR/description text, which is now converted to octets. All daemon (HTTP/NNTP/IMAP) sockets are opened in binary mode, so length() and bytes::length() are equivalent on reads. For socket writes, any non-octet data would warn about wide characters and we are strict in warnings with test_httpd. All gzipped buffers are also octets, as is PublicInbox::Eml->body, and anything from PerlIO objects ("git cat-file --batch" output, filesystems), so bytes::length was unnecessary in all those places.
2021-03-11www_attach: remove unnecessary parse_content_type import
It's not needed since we have the handy eml->ct method.
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-12-09treewide: replace {-inbox} with {ibx} for consistency
{ibx} is shorter and is the most prevalent abbreviation in indexing and IMAP code, and the `$ibx' local variable is already prevalent throughout. In general, the codebase favors removal of vowels in variable and field names to denote non-references (because references are "lighter" than non-references). So update WWW and Filter users to use the same code since it reduces confusion and may allow easier code sharing.
2020-11-24wwwattach: prevent deep-linking via Referer match
This prevents `<img src=' tags from being used to deep-link image attachments from HTML outside of the current host and reduces potential for abuse. Some browsers (e.g. Firefox) favor content detection and will display images irrespective of the Content-Type header being "application/octet-stream", and "Content-Disposition: attachment" doesn't stop them, either. Tested with dillo and Firefox. Reported-by: Leah Neukirchen <leah@vuxu.org>
2020-08-01www: rework async_* to use method table
Although the ->async_next method does not take $self as a receiver, but rather a PublicInbox::HTTP object, we may still retrieve it to be called with the HTTP object via UNIVERSAL->can.
2020-07-06wwwattach: support async blob retrievals
We can reuse some of the GzipFilter infrastructure used by other WWW components to handle slow blob retrieval, here. The difference from previous changes is we don't decide on the 200 status code until we've retrieved the blob and found the attachment. While we're at it, ensure we can compress text attachment responses once again, since all text attachments are served as text/plain.
2020-05-10emlcontentfoo: drop the {discrete} and {composite} fields
We don't have to worry about compatibility with old installations of Email::MIME::ContentType any longer, so save some space.
2020-05-09EmlContentFoo: Email::MIME::ContentType replacement
Since we're getting rid of Email::MIME, get rid of Email::MIME::ContentType, too; since we may introduce speedups down the line specific to our codebase.
2020-05-09replace most uses of PublicInbox::MIME with Eml
PublicInbox::Eml has enough functionality to replace the Email::MIME-based PublicInbox::MIME.
2020-05-09msg_iter: pass $idx as a scalar, not array
This doesn't make any difference for most multipart messages (or any single part messages). However, this starts having space savings when parts start nesting. It also slightly simplifies callers.
2020-05-09msg_iter: make ->each_part method for PublicInbox::MIME
The reliance on Email::MIME->subparts is a tad inefficient with a work-in-progress module to replace Email::MIME. So move towards using ->each_part as a class-specific iterator which can take advantage of more class-specific optimizations in the yet-to-be-revealed PublicInbox::Eml and PublicInbox::Gmime classes. The msg_iter() sub remains for compatibility with existing 3rd-party scripts/modules which use our small public Perl API and Email::MIME.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-01-12www: discard multipart parent on iteration
We're often iterating through messages while writing to another buffer in our WWW interface, causing memory usage to multiply. Since we know we won't need to keep the MIME object around in some cases, and can tell msg_iter to clobber the on-stack variable while it operates on subparts of multipart messages. With xt/mem-msgview.t switched to multipart from the previous commit, this shows a 13 MB memory reduction on that test.
2019-12-27wwwattach: avoid anonymous sub for msg_iter
We can pass arguments to msg_iter for msg_iter to pass to our user-supplied callback, now.
2019-09-09run update-copyrights from gnulib for 2019
2019-06-04wwwattach: only pass the charset through if ASCII
AFAIK all names of charsets are ASCII, so passing non-ASCII characters from emails to clients would probably confuse clients.
2019-02-13ensure bytes::length is available to callers
We were relying on Danga::Socket using the "bytes" pragma, previously. Nowadays, the "bytes" pragma is not recommended in general, but bytes::length remains acceptable for getting the byte-size of a scalar.
2018-02-28use PublicInbox::MIME consistently
It works around some bugs in older Email::MIME which we'll find useful.
2018-02-07update copyrights for 2018
Using update-copyrights from gnulib While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-01-10introduce PublicInbox::MIME wrapper class
This should fix problems with multipart messages where text/plain parts lack a header. cf. git clone --mirror https://github.com/rjbs/Email-MIME.git refs/pull/28/head In the future, we may still introduce as streaming interface to reduce memory usage on large emails.
2016-06-20feed: various object-orientation cleanups
Favor Inbox objects as our primary source of truth to simplify our code. This increases our coupling with PSGI to make it easier to write tests in the future. A lot of this code was originally designed to be usable standalone without PSGI or CGI at all; but that might increase development effort.
2016-05-19www: support downloading attachments
This can be useful for lists where the convention is to attach (rather than inline) patches into the message body.