about summary refs log tree commit homepage
path: root/lib/PublicInbox
DateCommit message (Collapse)
2016-05-19www: validate and check filenames in URLs
We shall ensure links continue working for this.
2016-05-19msg_iter: workaround broken Email::MIME versions
Email::MIME >= 1.923 and < 1.935 would drop too many newlines in attachments. This would lead to ugly text files without a proper trailing newline if using quoted-printable, 7bit, or 8bit. Attachments encoded with base64 were not affected. These versions of Email::MIME are widely available in Debian 8 (Jessie) and even Ubuntu LTS distros so we will need to support this workaround for a while.
2016-05-19www: support downloading attachments
This can be useful for lists where the convention is to attach (rather than inline) patches into the message body.
2016-05-19switch read-only uses of walk_parts to msg_iter
msg_iter lets us know the index of the attachment, allow us to make more sensible labels and in a future commit, hyperlinks to download attachments.
2016-05-19msg_iter: new internal API for iterating through MIME
Unlike Email::MIME::walk_parts, this is non-recursive and gives depth + index offset information about the part for creating links for later retrieval It is intended for read-only access and changes are not propagated to the parent; however future versions of it may clobber bodies or the original version as it iterates to reduce memory overhead. It is intended for making it easy to locate attachments within a message in the WWW view.
2016-05-19view: rely on Email::MIME::body_str for decoding
Or is it "encoding"? Gah, Perl character set handling confuses me no matter how many times I RTFM :< This contains placeholders for attachment downloading which will be in a future commit.
2016-05-19nntpd: avoid uninitialized warning
Oops, but at least it was mostly harmless, just ugly. Followup-to: 9bfe40e7a4ac 'nntp: use "newsgroup" instead of "name"''
2016-05-18nntpd: reject control characters entirely
There's no place for them in the commands and we don't take messages; potentially printing them into a log opened in a terminal is too dangerous. Hoist out read_til_dot in the test while we're at it.
2016-05-18view: avoid redirect to reply endpoint
Oops, but perhaps the "reply" endpoint should be embedded into the permalink message view itself to reduce URLs.
2016-05-18feed: inline feed entry generation
Remove unnecessary wrapper subroutines and constants which are only used once.
2016-05-17http: release resources when idle
This lets us release old git processes so unlinked packs (leftover from repacking) can be released. This may also be helpful for Xapian as indices get rebuilt for tuning. For SQLite (msgmap), the there may be no benefit besides reducing FD pressure. Followup changes will unify the Inbox and NewsGroup classes and allow better code-sharing between NNTP and HTTP classes (as well as the planned POP3 class).
2016-05-17view: escape Message-ID for "next" link
Oops, we need to escape Message-IDs since they can contain bad characters such as '%' in them. '@' actually seems fine and does not need to be escaped; however, but we've been doing it forever.
2016-05-16www: fix for running under mount paths
We try to avoid issues like these by using relative URLs in hrefs, but we can't avoid the problem with Location: for redirects and Atom feeds which are likely to be rehosted elsewhere. We also reorder some of the code to work around a weird issue on the psgi-plack mailing list: <20160516073750.GA11931@dcvr.yhbt.net> (Somewhere on https://groups.google.com/group/psgi-plack but it's probably not bookmarkable)
2016-05-16config: allow taking an existing reference
This should make creating test cases easier and faster.
2016-05-16declare Inbox object for reusability
From the beginning, we've avoided objects here in favor of faster startup time; but it may not be worth it since a persistent httpd/nntpd is faster and -mda isn't hit as often.
2016-05-15mbox: support /$INBOX/all.mbox.gz endpoint
Allows easily downloading the entire archive without special tools. In any case, it's not yet advertised to via HTML until we can test it better. It'll also support range queries in the future to avoid wasting bandwidth.
2016-05-15mbox: consistent header order when decompressed
This should make validating the output easier when testing between different servers.
2016-05-15git-http-backend: set cache headers
Mostly stolen from git upstream, these should prevent any caches such as varnish or squid from acting improperly.
2016-05-14rename most instances of "list" to "inbox"
A public-inbox is NOT necessarily a mailing list, but it could serve as an input point for zero, one, or infinite mailing lists :D
2016-05-14nntp: use "newsgroup" instead of "name"
This reduces the cognitive overhead for mapping names of configuration values to internal field names of our classes. Further changes along these lines coming...
2016-05-13nntp: fixup "Wide character" warnings
We need Perl to believe everything we send is UTF-8, make it so, even if it may not be. Fixes: 265e79ff82ce 'Revert "nntp: proper UTF-8 support (hopefully?)"'
2016-05-13Revert "nntp: proper UTF-8 support (hopefully?)"
This reverts commit f81ad477cb013d05b9b11fa051a9ebc5983a5be6. The raw, undecoded body is probably what should be sent over the wire anyways for clients to deal with. We'll need this to avoid deprecation warnings with Perl 5.24+ since we use send()/recv()/sysread().
2016-05-12git-http-backend: do not drop connection on successful finish
We can maintain the client HTTP connection if the process exited with failure as long as we terminated our own response properly.
2016-05-12import: fallback to email if '<>' exists in author name
git doesn't handle '<' and '>' characters in the author name at all regardless of quoting, not just matched pairs. So fall back to using the email as the author name since the commit info isn't critical, anyways (shallow clones are fine).
2016-05-12import: normalize body by stripping trailing newlines
Mbox formatters may add extra newlines at the end of the message, and that's not relevant for comparing messages for deletion.
2016-05-06mbox: sort messages by ascending date
This allows messages to be read in chronological order when read without a mail client (e.g. with "zcat t.mbox.gz | less")
2016-05-03git-http-backend: reduce memory use for clone/fetch
When serving large static files or large packs, we may call Danga::Socket::write directly to queue up callbacks to resume reading and defer firing them until the socket is writable. This prevents us from scheduling writes or buffering until we know the socket is writable and prevents needless buffering by Danga::Socket when faced with slow clients. For smart clones, this comes at the cost of throttling the output of "git pack-objects" to the speed of the client connection. This is probably not ideal, but is the behavior of the standard git-daemon, too; and is preferable to running the httpd out-of-memory. Buffering to the filesystem may be an option in the future...
2016-05-03http: move empty string check into write callback
This empty string check is for middlewares such as Deflater which may write empty strings, not for direct real callers of Danga::Socket who (presumably) know what they're doing.
2016-05-03spawnpp: use native perl %ENV outside of mod_perl
We only need to use env(1) under mod_perl; since mod_perl is uncommon nowadays, support native %ENV for a teeny speedup for folks uncomfortable with running vfork via Inline::C snippet.
2016-05-02nntp: append Archived-At and List-Archive headers
For readers using NNTP, we should do our best to advertise the clonable HTTP/HTTPS URLs and the message permalink URL for ease-of-referencing messages, since we don't want the NNTP server and it's sequential article numbers to be relied on.
2016-05-02view: disable subject threading
Broken threads should be exposed to hopefully encourage people to use proper mail clients which set In-Reply-To headers.
2016-05-02http: remove needless binmode call
Unnecessary on *nix, and we won't support systems which do insane things.
2016-05-02spawn: proper signal handling for vfork
We cannot afford to fire Perl-level signal handlers in the vforked child process since they're not designed to run in the child like that. Thus we need to block all signals before calling vfork, reset signal dispositions in the child, and restore the signal mask in the parent. ref: https://ewontfix.com/7
2016-05-01git-http-backend: use real lseek for Content-Range
Since we use sysread, we must use sysseek for symmetry although PerlIO may be doing a real lseek with "seek", anyways. Fixes: 310819ea86ac ("git-http-backend: favor sysread for regular files")
2016-05-01daemon: reduce timer-related allocations
We can reduce the allocation and overhead needed for Danga::Socket timers for immediately-executed responses by combining identical timers and reducing anonymous sub creation.
2016-05-01mda: export @BAD_HEADERS variable
This should allow users to change and add headers as needed. While we're at it, add the X-Original-To header Postfix likes to add; it seems like pointless bloat with the existence of (important) Received: headers.
2016-05-01linkify: match more URL characters [:,\$] and schemes
Adding ':' (colon), ',' (comma), '$' (dollar sign) and supporting TLS-enabled schemes: ftps, nntps variants as well as gopher :D
2016-05-01linkify: match '~' (tilde) in URLs
Tilde is common for some homepages: http://example.org/~user/ There's probably some other acceptable characters I'm missing.
2016-04-30daemon: graceful shutdown warning and limit removal
git clones may take longer than 30s, much longer... So prepare to wait almost indefinitely for sockets to timeout and document the second signal behavior for immediate shutdown. While we're at it, move parent death handling to a separate class to avoid Danga::Socket->AddOtherFds, since that does not allow proper handling the parent pipe being closed and would actually misterminate a worker prematurely. t/nntpd.t is update to illustrate the failure with workers enabled. We will work to keep memory usage low and let clients take their time without interrupting them.
2016-04-30http: graceful shutdown for pi-httpd.async callers
git clones may take a long time and it's wrong to drop connections in the middle of a transaction.
2016-04-30searchmsg: ensure long subject lines are not broken
Noticed when using a long URL in the subject.
2016-04-29http: avoid lseek if no input
This saves us a system call for common GET/HEAD requests with no upload body.
2016-04-29http: improve error handling for aborted responses
We need to abort connections properly if a response is prematurely truncated. This includes problems with serving static files, since a clumsy admin or broken FS could return truncated responses and inadvertently leave a client waiting (since the client saw "Content-Length" in the header and expected a certain length).
2016-04-29git-http-backend: check EINTR as well as EAGAIN
The blocking PSGI server may cause EINTR to be hit, here.
2016-04-29http: avoid corking on "Content-Length: 0" response
We must use a normal write instead of send(.., MSG_MORE) when writing responses of "Content-Length: 0" to avoid the corking effect MSG_MORE provides. We only want to cork headers if we will send a non-empty body. Fixes: c3eeaf664cf0 ("http: clarify intent for persistence") This needs a proper test.
2016-04-28githttpbackend: clamp to one smart HTTP request at-a-time
Server admins may not be able to afford to have too many git-pack-objects processes running at once. Since PSGI HTTP servers should already be configured to use multiple processes for other requests; limit concurrency of smart backends to one; and fall back to dumb responses if we're already generating a pack.
2016-04-28githttpbackend: fall back to dumb if smart HTTP is off
Using http.getanyfile still keeps the http-backend process alive, so it's better to break out of that process and handle serving entirely within the HTTP server.
2016-04-28import: run git-update-server-info when done
We should update $GIT_DIR/info/refs for dumb HTTP clients whenever we make changes to the repository. The best place to update is immediately after making commits. This fixes a bug where public-inbox-learn did not properly update $GIT_DIR/info/refs after inserting or removing messages.
2016-04-27import: document API for public consumption
This is probably trivial enough to be final?
2016-04-25githttpbackend: require IO::File explicitly
This is used all over the place, but may not be in the future, so ensure we explicitly load it ourselves.