public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2018-04-18	use %H consistently to disable abbreviations
	We generally do not want git to waste time finding abbreviations and we do not want the possibility of them becoming ambiguous over time, either.
2018-04-18	feed: respect feedmax, again
	Gigantic feeds probably make some clients unhappy, clamp it to what it was in the past. Fixes: b9534449ecce2c59 ("view: avoid offset during pagination")
2018-04-03	view: avoid offset during pagination
	OFFSET in SQLite gets painful to deal with. Instead, rely on timestamps (from Received:) for pagination. This also sets us up for more precise Date searching in case we want it.
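The timestamp-based approach described above can be sketched as keyset pagination. This is only an illustration in Python with the standard `sqlite3` module, not the project's Perl code: the table `msgs`, its columns `num` and `ts`, and the function `page_before` are hypothetical names chosen for this example.

```python
import sqlite3

def page_before(conn, before_ts=None, limit=3):
    """Return up to `limit` (num, ts) rows older than `before_ts`, newest first."""
    if before_ts is None:
        # First page: no boundary yet, just take the newest rows.
        cur = conn.execute(
            "SELECT num, ts FROM msgs ORDER BY ts DESC LIMIT ?", (limit,))
    else:
        # Later pages: seek past the last timestamp seen instead of using OFFSET.
        cur = conn.execute(
            "SELECT num, ts FROM msgs WHERE ts < ? ORDER BY ts DESC LIMIT ?",
            (before_ts, limit))
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msgs (num INTEGER PRIMARY KEY, ts INTEGER)")
conn.execute("CREATE INDEX idx_msgs_ts ON msgs (ts)")
conn.executemany("INSERT INTO msgs (num, ts) VALUES (?, ?)",
                 [(n, 1000 + n * 10) for n in range(1, 8)])

first = page_before(conn)
# The timestamp of the last row on a page becomes the boundary for the next one.
second = page_before(conn, before_ts=first[-1][1])
```

With an index on the timestamp column, each page is a short index seek regardless of how deep the reader has paged. One caveat of this simplified sketch: rows sharing the boundary timestamp would be skipped unless a tie-breaker column is added to the comparison.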
2018-04-02	www: rework query responses to avoid COUNT in SQLite
	In many cases, we do not care about the total number of messages. It's a rather expensive operation in SQLite (Xapian only provides an estimate). For LKML, this brings top-level /$INBOX/ loading time from ~375ms to around 60ms on my system. Days ago, this operation was taking 800-900ms(!) for me before introducing the SQLite overview DB.
2018-03-30	feed: optimize query for feeds, too
	This is a smaller improvement than the landing /$INBOX/ page because full message bodies are shown; but still saves around 100ms for my system with LKML.
2018-03-27	view: depend on SearchMsg for Message-ID
	Since we need to handle messages with multiple and duplicate Message-ID headers, our thread skeleton display must account for that. Since we have a "preferred" Message-ID in case of conflicts, use it as the UUID in an Atom feed so readers do not get confused by conflicts.
2018-03-23	feed: fix new.html for v2
	I forget this endpoint is still accessible (even if not linked). This also simplifies new.html all around and removes some unused clutter from the old days while we're at it.
2018-03-22	feed: $INBOX/new.atom endpoint supports v2 inboxes
	We can no longer rely on tree name lookups for v2. This also optimizes v1 by relying on git blob object_id lookups while avoiding process spawning overhead for "git log".
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-01-10	introduce PublicInbox::MIME wrapper class
	This should fix problems with multipart messages where text/plain parts lack a header. cf. git clone --mirror https://github.com/rjbs/Email-MIME.git refs/pull/28/head In the future, we may still introduce a streaming interface to reduce memory usage on large emails.
2016-12-17	feed: support publicinbox.<name>.feedmax
	This allows users to customize by using smaller or larger Atom feeds than the default value of 25 entries.
2016-12-03	atom: switch to getline/close for response bodies
	This will let us stream larger Atom document bodies without wasting too much memory and reduce the number of round-trip requests needed to get necessary information. Hopefully clients are using streaming (SAX) parsers, too. This is the final transition in the core public-inbox code to allow migrating to a "pull"-based body streaming scheme which allows an HTTP server to respond appropriately to backpressure from slow clients.
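The "pull"-based getline/close scheme mentioned above can be illustrated roughly as follows. PSGI is a Perl interface, so this Python version is only an analogy: the class `FeedBody` and the driver `serve` are hypothetical names, and the sketch assumes the convention that `getline` returns the next chunk, or nothing once the body is exhausted.

```python
class FeedBody:
    """Hypothetical pull-based body: the server asks for one chunk at a time."""

    def __init__(self, entries):
        self._entries = iter(entries)
        self._state = "header"
        self.closed = False

    def getline(self):
        # Produce only one chunk per call, so nothing is buffered ahead of demand.
        if self._state == "header":
            self._state = "entries"
            return "<feed>\n"
        if self._state == "entries":
            entry = next(self._entries, None)
            if entry is not None:
                return f"<entry>{entry}</entry>\n"
            self._state = "done"
            return "</feed>\n"
        return None  # end of body

    def close(self):
        self.closed = True


def serve(body, write):
    """Stand-in for an HTTP server loop: pull a chunk only when ready to write."""
    try:
        while (chunk := body.getline()) is not None:
            write(chunk)
    finally:
        body.close()


out = []
body = FeedBody(["a", "b"])
serve(body, out.append)
```

Because the server decides when to call `getline`, a slow client simply delays the next call; the application never generates more of the document than the server is ready to send.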
2016-08-14	www: do not double-clean Message-IDs from internal DBs
	Ensure we usually strip one level of '<>' from Message-IDs, since our internal SQLite, Xapian, and SHA-1 storage all assume that. Realistically, we screw up if somebody has '<<' or '>>', but those are screwed up mail clients and we can deal with it another time. Currently, this means some messages with '>>' in References or Message-Id are not handled correctly, yet, but we match the behavior of Mail::Thread in keeping the extra '>'.
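To make the "one level" point concrete, here is a minimal Python sketch. It is not the project's Perl implementation, and `strip_one_level` is a hypothetical helper name used only for illustration.

```python
def strip_one_level(mid):
    """Remove a single enclosing '<' ... '>' pair from a Message-ID, if present."""
    mid = mid.strip()
    if mid.startswith("<") and mid.endswith(">"):
        return mid[1:-1]
    return mid


once = strip_one_level("<<a@example.com>>")
# Cleaning an already-cleaned value changes it again, so it no longer
# matches a key that was stored after a single cleaning pass.
twice = strip_one_level(once)
```

In this sketch, a value read back from an internal DB has already been cleaned once; cleaning it a second time yields a different string for odd inputs such as doubled brackets, which is the mismatch the commit avoids.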
2016-08-14	www: do not unnecessarily escape some chars in paths
	Based on reading RFC 3986, it seems '@', ':', '!', '$', '&', "'", '(', ')', '*', '+', ',', ';', '=' are all allowed in path-absolute where we have the Message-ID. In any case, it seems '@' is fairly common in path components nowadays and too common in Message-IDs.
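As a rough illustration of the escaping rule above, the following Python sketch uses the standard `urllib.parse.quote` with the RFC 3986 sub-delims plus ':' and '@' marked as safe. The names `PATH_SAFE` and `escape_mid_for_path` are hypothetical, and this is not the project's Perl code.

```python
from urllib.parse import quote

# Characters RFC 3986 permits unescaped in a path segment, beyond the
# unreserved set (letters, digits, '-', '.', '_', '~') that quote() always keeps.
PATH_SAFE = "@:!$&'()*+,;="

def escape_mid_for_path(mid):
    """Percent-encode a Message-ID for use as a single URL path segment."""
    # '/' is deliberately left out of the safe set, so it is escaped and
    # cannot split the Message-ID across path segments.
    return quote(mid, safe=PATH_SAFE)


plain = escape_mid_for_path("20160516073750.GA11931@dcvr.yhbt.net")
odd = escape_mid_for_path("a/b c@x")
```

A typical Message-ID passes through unchanged, which keeps URLs readable, while characters such as '/' and space are still escaped.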
2016-08-06	www: use <hr> to delimit messages in /new.html view, too
	This is necessary to delimit messages when viewed without threading.
2016-07-09	feed: remove dead code and unneeded use
	We've cleaned up our code in recent days and WwwStream provides a consistent header for our HTML pages.
2016-07-09	www: cleanup parameter passing
	Reduce the size of hashes a bit and drop some unneeded hash lookups for uncommon paths.
2016-07-07	www: remove old footer generation code and normalize new.html
	We now generate all of our HTML using WwwStream which forces us to have consistent headers and footers in the HTML itself. This also makes the search-capable vs search-less installs go to the new.html endpoint to maintain consistency (in case an admin decides to enable Xapian).
2016-07-06	feed: fix links to attachments in Atom feed
	Oops...
2016-07-06	www: use HTML <hr> instead of XHTML <hr />
	We only need XHTML-compatibility inside Atom feeds, as anecdotally, feed readers are stricter than normal browsers and some do not support HTML, only XHTML. So we will continue to accommodate them. However, we favor HTML elsewhere since it tends to be smaller than the equivalent well-formed XHTML.
2016-07-02	www: remove Plack::Request dependency entirely
	Lighter and ever-so-slightly faster! Most importantly, this won't do non-obvious stuff behind our backs like trying to parse a POST request body for a query string param.
2016-07-02	inbox: base_url method takes PSGI env hashref instead
	This is lighter and we can work further towards eliminating our Plack::Request dependency entirely.
2016-06-30	www_stream: add response wrapper sub
	This encapsulates an entire PSGI response array, hopefully making it easier to generate responses and avoid typos when setting the Content-Type.
2016-06-30	feed: add $INBOX/new.html endpoint
	This acts like the Atom feed; but should be viewable directly from browsers.
2016-06-30	view: merge $state hash with existing $ctx
	This reduces the level of indirection to reach certain objects within the hash and there are no namespace or lifetime conflicts anyways.
2016-06-30	view: show thread context in the thread-aware flat view
	This lets users have a small window of the context of the current message relative to other threads.
2016-06-30	www: use WwwStream for dumping thread and search views
	This allows the HTTP server to react to backpressure from slow clients when writing. As a side effect, this also makes it easier for us to maintain a consistent header/footer across our HTML.
2016-06-25	address: remove Address::from_name
	Address::names is sufficient to handle what from_name did.
2016-06-20	feed: various object-orientation cleanups
	Favor Inbox objects as our primary source of truth to simplify our code. This increases our coupling with PSGI to make it easier to write tests in the future. A lot of this code was originally designed to be usable standalone without PSGI or CGI at all; but that might increase development effort.
2016-06-20	feed: remove dependence on fh->write for streaming
	We'll be switching to a getline/close response body to give the HTTP server more control when dealing with slow clients.
2016-06-20	feed: avoid needless method dispatches on 404
	We overuse streaming, here. Allow Content-Length to be calculated in this case.
2016-06-17	feed: split out top-of-page generation
	This will eventually allow us to reuse code to generate a common header.
2016-05-30	www: remove a few more Plack::Request dependencies
	Still a work in progress, but SearchView no longer depends on Plack::Request at all and Feed is getting there. We now parse all query parameters up front, but we may do that lazily again in the future.
2016-05-25	remove Email::Address dependency
	git has stricter requirements for ident names (no '<>') which Email::Address allows. Even in 1.908, Email::Address also has an incomplete fix for CVE-2015-7686 with a DoS-able regexp for comments. Since we don't care for or need all the RFC compliance of Email::Address, avoiding it entirely may be preferable. Email::Address will still be installed as a requirement for Email::MIME, but it is only used by Email::MIME::header_str_set, which we do not use.
2016-05-21	localize $/ in more places to avoid potential problems
	This hopefully makes the intent of the code clearer, too. The HTTP use of the numeric reference for getline caused problems in Git.pm, already.
2016-05-19	www: support downloading attachments
	This can be useful for lists where the convention is to attach (rather than inline) patches into the message body.
2016-05-18	feed: inline feed entry generation
	Remove unnecessary wrapper subroutines and constants which are only used once.
2016-05-16	www: fix for running under mount paths
	We try to avoid issues like these by using relative URLs in hrefs, but we can't avoid the problem with Location: for redirects and Atom feeds which are likely to be rehosted elsewhere. We also reorder some of the code to work around a weird issue on the psgi-plack mailing list: <20160516073750.GA11931@dcvr.yhbt.net> (Somewhere on https://groups.google.com/group/psgi-plack but it's probably not bookmarkable)
2016-05-14	rename most instances of "list" to "inbox"
	A public-inbox is NOT necessarily a mailing list, but it could serve as an input point for zero, one, or infinite mailing lists :D
2016-04-25	www: add rel=next and rel=prev navigation hints
	This can make navigation easier with some browsers or browser extensions. ref: https://www.w3.org/TR/html/links.html#sequential-link-types
2016-04-13	www: stop generating /$MESSAGE_ID/f/ links
	Quote-folding can be detrimental as it fails to hide the real problem of over-quoting. Over-quoting wastes bandwidth and space for all readers, not just WWW readers of the public-inbox. So hopefully removing quote-folding support from the WWW interface can shame those repliers into quoting only relevant portions of what they reply to.
2016-04-02	www: various style changes and comment updates
	Reduce stack depth of arguments and rely more on state hashref to store response state. We may end up shoving everything in ctx eventually.
2016-03-12	feed: fix brain farts in new_oneline removal
	Ugh... Fixes: 476fc666c223 (reduce "PublicInbox::Hval->new_oneline" use)
2016-03-12	reduce "PublicInbox::Hval->new_oneline" use
	It's probably a bad idea to strip extraneous whitespace from some headers as an extra space may convey useful information. Newlines don't seem to be preserved by Email::MIME or Email::Simple anyways, so there's no danger in breaking formatting.
2016-03-05	feed: remove unnecessary encoding lookup
	We handle encoding-related things elsewhere.
2016-03-03	use raw header for Message-ID
	Message-IDs should not be MIME encoded, but in case they are, use the raw form for compatibility with ssoma and possibly other tools. This prevents a potential problem where a malicious client could confuse our storage layer into indexing incorrect contents.
2016-02-28	reduce calls to close unless error checks are needed
	We can rely on timely auto-destruction based on reference counting, reducing the chance of redundant close(2) calls which may hit the wrong FD. We do care about certain close calls (e.g. writing to a buffered IO handle) if we require error-checking for write-integrity. In other cases, let things go out-of-scope so they can be freed automatically after use.
2016-02-25	remove direct CGI.pm support
	Relying on Plack::Handler::CGI is much easier for long-term maintenance and development. Nowadays, we even include our own httpd implementation to facilitate easier deployment with PSGI/Plack.
2016-02-13	feed: favor relative URL to Atom feed in HTML
	Normal browsers should understand relative path names. Atom feeds may be hosted externally and seem to need full URLs.
2016-02-08	feed: declare alternate link to HTML interface
	It seems we need to declare "alternate" and "text/html" at least for feed readers.