public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2019-01-04	t/cgi.t: remove more redundant tests
	Most of these test cases are already in t/plack.t, which runs much faster. Just ensure the slashy corner case and search stuff work. While we're at it, avoid using the public-inbox-index command and just use the internal API to index.
2019-01-04	t/cgi.t: move expected failure tests to t/plack.t
	No point in implementing these slowly with the CGI wrapper when PSGI is sufficient for testing.
2019-01-04	t/cgi.t: move dumb HTTP git clone/fetch tests to plack.t
	No need to test this via CGI; .cgi is a wrapper around PSGI and PSGI tests are way faster.
2019-01-04	t/cgi.t: remove atom.xml test
	It is redundant with what is in t/plack.t
2019-01-04	t/cgi.t: remove redundant redirect check
	t/plack.t already has the same test.
2019-01-04	t/cgi.t: eliminate some cruft and unnecessary tests
	More of this test will be removed; we use PSGI nowadays, and most of these tests can be ported over to use PSGI and not fork+exec as much.
2018-12-29	t/cgi.t: shorten %ENV setting
	No need to write our own loop when an assignment will do.
2018-06-26	www: use undecoded paths for Message-ID extraction
	In PSGI, PATH_INFO contains URI-decoded paths, which cause problems when Message-IDs contain ambiguous characters used for routing. Instead, extract the undecoded path from REQUEST_URI and use that. Reported-by: Leah Neukirchen <leah@vuxu.org> https://public-inbox.org/meta/8736xsb5s5.fsf@vuxu.org/
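To illustrate the problem described above, here is a minimal Python sketch (not the project's Perl code). It assumes a hypothetical /$INBOX/$MESSAGE_ID/ layout and a WSGI-style environ dict; the function name `inbox_and_mid` is made up for this example. The point is only that splitting on "/" must happen before percent-decoding.

```python
from urllib.parse import unquote

def inbox_and_mid(environ):
    """Return (inbox, message_id) from the undecoded REQUEST_URI.

    Hypothetical helper: assumes URLs look like /$INBOX/$MESSAGE_ID/.
    """
    raw = environ["REQUEST_URI"].split("?", 1)[0]  # drop any query string
    # Split while "%2F" is still encoded, so a slash inside the
    # Message-ID is not mistaken for a path separator.
    parts = [p for p in raw.split("/") if p]
    if len(parts) < 2:
        return None
    return unquote(parts[0]), unquote(parts[1])

environ = {
    "REQUEST_URI": "/meta/a%2Fb@example.com/?x=1",
    "PATH_INFO": "/meta/a/b@example.com/",  # already decoded: ambiguous
}
print(inbox_and_mid(environ))
```

Routing on PATH_INFO here would see three path components instead of two, which is the ambiguity the commit avoids.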
2018-04-22	extmsg: use Xapian only for partial matches
	"LIKE" in SQLite (and other SQL implementations I've seen) is expensive with nearly 3 million messages in the archives. This caused some partial Message-ID lookups to take over 600ms on my workstation (~300ms on a faster Xeon). Cut that to under 30ms on average on my workstation by relying exclusively on Xapian for partial Message-ID lookups as we have in the past. Unlike in the past when we tried using Xapian to match partial Message-IDs, we now optimize our indexing of Message-IDs to break apart "words" in Message-IDs for searching, yielding (hopefully) "good enough" accuracy for folks who get long URLs broken across lines when copy+pasting. We'll also drop the (in retrospect) pointless stripping of "/[tTf]" suffixes for the partial match, since anybody who hits that codepath would be hitting an invalid message ID. Finally, limit wildcard expansion to prevent easy DoS vectors on short terms. And blame Pine and alpine for generating Message-IDs with low-entropy prefixes :P
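As a rough illustration of the two ideas above (splitting Message-IDs into "words" and refusing wildcards on short terms), here is a Python sketch. It does not use Xapian and is not the project's indexing code; the splitting rule, the `min_len` threshold, and both function names are assumptions chosen for the example.

```python
import re

def mid_words(mid):
    """Break a Message-ID into alphanumeric 'words' (assumed rule)."""
    return [w for w in re.split(r"[^A-Za-z0-9]+", mid) if w]

def may_wildcard(term, min_len=5):
    """Only allow prefix expansion on terms long enough to be selective.

    min_len is an arbitrary illustrative threshold, not a project value.
    """
    return len(term) >= min_len

print(mid_words("8736xsb5s5.fsf@vuxu.org"))
print(may_wildcard("Pine"), may_wildcard("8736xsb5"))
```

A short, common prefix such as "Pine" would expand to a very large number of terms, which is why a guard of this kind is useful.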
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-02-14	www: do not unescape PATH_INFO twice
	PSGI specs already require PATH_INFO to be unescaped; so our tests were wrong, too.
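A small Python sketch of why a second unescape is wrong, using a made-up Message-ID that contains a literal "%25". The variable names are illustrative only; `quote` and `unquote` are from Python's standard urllib.parse.

```python
from urllib.parse import quote, unquote

mid = "100%25@example.com"            # hypothetical Message-ID with a literal "%25"
on_the_wire = quote(mid, safe="@")    # what a client sends in the URL
path_info = unquote(on_the_wire)      # what the server hands the app (decoded once)
twice = unquote(path_info)            # decoding again corrupts the Message-ID

print(on_the_wire)
print(path_info == mid, twice == mid)
```

The first decode recovers the original Message-ID; the second turns "%25" into "%" and no longer matches any stored message.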
2016-08-14	www: do not unnecessarily escape some chars in paths
	Based on reading RFC 3986, it seems '@', ':', '!', '$', '&', "'", '(', ')', '*', '+', ',', ';', '=' are all allowed in path-absolute where we have the Message-ID. In any case, it seems '@' is fairly common in path components nowadays and too common in Message-IDs.
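A sketch of the same idea in Python, assuming the character list from the message above is passed as the `safe` set to urllib.parse.quote. The name `escape_mid` is hypothetical; the slash is deliberately left out of the safe set so it is still escaped inside a Message-ID.

```python
from urllib.parse import quote

# Characters the commit message lists as allowed in a path segment.
PATH_SAFE = "@:!$&'()*+,;="

def escape_mid(mid):
    """Percent-encode a Message-ID for use as one URL path component."""
    return quote(mid, safe=PATH_SAFE)

print(escape_mid("foo+bar@example.com"))
print(escape_mid("a b/c@example.com"))
```

With this safe set, common Message-IDs pass through unchanged, while spaces and slashes are still encoded.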
2016-07-07	www: remove old footer generation code and normalize new.html
	We now generate all of our HTML using WwwStream which forces us to have consistent headers and footers in the HTML itself. This also makes the search-capable vs search-less installs go to the new.html endpoint to maintain consistency (in case an admin decides to enable Xapian).
2016-06-17	remove dependency on IPC::Run
	We no longer depend on it for the core code, and tests are optional for users. Hopefully this makes it easier to install.
2016-05-14	rename most instances of "list" to "inbox"
	A public-inbox is NOT necessarily a mailing list, but it could serve as an input point for zero, one, or infinite mailing lists :D
2016-05-02	t/*.t: reduce -mda calls
	Process startup times are atrocious for fast tests and there's far too much setup involved. Rely on git-fast-import instead; but more work is needed in this area.
2016-04-25	remove GIT_DIR env usage in favor of --git-dir
	No need to maintain per-block environment state when we can localize it to per-command. We've had --git-dir= in git since 1.4.2 (2006-08-12) and already use it all over the place.
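The difference can be sketched in Python with subprocess (the project itself is Perl; `git_cmd` and `run_git` are hypothetical helper names). Passing --git-dir= on the command line keeps the repository choice local to one command instead of mutating the process environment.

```python
import subprocess

def git_cmd(git_dir, *args):
    """Build an argv that targets one repository via --git-dir=."""
    return ["git", "--git-dir=" + git_dir, *args]

def run_git(git_dir, *args):
    """Run git against git_dir without touching os.environ."""
    result = subprocess.run(
        git_cmd(git_dir, *args), check=True, capture_output=True, text=True
    )
    return result.stdout

print(git_cmd("/path/to/inbox.git", "rev-parse", "HEAD"))
```

Compared with setting GIT_DIR around a block of code, there is no environment state to save and restore, and no risk of it leaking into unrelated child processes.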
2016-04-15	www: redirect /$MESSAGE_ID/f/ endpoints
	Quote-folding was a major design mistake pre-1.0. Since this project is still in its infancy and unlikely to be in wide use at the moment, redirect the /f/ endpoints back to the plain message.
2016-03-03	t/*.t: use identifiable tempdir names
	This should make identifying leftover directories due to SIGKILL-ed tests easier.
2016-02-04	t/cgi.t: fix broken test for dumb HTTP
	This should not be dependent on what is in the users' $HOME config, oops.
2016-02-02	www: support git cloning via dumb HTTP
	This is enabled by default, for now. Smart HTTP cloning support will be added later, but it will be optional since it can be highly CPU and memory intensive.
2015-09-06	update copyright headers and email addresses
	In the future, it should be possible to use this: git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' UPDATE_COPYRIGHT_USE_INTERVALS=2 xargs /path/to/gnulib/build-aux/update-copyright
2015-09-03	feed: use application/atom+xml for Content-Type
	This is the correct Content-Type for Atom feeds, especially since we updated to use ".atom" as the suffix.
2015-09-03	ExtMsg: 300 to external mailing list archives
	Since cross-posting is inevitable, we shall link to external message archives for interoperability.
2015-09-01	completely revamp URL structure to shorten permalinks
	This allows common /m/ links to be used without a prefix, saving 2 precious bytes for permalinks and raw messages. Old URLs continue to redirect.
2015-09-01	implement per-thread Atom feeds
	This allows users to subscribe to only a single thread with their feed reader without subscribing to the rest of the list. Update our endpoint notes while we're at it.
2015-08-27	wire up to display non-suffixed Message-ID links
	These URLs are preferable in case somebody decides to get cute and use a suffix we would've used to prevent others from linking to their message. The common /m/$MESSAGE_ID/ URLs are now 4 characters shorter so may fit better on terminals.
2015-08-27	wire up shorter, less ambiguous URLs
	We will prefer URLs without suffixes for now to avoid ambiguity in case a Message-ID ends with ".html", ".txt", ".mbox.gz" or any other suffix we may use. Static file compatibility is preserved by using a trailing slash as most servers can/will fall back to an index.html file in this case. For raw text files, we will follow gmane's lead with "/raw"
2015-08-21	switch to gzipped mboxes
	Mboxes may be huge, so only support downloading gzipped mboxes to save bandwidth and to get free checksumming. Streaming output means we should not be wasting too much memory on this unless the chosen server sucks.
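A minimal Python sketch of streaming a gzipped mbox, assuming messages arrive one at a time as bytes. It uses zlib's gzip container (wbits=31) so only one message is held in memory at a time; the function name and the "From " separator line are placeholders, and "From " escaping inside message bodies is omitted for brevity.

```python
import gzip
import zlib

FROM_LINE = b"From mbox@localhost Thu Jan  1 00:00:00 1970\n"  # placeholder separator

def gzipped_mbox(messages):
    """Yield gzip-compressed chunks of an mbox built from an iterable of messages."""
    z = zlib.compressobj(9, zlib.DEFLATED, 31)  # wbits=31 selects the gzip format
    for msg in messages:
        chunk = z.compress(FROM_LINE + msg + b"\n")
        if chunk:  # the compressor may buffer and return nothing yet
            yield chunk
    yield z.flush()  # emits remaining data plus the CRC32 trailer

msgs = [b"Subject: a\n\nhello\n", b"Subject: b\n\nworld\n"]
blob = b"".join(gzipped_mbox(msgs))
print(gzip.decompress(blob).count(b"From mbox@localhost"))
```

The gzip trailer carries a CRC32 of the uncompressed data, which is the "free checksumming" mentioned above.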
2015-08-21	support dumping thread as an mbox
	Some folks may not want to download and install Perl code like ssoma, so allow downloading an mbox containing the entire thread.
2015-08-12	view: consistent ordering of Cc: addresses
	This fixes a minor test failure in t/cgi.t. Tested with perl 5.18.2-2ubuntu1 on Ubuntu 14.04.3 LTS.
2014-04-21	config: use description file for gitweb
	Do not repeat ourselves, just use the same description file gitweb uses to avoid surprising users.
2014-04-21	feed: there is only one atom feed, with all messages
	This is not a blog. All posts, whether replies or not, carry equal weight.
2014-04-20	use ORIGINAL_RECIPIENT once again
	It should be common for a single user to be subscribed to multiple addresses/lists, so we must use the address before alias expansion. This partially reverts commit b949afc9edf89dd494cac6255c78b124d58e11a5
2014-04-19	mda: rename PI_FAILBOX to PI_EMERGENCY
	The emergency destination may be Maildir. A Maildir emergency destination is better for volatile data which is written to and deleted-from frequently.
2014-04-19	cgi: index pages allow iterating some pagination
	This allows WWW readers to slowly page through the entire history of the mailing list.
2014-04-15	Revert "cgi: relax path restriction for top-level"
	CGI mounts should probably handle this internally. We're reverting this since it adds too much potential for abuse with fake/extra prefixes in the URL. We also need to reorder our redirect handling as a result. This reverts commit c394de9f2c91c2c5ed1f7832a5a7cc0206120b7f.
2014-04-14	cgi: fix up top-level index
	We do not have all messages in the top-level index (and we need to adjust the test while we're at it).
2014-04-14	cgi: 301 for list-indices without trailing slash
	It is common to type upper-level URLs without the slash; redirect users to the correct page for usability.
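The redirect can be sketched as a small Python function. This is an illustration only: the function name, the `known_inboxes` set, and the (status, headers) return shape are assumptions, not the CGI script's actual interface.

```python
def redirect_missing_slash(path, known_inboxes):
    """Return a 301 response tuple for /$LISTNAME without a trailing slash."""
    name = path.lstrip("/")
    if name in known_inboxes and not path.endswith("/"):
        return "301 Moved Permanently", [("Location", path + "/")]
    return None  # no redirect needed; continue normal routing

print(redirect_missing_slash("/meta", {"meta"}))
print(redirect_missing_slash("/meta/", {"meta"}))
```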
2014-04-14	t/cgi: cleanup: no need for additional block
	Not sure what I was thinking...
2014-04-12	cgi: ensure we unescape MIDs correctly in URLs
	MIDs may have strange characters in them, so we need to handle escaping/unescaping properly to avoid broken links or worse.
2014-04-12	cgi: relax path restriction for top-level
	We may have something like /foo.cgi/m/$MID.html in there.
2014-04-12	cgi: rename to have .cgi suffix
	This makes it easier to configure for systems which determine a script is a CGI script based on suffix.
2014-04-11	cgi: wire up HTML pages for messages
	These need better tests and verification, but it's something for now.
2014-04-11	cgi: update feed/view and tests for shorter URLs
	Code should be consistent with the design docs (and we will need better tests).
2014-04-11	cgi: /$LISTNAME/ and /$LISTNAME/index.html are equal
	This prevents ambiguity when switching URLs between static file servers and CGI. The /$LISTNAME/index.html URL appearing in the wild is inevitable because of our static file server support. Worse yet, there's no easy/consistent way to get all installations to detect and 301 them to the shorter /$LISTNAME/. So we make the CGI support /$LISTNAME/index.html. The downside of this is the potential duplicate entry in all caches.
2014-04-10	cgi: implement get_mid_txt
	This is essential when telling people to use something like: curl $URL | git am
2014-04-10	cgi: wire up index + tests
	Remove the specified /all.html while we're at it; we only have /all.atom.xml because it's convenient for feed readers.
2014-04-05	get a basic CGI feed sender running
	We should be able to wire up the rest, soon.