public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2020-02-07	tests: switch to XML::TreePP for testing Atom feeds
	XML::Feed pulls in a lot of dependencies, some of which are XS. That makes testing with blead or any non-OS-supplied Perl installation more time-consuming and more difficult because of the need to have development headers and libraries for libexpat1 or libxml2. Performance from libexpat1 or libxml2 for our small test cases isn't relevant, either, and the pure Perl XML::TreePP seems up to the task. It's also available in CentOS 7.x, FreeBSD 11.x, and Debian, at least.
2020-02-06	treewide: run update-copyrights from gnulib for 2019
	I didn't wait until September to do it, this year!
2020-01-11	make Plack optional for non-WWW and non-httpd users
	Some users just want to run -mda, -watch, and/or -nntpd. Let them run just those without forcing them to pull in a bunch of dependencies.
2020-01-06	treewide: "require" + "use" cleanup and docs
	There's a bunch of leftover "require" and "use" statements we no longer need and can get rid of, along with some excessive imports via "use". IO::Handle usage isn't always obvious, so add comments describing why a package loads it. Along the same lines, document the tmpdir support as the reason we depend on File::Temp 0.19, even though every Perl 5.10.1+ user has it. While we're at it, favor "use" over "require", since it gives us extra compile-time checking.
2020-01-05	tests: remove some "git config" calls after "git init"
	Creating a hash and iterating through it just to run "git config" is ugly and slow. Just write out the text file in a human-friendly way since the git-config file format is stable and won't break randomly.
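A minimal sketch of the same idea, in Python rather than the project's Perl: write the config text in one go instead of spawning "git config" once per key. The helper name `write_inbox_config` and the sample values are hypothetical, and the `[publicinbox "NAME"]` section layout is assumed here rather than taken from this commit.

```python
import os
import tempfile

def write_inbox_config(path, name, address, inboxdir):
    """Write a git-config formatted file directly (hypothetical helper)."""
    # Assumed layout: one [publicinbox "NAME"] section with tab-indented keys.
    text = (
        f'[publicinbox "{name}"]\n'
        f"\taddress = {address}\n"
        f"\tinboxdir = {inboxdir}\n"
    )
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return text

with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "config")
    cfg_text = write_inbox_config(
        cfg_path, "test", "test@example.com", "/path/to/inbox"
    )
    # Read it back to confirm the file holds exactly what we wrote.
    with open(cfg_path, encoding="utf-8") as fh:
        written = fh.read()
```

One file write replaces one process spawn per key, which is where the speedup in the test suite would come from.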
2019-12-19	tests: move t/common.perl to PublicInbox::TestCommon
	We want to be able to use run_script with *.t files, so t/common.perl putting subs into the top-level "main" namespace won't work. Instead, make it a module which uses Exporter like other libraries.
2019-11-24	tests: use File::Temp->newdir instead of tempdir()
	We'll also introduce a tmpdir() API to give tempdirs consistent names.
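As a rough illustration (Python, not the project's Perl), a tmpdir() helper along these lines could name each directory after the test that created it and clean up when the returned object goes away, much like File::Temp->newdir. The "pi-<test>-<pid>-" prefix is an assumed naming scheme, not necessarily the one used here.

```python
import os
import sys
import tempfile

def tmpdir(script_path=None):
    """Return a TemporaryDirectory whose name identifies the calling test."""
    base = os.path.basename(script_path or sys.argv[0] or "test")
    stem = os.path.splitext(base)[0] or "test"
    # Assumed naming scheme: pi-<test name>-<pid>-<random suffix>
    return tempfile.TemporaryDirectory(prefix=f"pi-{stem}-{os.getpid()}-")

tmp = tmpdir("t/cgi.t")
tmp_name = tmp.name
tmp_basename = os.path.basename(tmp_name)
tmp.cleanup()  # explicit here; it also happens when the object is collected
```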
2019-11-16	t/common: introduce run_script wrapper for t/cgi.t
	This will give us a consistent interface for running test scripts in more performant ways while still letting us recreate real-world behavior via spawn() (fork + execve), if needed. The default run_mode (1) is faster and can run within the test process with some minor adjustments to our code to avoid global state. This avoids the significant overhead of Perl code loading, parsing and compilation phases.
2019-10-16	config: support "inboxdir" in addition to "mainrepo"
	"mainrepo" ws a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-09-09	run update-copyrights from gnulib for 2019

2019-05-23	v1writable: retire in favor of InboxWritable
	In retrospect, introducing V1Writable was unnecessary and InboxWritable->importer is in a better position to abstract away differences between v1 and v2 writers. So teach InboxWritable to initialize inboxes and get rid of V1Writable.
2019-05-15	lazy load Xapian and make it optional for v2
	More tests work without Search::Xapian, now. Usability issues still need to be fixed.
2019-05-14	tests: get rid of unnecessary Cwd module use
	We only need it for tests that chdir, and maybe for ENV{PATH} portability (dash seems fine, not sure about others). v2: revert change to solver_git.t for FreeBSD 11.2 and document
2019-01-04	t/cgi.t: remove more redundant tests
	Most of these test cases are in t/plack.t, already; and that runs much faster. Just ensure the slashy corner case and search stuff works. While we're at it, avoid using the public-inbox-index command and just use the internal API to index.
2019-01-04	t/cgi.t: move expected failure tests to t/plack.t
	No point in implementing these slowly with the CGI wrapper when PSGI is sufficient for testing.
2019-01-04	t/cgi.t: move dumb HTTP git clone/fetch tests to plack.t
	No need to test this via CGI; .cgi is a wrapper around PSGI and PSGI tests are way faster.
2019-01-04	t/cgi.t: remove atom.xml test
	It is redundant with what is in t/plack.t
2019-01-04	t/cgi.t: remove redundant redirect check
	t/plack.t already has the same test.
2019-01-04	t/cgi.t: eliminate some cruft and unnecessary tests
	More of this test will be removed; we use PSGI nowadays, and most of these tests can be ported over to use PSGI and not fork+exec as much.
2018-12-29	t/cgi.t: shorten %ENV setting
	No need to write our own loop when an assignment will do.
2018-06-26	www: use undecoded paths for Message-ID extraction
	In PSGI, PATH_INFO contains URI-decoded paths, which cause problems when Message-IDs contain ambiguous characters used for routing. Instead, extract the undecoded path from REQUEST_URI and use that. Reported-by: Leah Neukirchen <leah@vuxu.org> https://public-inbox.org/meta/8736xsb5s5.fsf@vuxu.org/
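To see why the decoded path is a problem, here is a small Python sketch (not the project's Perl). The function name and the "/$MESSAGE_ID/raw" URL layout are assumptions for illustration; the point is only that splitting must happen before decoding.

```python
from urllib.parse import unquote

def message_id_from_request_uri(request_uri):
    """Extract the first path component from the raw, still-encoded URI."""
    raw_path = request_uri.split("?", 1)[0]          # drop any query string
    first = raw_path.lstrip("/").split("/", 1)[0]    # split BEFORE decoding
    return unquote(first)

# A Message-ID containing "/" arrives with that character encoded as %2F.
uri = "/foo%2Fbar@example.com/raw"
mid = message_id_from_request_uri(uri)

# Splitting an already-decoded path (what PATH_INFO holds) cuts the ID short.
decoded_first = unquote(uri).lstrip("/").split("/", 1)[0]
```

Here `mid` keeps the embedded slash, while `decoded_first` stops at it, so routing on the decoded path would look up the wrong Message-ID.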
2018-04-22	extmsg: use Xapian only for partial matches
	"LIKE" in SQLite (and other SQL implementations I've seen) is expensive with nearly 3 million messages in the archives. This caused some partial Message-ID lookups to take over 600ms on my workstation (~300ms on a faster Xeon). Cut that to below under 30ms on average on my workstation by relying exclusively on Xapian for partial Message-ID lookups as we have in the past. Unlike in the past when we tried using Xapian to match partial Message-IDs; we now optimize our indexing of Message-IDs to break apart "words" in Message-IDs for searching, yielding (hopefully) "good enough" accuracy for folks who get long URLs broken across lines when copy+pasting. We'll also drop the (in retrospect) pointless stripping of "/[tTf]" suffixes for the partial match, since anybody who hits that codepath would be hitting an invalid message ID. Finally, limit wildcard expansion to prevent easy DoS vectors on short terms. And blame Pine and alpine for generating Message-IDs with low-entropy prefixes :P
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-02-14	www: do not unescape PATH_INFO twice
	PSGI specs already require PATH_INFO to be unescaped; so our tests were wrong, too.
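A short Python demonstration of what goes wrong with a second unescape (illustrative only; the Message-ID is made up). A literal "%" in the ID is sent as "%25", the server decodes it once into PATH_INFO, and decoding again corrupts it.

```python
from urllib.parse import quote, unquote

message_id = "a%40b@example.com"        # made-up ID containing a literal "%40"
on_the_wire = quote(message_id, safe="@")

path_info = unquote(on_the_wire)        # what the server hands the app
double_decoded = unquote(path_info)     # the bug: unescaping a second time
```

`path_info` round-trips to the original ID, while `double_decoded` turns the literal "%40" into "@" and no longer matches any stored message.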
2016-08-14	www: do not unecessarily escape some chars in paths
	Based on reading RFC 3986, it seems '@', ':', '!', '$', '&', "'", '(', ')', '*', '+', ',', ';', '=' are all allowed in path-absolute where we have the Message-ID. In any case, it seems '@' is fairly common in path components nowadays and too common in Message-IDs.
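In Python terms (a sketch, not the project's Perl escaping code), leaving those characters alone amounts to passing them as the `safe` set to `urllib.parse.quote`. The names `PATH_SAFE` and `escape_mid` are hypothetical.

```python
from urllib.parse import quote

# RFC 3986 sub-delims plus ":" and "@"; all are valid unescaped in a pchar.
PATH_SAFE = "@:!$&'()*+,;="

def escape_mid(message_id):
    """Percent-encode a Message-ID for use as a single path component."""
    # "/" is deliberately NOT in the safe set, so it becomes %2F.
    return quote(message_id, safe=PATH_SAFE)

plain = escape_mid("20160814$x+y@example.com")
slashed = escape_mid("a/b@example.com")
spaced = escape_mid("a b@example.com")
```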
2016-07-07	www: remove old footer generation code and normalize new.html
	We now generate all of our HTML using WwwStream which forces us to have consistent headers and footers in the HTML itself. This also makes the search-capable vs search-less installs go to the new.html endpoint to maintain consistency (in case an admin decides to enable Xapian).
2016-06-17	remove dependency on IPC::Run
	We no longer depend on it for the core code, and tests are optional for users. Hopefully this makes public-inbox easier to install.
2016-05-14	rename most instances of "list" to "inbox"
	A public-inbox is NOT necessarily a mailing list, but it could serve as an input point for zero, one, or infinite mailing lists :D
2016-05-02	t/*.t: reduce -mda calls
	Process startup times are atrocious for fast tests and there's far too much setup involved. Rely on git-fast-import instead; but more work is needed in this area.
2016-04-25	remove GIT_DIR env usage in favor of --git-dir
	No need to maintain per-block environment state when we can localize it to per-command. We've had --git-dir= in git since 1.4.2 (2006-08-12) and already use it all over the place.
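The difference is easiest to see in how a command gets built. A minimal Python sketch (the helper name `git_argv` and the repository path are hypothetical): the repository is named on each command line, so nothing in the process environment has to be set and restored.

```python
def git_argv(git_dir, *args):
    """Build a git command line scoped to one repository via --git-dir."""
    return ["git", f"--git-dir={git_dir}", *args]

argv = git_argv("/path/to/inbox.git", "rev-parse", "HEAD")
# The list can be handed to subprocess.run(argv) without touching
# os.environ["GIT_DIR"], so other commands in the process are unaffected.
```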
2016-04-15	www: redirect /$MESSAGE_ID/f/ endpoints
	Quote-folding was a major design mistake pre-1.0. Since this project is still in its infancy and unlikely to be in wide use at the moment, redirect the /f/ endpoints back to the plain message.
2016-03-03	t/*.t: use identifiable tempdir names
	This should make identifying leftover directories due to SIGKILL-ed tests easier.
2016-02-04	t/cgi.t: fix broken test for dumb HTTP
	This should not be dependent on what is in the users' $HOME config, oops.
2016-02-02	www: support git cloning via dumb HTTP
	This is enabled by default, for now. Smart HTTP cloning support will be added later, but it will be optional since it can be highly CPU and memory intensive.
2015-09-06	update copyright headers and email addresses
	In the future, it should be possible to use this: git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' UPDATE_COPYRIGHT_USE_INTERVALS=2 xargs /path/to/gnulib/build-aux/update-copyright
2015-09-03	feed: use application/atom+xml for Content-Type
	This is the correct Content-Type for Atom feeds, especially since we updated to use ".atom" as the suffix.
2015-09-03	ExtMsg: 300 to external mailing list archives
	Since cross-posting is inevitable, we shall link to external message archives for interoperability.
2015-09-01	completely revamp URL structure to shorten permalinks
	This allows common /m/ links to be used without a prefix, saving 2 precious bytes for permalinks and raw messages. Old URLs continue to redirect.
2015-09-01	implement per-thread Atom feeds
	This allows users to subscribe to only a single thread with their feed reader without subscribing to the rest of the list. Update our endpoint notes while we're at it.
2015-08-27	wire up to display non-suffixed Message-ID links
	These URLs are preferable in case somebody decides to get cute and use a suffix we would've used to prevent others from linking to their message. The common /m/$MESSAGE_ID/ URLs are now 4 characters shorter so may fit better on terminals.
2015-08-27	wire up shorter, less ambiguous URLs
	We will prefer URLs without suffixes for now to avoid ambiguity in case a Message-ID ends with ".html", ".txt", ".mbox.gz" or any other suffix we may use. Static file compatibility is preserved by using a trailing slash as most servers can/will fall back to an index.html file in this case. For raw text files, we will follow gmane's lead with "/raw"
2015-08-21	switch to gzipped mboxes
	Mboxes may be huge, so only support downloading gzipped mboxes to save bandwidth and to get free checksumming. Streaming output means we should not be wasting too much memory on this unless the chosen server sucks.
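A rough Python sketch of streaming gzip output (not the project's Perl implementation): each message is compressed as it is produced, so memory use stays bounded, and the gzip trailer carries a CRC32 that the client checks on decompression. The function name `gzip_stream` and the sample messages are made up.

```python
import gzip
import zlib

def gzip_stream(chunks):
    """Yield gzip-compressed bytes for an iterable of byte chunks."""
    # wbits=31 selects the gzip container (header plus CRC32/length trailer).
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:                 # the compressor may buffer small inputs
            yield out
    yield comp.flush()          # emits remaining data and the trailer

messages = [
    b"From mboxrd@z Thu Jan  1 00:00:00 1970\nSubject: one\n\nfirst\n\n",
    b"From mboxrd@z Thu Jan  1 00:00:00 1970\nSubject: two\n\nsecond\n\n",
]
body = b"".join(gzip_stream(messages))
restored = gzip.decompress(body)    # raises if the CRC32 does not match
```

In a real server the generator's output would be written to the client chunk by chunk instead of being joined in memory.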
2015-08-21	support dumping thread as an mbox
	Some folks may not want to download and install Perl code like ssoma, so allow downloading an mbox containing the entire thread.
2015-08-12	view: consistent ordering of Cc: addresses
	This fixes a minor test failure in t/cgi.t. Tested with perl 5.18.2-2ubuntu1 on Ubuntu 14.04.3 LTS.
2014-04-21	config: use description file for gitweb
	Do not repeat ourselves, just use the same description file gitweb uses to avoid surprising users.
2014-04-21	feed: there is only one atom feed, with all messages
	This is not a blog. All posts, whether replies or not, carry equal weight.
2014-04-20	use ORIGINAL_RECIPIENT once again
	It should be common for a single user to be subscribed to multiple addresses/lists, so we must use the address before alias expansion. This partially reverts commit b949afc9edf89dd494cac6255c78b124d58e11a5.
2014-04-19	mda: rename PI_FAILBOX to PI_EMERGENCY
	The emergency destination may be Maildir. A Maildir emergency destination is better for volatile data which is written to and deleted-from frequently.
2014-04-19	cgi: index pages allow iterating some pagination
	This allows WWW readers to slowly page through the entire history of the mailing list.
2014-04-15	Revert "cgi: relax path restriction for top-level"
	CGI mounts should probably handle this internally. We're reverting this since it adds too much potential for abuse with fake/extra prefixes in the URL. We also need to reorder our redirect handling as a result. This reverts commit c394de9f2c91c2c5ed1f7832a5a7cc0206120b7f.