public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message (Collapse)
2020-08-07	www: avoid warnings on YYYYMMDD-only t= query parameter
	While we always generate YYYYMMDDhhmmss query parameters ourselves, the regexps in paginate_recent allow YYYYMMDD-only (no hhmmss) timestamps, so don't trigger Time::Local::timegm warnings about empty numeric comparisons on empty strings when a client starts making up their own URLs.
2020-07-06	wwwatomstream: support async blob fetch
	This allows -httpd to handle other requests while waiting for git to retrieve and decode blobs. We'll also break apart t/psgi_v2.t further to ensure tests run against -httpd in addition to generic PSGI testing. Using xt/httpd-async-stream.t to test against clones of meta@public-inbox.org shows a 10-12% performance improvement with the following env: TEST_JOBS=1000 TEST_CURL_OPT=--compressed TEST_ENDPOINT=new.atom
2020-07-06	mbox: async blob fetch for "single message" raw mboxrd
	This restores gzip-by-default behavior for /$INBOX/$MSGID/raw endpoints for all indexed inboxes. Unindexed v1 inboxes will remain uncompressed, for now.
2020-05-09	remove most internal Email::MIME usage
	We no longer load or use Email::MIME outside of comparison tests.
2020-04-26	tests: remove Email::MIME->create use entirely
	Replace them with .eml files generated with the help of Email::MIME, but without some extraneous and unnecessary headers, and strip mime_load down to just loading files. This will give us more freedom to experiment with other mail libraries which may be more correct, better maintained, use less memory and/or be faster than Email::MIME.
2020-04-26	testcommon: introduce mime_load sub
	We'll use this to create, memoize, and reuse .eml files. This will be used to reduce (and eventually eliminate) our dependency on Email::MIME in tests.
2020-04-22	t/*.t: use Email::MIME->create over PublicInbox::MIME->create
	PublicInbox::MIME only supports ->new, and is only different from Email::MIME for old versions of Email::MIME. In the future, PublicInbox::MIME may not be a subclass of Email::MIME at all.
2020-04-19	reduce scope of mbox From_ line removal
	It's unnecessary overhead for anything which does Email::MIME parsing. It was never done for v2 indexing, even though v1->v2 conversions did NOT remove those From_ lines. There was never a need to remote From_ lines the v1 SearchIdx paths, either. Hitting a /$INBOX_URL/$MSGID/T/ endpoint with an 18 message thread reveals a ~0.5% speed improvement. This will become more apparent when we have a faster MIME parser.
2020-04-03	view: handle the topic-free case properly
	There may be no topics for a given timestamp range, so don't attempt to treat `undef' as an arrayref.
2020-02-07	tests: switch to XML::TreePP for testing Atom feeds
	XML::Feed pulls in a lot of dependencies, some of which XS. That makes testing with blead or any non-OS-supplied Perl installations more time consuming and more difficult because of the need to have development headers and libraries for libexpat1 or libxml2. Performance from libexpat1 or libxml2 for our small tests cases isn't relevant, either, and the pure Perl XML::TreePP seems up to the task. It's also available in CentOS 7.x, FreeBSD 11.x, and Debian, at least.
2020-02-06	treewide: run update-copyrights from gnulib for 2019
	I didn't wait until September to do it, this year!
2020-02-04	www: serve $INBOX_DIR/description as $INBOX_URL/description
	Instead of serving $INBOX_DIR/all.git/description, since $INBOX_DIR/all.git/description is not described in the default message when it's missing.
2020-01-11	make Plack optional for non-WWW and non-httpd users
	Some users just want to run -mda, -watch, and/or -nntpd. Let them run just those without forcing them to pull in a bunch of dependencies.
2019-12-25	t/psgi_v2: test search results Atom feed endpoint
	The "x=A" search results endpoint finally gets test coverage.
2019-12-24	search: support SWIG-generated Xapian.pm
	Xapian upstream is slowly phasing out the XS-based Search::Xapian in favor of the SWIG-generated "Xapian" package. While Debian and both FreeBSD have Search::Xapian, OpenBSD only includes the "Xapian" binding. More information about the status of the "Xapian" Perl module here: https://trac.xapian.org/ticket/523
2019-12-24	testcommon: add require_mods method and use it
	This cuts down on lines of code in individual test cases and fixes some misnamed error messages by using "$0" consistently. This will also provide us with a method of swapping out dependencies which provide equivalent functionality (e.g "Xapian" SWIG can replace "Search::Xapian" XS bindings).
2019-12-20	view: show percentage in search results thread skeleton
	The displays the Xapian ->get_percent value in the skeleton to improve scanning of relevancy; irrelevant results do not display that. This fixes broken #anchor links introduced in the previous commit, irrelevant messages now link to the /$INBOX/$MESSAGE_ID page.
2019-12-19	tests: move t/common.perl to PublicInbox::TestCommon
	We want to be able to use run_script with *.t files, so t/common.perl putting subs into the top-level "main" namespace won't work. Instead, make it a module which uses Exporter like other libraries.
2019-11-24	tests: use File::Temp->newdir instead of tempdir()
	We'll also introduce a tmpdir() API to give tempdirs consistent names.
2019-10-28	view: move '<' and '>' outside <a>
	Browsers may underline '<' and '>' in links, which may be confused with '≤' and '≥'. So have the Message-ID header display follow what we do with In-Reply-To headers and move the "<" and ">" outside of <a> in the HTML.
2019-10-16	config: support "inboxdir" in addition to "mainrepo"
	"mainrepo" ws a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-10-15	config: we always have {-section_order}
	Rewrite a bunch of tests to use ordered input (emulating "git config -l" output) so we can always walk sections in the order they were given in the config file.
2019-09-09	run update-copyrights from gnulib for 2019

2019-06-16	Merge remote-tracking branch 'origin/newspeak' into xcpdb
	* origin/newspeak: comments: replace "partition" with "shard" t/xcpdb-reshard: use 'shard' term in local variables xapcmd: favor 'shard' over 'part' in local variables search: use "shard" for local variable v2writable: use "epoch" consistently when referring to git repos adminedit: "part" => "shard" for local variables v2writable: rename local vars to match Xapian terminology v2writable: avoid "part" in internal subs and fields search*: rename {partition} => {shard} xapcmd: update comments referencing "partitions" v2: rename SearchIdxPart => SearchIdxShard inboxwritable: s/partitions/shards/ in local var tests: change messages to use "shard" instead of partition v2writable: rename {partitions} field to {shards} v2writable: count_partitions => count_shards searchidxpart: start using "shard" in user-visible places rename reference to git epochs as "partitions" admin\|xapcmd: user-facing messages say "shard" v2writable: update comments regarding xcpdb --reshard doc: rename our Xapian "partitions" to "shards"
2019-06-15	searchview: add link at bottom to reverse results
	I could not find a place to put the link the top without making navigation too cluttered. Putting it at the bottom of the page seems reasonable...
2019-06-14	tests: change messages to use "shard" instead of partition
	Another potentially user-facing piece made consistent with Xapian terminology.
2019-06-09	www: support $INBOX/git/$EPOCH.git for v2 cloning
	And use it in manifest.js. To ease maintaining mirrors with grokmirror(1), we can accept a "git/" directory prefix before the epoch, and ".git" suffix after the epoch number. We maintain compatibility with "$INBOX/$EPOCH" cloning, of course, and it's still easier-to-type on the command-line.
2019-01-10	check git version requirements
	This allows v1 tests to continue working on git 1.8.0 for now. This allows git 2.1.4 packaged with Debian 8 ("jessie") to run old tests, at least. I suppose it's safe to drop Debian 7 ("wheezy") due to our dependency on git 1.8.0 for "merge-base --is-ancestor". Writing V2 repositories requires git 2.6 for "get-mark" support, so mask out tests for older gits.
2018-04-18	search: preserve References in Xapian smsg for x=t view
	I'm not sure how useful this view is, but it exists for now.
2018-04-06	www: favor reading more from SQLite, and less from Xapian
	Favor simpler internal APIs this time around, this cuts a fair amount of code out and takes another step towards removing Xapian as a dependency for v2 repos.
2018-04-04	searchidx: ensure duplicated Message-IDs can be linked together
	This allows us to emulate the display of thread-aware MUAs when multiple messages share the same Message-ID. This also is a place where "public-inbox-index --reindex" is useful to fix existing messages and no schema version bump is necessary.
2018-04-03	mbox: remove remaining OFFSET usage in SQLite
	We can use id_batch in the common case to speed up full mbox retrievals. Gigantic msets are still a problem, but will be fixed in future commits.
2018-03-29	www: fix attachment downloads for conflicted Message-IDs
	By using the "primary" Message-ID in WwwAttach, we can avoid conflicts in the links we use for downloading attachments.
2018-03-29	v2writable: append, instead of prepending generated Message-ID
	The original Message-ID is still the most important when discussing with other recipients who do not rely on a message flowing through public-inbox. So whatever Message-ID we use to deduplicate internally will be secondary and less important. All of our front-end v2 code is order-independent, so we won't let the message count against us, that way.
2018-03-27	www: support cloning individual v2 git partitions
	This will require multiple client invocations, but should reduce load on the server and make it easier for readers to only clone the latest data. Unfortunately, supporting a cloneurl file for externally-hosted repos will be more difficult as we cannot easily know if the clones use v1 or v2 repositories, or how many git partitions they have.
2018-03-27	view: depend on SearchMsg for Message-ID
	Since we need to handle messages with multiple and duplicate Message-ID headers, our thread skeleton display must account for that. Since we have a "preferred" Message-ID in case of conflicts, use it as the UUID in an Atom feed so readers do not get confused by conflicts.
2018-03-27	view: permalink (per-message) view shows multiple messages
	This needs tests and further refinement, but current tests pass.
2018-03-23	feed: fix new.html for v2
	I forget this endpoint is still accessible (even if not linked). This also simplifies new.html all around and removes some unused clutter from the old days while we're at it.
2018-03-23	t/psgi_v2: minimal test for Atom feed and t.mbox.gz
	Some test coverage is better than none, here.
2018-03-23	search: reopen DB if each_smsg_by_mid fails
	This gives more-up-to-date data in case and allows us to avoid reopening in more places ourselves.
2018-03-23	www: $MESSAGE_ID/raw endpoint supports "duplicates"
	Since v2 supports duplicate messages, we need to support looking up different messages with the same Message-Id. Fortunately, our "raw" endpoint has always been mboxrd, so users won't need to change their parsing tools.