public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2015-08-22	INSTALL: document IO::Compress::Gzip dependency
	Otherwise folks won't get downloadable mboxes
2015-08-22	cgi: remove static file generation support for now
	We may not support this after all, CGI.pm is already legacy-enough and far more powerful.
2015-08-22	stream HTML views as much as possible
	This should allow progressive rendering on the client and reduce memory usage on the server. Unfortunately XML::Atom::SimpleFeed does not yet support streaming, so we may not use it in the future.
2015-08-21	search: s/count/total/ for results
	This is hopefully less ambiguous, as the word "count" confused me, too.
2015-08-21	mbox: drop unnecessary imports
	These are no longer necessary.
2015-08-21	switch to gzipped mboxes
	Mboxes may be huge, so only support downloading gzipped mboxes to save bandwidth and to get free checksumming. Streaming output means we should not be wasting too much memory on this unless the chosen server sucks.
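The streaming idea can be illustrated with a minimal Python sketch. This is not the project's Perl code; the function name `gzip_stream` and the chunked input are assumptions made for illustration. Only one chunk is compressed at a time, so memory use stays bounded regardless of mbox size, and the gzip trailer carries a CRC32 checksum of the payload.

```python
import zlib

def gzip_stream(chunks):
    """Yield gzip-compressed bytes for an iterable of byte chunks.

    Hypothetical helper: compresses incrementally instead of
    buffering the whole mbox in memory.
    """
    # wbits=31 selects the gzip container (header plus CRC32 trailer)
    compressor = zlib.compressobj(wbits=31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer small inputs and emit nothing yet
            yield out
    # flush whatever is still buffered and write the gzip trailer
    yield compressor.flush()
```

A server would write each yielded piece to the client as soon as it is produced, for example `for piece in gzip_stream(messages): write(piece)`, where `write` stands in for whatever output callback the server provides.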
2015-08-21	mbox: stream entire thread, regardless of size
	Since mbox is usually downloaded, support fetching infinitely large responses via streaming.
2015-08-21	support dumping thread as an mbox
	Some folks may not want to download and install Perl code like ssoma, so allow downloading an mbox containing the entire thread.
2015-08-21	view: "next" link in thread view goes to next Subject line
	It's a bit disconcerting to jump to the authorship line.
2015-08-21	view: cleanup and reduce duplication
	This also avoids incorrectly incrementing $part_nr when we skip a part due to bad Content-Type.
2015-08-20	feed: fix extra, unnecessary quote
	Oops!
2015-08-20	search: preserve References: order in document data
	We need proper ordering of References to thread messages correctly. We would lose this order if we load the terms from the database, so set it directly in the document data. Do not bother with a separate In-Reply-To, since Mail::Thread just merges the IRT into References. This bumps our schema version once again.
2015-08-20	avoid using header_raw for Message-ID retrieval
	This is for consistency with ssoma. I doubt it makes a difference in practice, but in case somebody decides any of the Message-ID-containing headers should have strange characters, we'll decode and attempt to thread them. This isn't an attack vector, just a way to make messages thread improperly which is pointless...
2015-08-20	view: simplify message threading dumpers

2015-08-20	dead code cleanup
	We may not be using subject_path after all.
2015-08-20	www: remove useless no-op assignment statement
	Oops
2015-08-20	misc documentation updates
	Threading in Xapian is mostly supported by now, so start documenting things.
2015-08-20	replace references to lynx
	Table rendering in lynx is crap compared to w3m and links. However, we still use it for filtering HTML since the renderer is otherwise nice...
2015-08-20	search: index_sync allows specifying alternate HEAD
	This should allow us to sync the index to a temporary head to update the Xapian index before we update the real HEAD index.
2015-08-20	view: do not fold top-level messages in thread
	This hopefully reduces clicking. We may drop folding entirely since we can use Xapian to make searching easier.
2015-08-20	index: layout fix + title and Atom feed links at top
	Add some spacing between topics to improve readability when scanning or in case a subject gets too long. The title and Atom feed may not be highly-visible otherwise. While we're at it, use the proper "Atom feed" terminology since some folks may not understand just what "atom" means.
2015-08-20	search: bump schema version to 5 for subject_path
	In "index: simplify main landing page if search-enabled", subject normalization went a little farther to drop trailing '.' characters, so we will need to re-index.
2015-08-20	view: reduce memory usage when displaying large threads
	We want to minimize the time any large objects or strings are referenced. We can do threading entirely from the mini_mime-generated messages and lazily load full messages when rendering the display.
2015-08-20	search: reject ghosts in all cases
	We do not need ghost messages in any of our thread views
2015-08-20	search: avoid needless decode
	Email::MIME should handle everything for us and make things work nicely with Xapian (assuming I understand how encoding works in Perl). While we're at it, reduce temporary strings and arrays by using destructive operations and clobbering parts as we iterate through them.
2015-08-20	index: simplify main landing page if search-enabled
	We can display /t/$MESSAGE_ID.html easily with a Xapian search index, so rely on it instead of trying to display messages inline.
2015-08-20	view: avoid nesting <a> tags from auto-linkification
	It is wrong HTML to have <a> tags nested due to auto-linkification.
2015-08-20	use tables for rendering comment nesting
	This is more space efficient since we don't need to place padding bytes in front of every line. Unfortunately this does not render well on lynx, but w3m, links, and elinks can all render tables sanely. Tables are also superior for long lines which require wrapping inside <pre> containers.
2015-08-20	feed: move timestamp parsing to view
	We don't need to duplicate logic across both files.
2015-08-20	feed: remove threading from index
	We'll be making the index smarter for people with search support enabled. Otherwise, it'll be chronological and a bit dumb. At least it'll use less memory.
2015-08-19	www: redirect /f/$MESSAGE_ID.txt links to /m/$MESSAGE_ID.txt
	Some people (e.g. myself :p) may try to guess URLs and hit a 404. Redirect to the /m/ version. Note: we prefer to redirect to canonical URLs to improve caching.
2015-08-19	view: return empty string to avoid undefined values
	Sometimes we have filter bugs and let HTML slip through...
2015-08-19	view: fix spacing on missing ghosts
	We must not prematurely indent if we have no message header to display.
2015-08-18	view: close anchor tag correctly before starting another
	Noticed by tidy
2015-08-18	public-inbox-index: exit with usage if not given an arg
	I often forget how to use this myself :x
2015-08-18	thread: another workaround for a Mail::Thread bug
	Yay for monkey patching!
	ref: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=795913
	ref: https://rt.cpan.org/Ticket/Display.html?id=106498
2015-08-18	search: bump SCHEMA_VERSION to 4
	The following two commits affect indexing behavior, so change the schema version to avoid compatibility problems or missing messages:
	  search: common Subject: normalization for Re: prefixes
	  search: avoid creating ghosts for circular References
2015-08-18	search: expose $PublicInbox::Search::LANG variable
	This makes it easier to reconfigure for non-English users
2015-08-18	search: common Subject: normalization for Re: prefixes
	Drop German ("Aw:") support since it is non-standard and not supported by Mail::Thread. Non-English prefixes are also more likely to conflict with the ("subsection:") prefixes that are common in Free Software development, where English is the common language. Anyway, we don't filter "Vs: " (Finnish) or "Sv: " (Norwegian, Swedish, Danish, Icelandic), either. ref: https://en.wikipedia.org/wiki/RE_(e-mail)#Abbreviations_in_other_languages
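As a rough illustration of English-only prefix stripping, here is a minimal Python sketch. The regular expression and the name `normalize_subject` are assumptions for this example, not the pattern the project uses. It removes any run of leading "Re:" prefixes and leaves everything else, including "Aw:" and "subsection:" style prefixes, untouched.

```python
import re

# Hypothetical pattern: one or more leading "Re:" prefixes, any case,
# with optional whitespace around the colon.
_RE_PREFIX = re.compile(r'^\s*(?:re\s*:\s*)+', re.IGNORECASE)

def normalize_subject(subject):
    """Strip leading "Re:" prefixes; leave all other prefixes alone."""
    return _RE_PREFIX.sub('', subject).strip()
```

With this sketch, `normalize_subject("Re: RE: search: fix")` gives `"search: fix"`, while `normalize_subject("Aw: hello")` is returned unchanged.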
2015-08-18	search: avoid creating ghosts for circular References
	Some mail software incorrectly creates circular references and causes us to create ghosts before the actual mail doc is created.
2015-08-18	view: cleaner Message-ID filtering for References
	Avoid compiling a weird and potentially fragile regexp every time and use the same logic as our search module to dedupe References.
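A minimal Python sketch of order-preserving deduplication follows. The pattern and the name `unique_references` are illustrative assumptions and may differ from the search module's logic. The regular expression is compiled once at import time rather than on every call.

```python
import re

# Compiled once; matches the text between angle brackets.
# Hypothetical pattern, for illustration only.
_MID = re.compile(r'<([^<>\s]+)>')

def unique_references(header):
    """Return Message-IDs from a References: value, first occurrence first."""
    seen = set()
    ordered = []
    for mid in _MID.findall(header):
        if mid not in seen:
            seen.add(mid)
            ordered.append(mid)
    return ordered
```

Keeping the first occurrence of each Message-ID preserves the ancestor order that threading depends on.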
2015-08-17	view: do not recompress already-compressed MID for anchors
	This is merely for display, so on the off chance somebody does send a 40-byte MID with nothing but hexadecimal characters, the worst that could happen is we repeat an anchor name in the rendered HTML. This has no impact on git archival or Xapian indexing.
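The behaviour described here can be sketched in Python, assuming the compressed form is a SHA-1 hex digest (an assumption based on the "40-byte ... hexadecimal" remark, not confirmed by this log). The name `anchor_id` is hypothetical.

```python
import hashlib
import re

# A 40-character lowercase hex string looks like an already-compressed MID.
_HEX40 = re.compile(r'[0-9a-f]{40}')

def anchor_id(mid):
    """Return an anchor-safe id for a Message-ID (hypothetical helper)."""
    if _HEX40.fullmatch(mid):
        # Assume it is already compressed; do not hash it a second time.
        return mid
    # Assumption: SHA-1 hex digest as the compressed form.
    return hashlib.sha1(mid.encode('utf-8')).hexdigest()
```

The function is idempotent, so `anchor_id(anchor_id(mid))` equals `anchor_id(mid)`. The trade-off is the one the commit message names: a genuine Message-ID made of exactly 40 hex characters would be passed through unchanged.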
2015-08-17	search: simplify indexing operation
	There's no need to make a transaction for each message when doing incremental indexing against a git repository. While we're at it, simplify the interface for callers, too, and do not auto-create the Xapian database if it was not explicitly enabled.
2015-08-17	public-inbox-{learn,mda}: automatically sync index
	We'll ignore errors, for now, but should eventually warn or log. And yes, this is a dirty, dirty hack but we'll fix this ASAP tomorrow.
2015-08-17	view: always compress Message-IDs for anchors
	Valid URLs do not make valid anchor ids.
2015-08-17	search: bump schema version for '%' compression change
	commit 0fea7793b22efd2596983283947ee43687e0cfac ("mid: compress Message-IDs with '%' in them") requires re-indexing of repositories with '%' in Message-IDs :<
2015-08-17	mid: compress Message-IDs with '%' in them
	Some HTTP servers (apache2 2.2.22-13+deb7u5) on my system apparently do not handle "%25" correctly. I'm not yet sure if it's something weird with my rewrite rules or what....
2015-08-17	search: apply mid_compression to subject paths, too
	Otherwise we'll be wasting space in our index for long subjects.
2015-08-17	drop bodies and messages ASAP after processing
	We can rely on reference counting to lower memory usage for big messages.
2015-08-17	feed: disable the generator statement
	No need to waste bandwidth, here