public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2016-11-29	note the source code is AGPL for cloning
	This should be adequate warning for folks who may be uncomfortable or uncertain about even possessing AGPL source code due to employer agreements and such. Disclaimer: I remain completely in favor of AGPL and strong copyleft, and am more than willing to risk my own future on it. However, I refuse to even nudge people into downloading AGPL source code if it presents any legal risk to them.
2016-11-26	avoid IO::File for anonymous temporary files
	We do not need to import IO::File into the main programs since Perl 5.8+ supports literal "undef" for generating anonymous temporary file handles.
2016-11-26	githttpbackend: error checking for input handling
	This was sloppy code, all calls need to be checked for failure.
2016-11-22	view: fix spaces in mailto: link
	Some mail clients do not seem to handle '+' as a space in query parameters for the mail subject, use the more common '%20' for compatibility.
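The difference between the two encodings can be illustrated with a minimal Python sketch (not the project's Perl code). The helper name `mailto_link` and the example address are assumptions made for illustration only.

```python
from urllib.parse import quote, quote_plus


def mailto_link(address: str, subject: str) -> str:
    """Build a mailto: URL whose subject encodes spaces as %20, never '+'.

    Hypothetical helper for illustration; not part of public-inbox.
    """
    # quote() percent-encodes spaces as %20; safe='' also encodes '/'.
    return f"mailto:{quote(address, safe='@')}?subject={quote(subject, safe='')}"


if __name__ == "__main__":
    subject = "Re: hello world"
    # Form-style encoding, which some mail clients do not decode:
    print(quote_plus(subject))
    # Percent-encoding, which is more widely understood in mailto: links:
    print(mailto_link("list@example.com", subject))
```

The sketch uses `quote` rather than `quote_plus` because only the latter turns spaces into '+'.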
2016-11-18	index: allow indexing before configuration
	One may build the initial index on a powerful host and transfer it to a weaker one for incremental indexing. Thus there is no requirement to have a configured public-inbox for building the index unless a user needs altid support or some such.
2016-10-16	import: failed GC runs are non-fatal
	We should not completely kill a process if "git gc --auto" errors out due to a warning or whatnot.
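The idea of treating a maintenance command's failure as a warning rather than a fatal error could look roughly like the following Python sketch. The function name `run_nonfatal` is hypothetical, and the real code is Perl; this only illustrates the control flow.

```python
import subprocess
import sys


def run_nonfatal(cmd):
    """Run cmd; on any failure, warn on stderr and return False instead of dying.

    Hypothetical helper for illustration; not the project's implementation.
    """
    try:
        result = subprocess.run(cmd, check=False)
    except OSError as exc:
        # The command could not be started at all (e.g. not installed).
        print(f"warning: could not run {cmd[0]}: {exc}", file=sys.stderr)
        return False
    if result.returncode != 0:
        print(f"warning: {' '.join(cmd)} exited with {result.returncode}",
              file=sys.stderr)
        return False
    return True


if __name__ == "__main__":
    # The import itself has already succeeded at this point; a failed
    # "git gc --auto" only produces a warning.
    run_nonfatal(["git", "gc", "--auto"])
```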
2016-10-14	thread: reinstates stable ordering when ghosts are present
	This reverts commit 3c9dd6619f825f0515e7e4afa1bd55c99c1a68d3 ("thread: fix sorting without topmost") and reinstates the "topmost" routine for sorting purposes.
2016-10-13	thread: fix parent/child relationships
	The ordering change in add_child is critical if $self == $parent as the {children} hash was lost before this change. has_descendent can be simplified by walking upwards from the child instead of downwards from the parent. This fixes threading regressions introduced in commit 30100c46326e2eac275e0af13116636701d2537e ("thread: use hash + array instead of hand-rolled linked list")
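Walking upwards from the child can be sketched as follows in Python. This is an illustrative model under assumed names (`Node`, `mid`, `parent`, `children`), not a transcription of the Perl code; it also shows why a self-referential add must be rejected before any links are changed.

```python
class Node:
    """Minimal thread container: one parent link, children keyed by Message-ID."""

    def __init__(self, mid):
        self.mid = mid
        self.parent = None
        self.children = {}

    def has_descendent(self, child):
        """Return True if child is self or lies anywhere below self.

        Follows parent links upwards from child instead of searching
        the whole subtree downwards from self.
        """
        seen = set()  # guards against a pre-existing cycle
        node = child
        while node is not None:
            if node is self:
                return True
            if id(node) in seen:
                break
            seen.add(id(node))
            node = node.parent
        return False

    def add_child(self, child):
        """Attach child under self; refuse if that would create a cycle."""
        # Covers child is self, and self already being below child.
        if child.has_descendent(self):
            return False
        # Detach from the old parent first, then link to the new one.
        if child.parent is not None:
            del child.parent.children[child.mid]
        self.children[child.mid] = child
        child.parent = self
        return True
```

Each upward walk costs at most the depth of the thread, whereas a downward search may visit every node in the subtree.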
2016-10-13	thread: reduce indentation level
	This should reduce differences from the original Mail::Thread code and hopefully make things easier-to-follow.
2016-10-05	thread: remove weaken dependency
	We have to walk through all the messages after threading anyways to build the rootset, so we can just delete all the parent references at that point.
2016-10-05	t/thread-cycle: test self-referential messages
	Some broken (or malicious) mailers may include a message's own generated Message-ID in its References header, so be prepared for it.
2016-10-05	view: remove redundant children array in thread views
	Each node has an entire arrayref of its children nowadays, so there's no need to waste time and memory creating another one.
2016-10-05	thread: use hash + array instead of hand-rolled linked list
	This starts to show noticeable performance improvements when attempting to thread over 400 messages; but the improvement may not be measurable with fewer. However, the resulting code is much shorter and (IMHO) much easier to understand.
2016-10-05	thread: fix sorting without topmost
	This bug was hidden, and we may not be able to efficiently implement a topmost subroutine with the hash-based (vs. linked-list) container for threading in the next commit.
2016-10-05	thread: inline and remove recurse_down logic
	We no longer recurse, and it's too hard to come up with a new name for a sub we will only use once.
2016-10-05	thread: order_children no longer cares about depth
	We never use the depth anywhere in this sub.
2016-10-05	thread: avoid incrementing undefined value
	It is pointless to increment when setting a true value is simpler as there is no need to read before writing.
2016-10-05	thread: remove iterate_down
	Unnecessary subs and complexity. This was hiding the fact that $before is never used.
2016-10-05	thread: simplify
	Single use subroutines actually make the code more complex in this case, and there's never a {seen} field in $self.
2016-10-05	thread: remove rootset accessor method
	It doesn't buy us much and copying to a new array is slower; but probably not measurable in real-world use.
2016-10-05	thread: remove Email::Abstract wrapping
	This roughly doubles performance due to the reduction in object creation and abstraction layers.
2016-10-05	inbox: deal with ghost smsg
	smsg will be undef for ghost messages in a subsequent commit
2016-10-05	thread: remove accessor usage in internals
	This improves top-level index generation performance by 3-4%.
2016-10-05	thread: pass array refs instead of entire arrays
	Copying large arrays is expensive, so avoid it. This reduces /$INBOX/ time by around 1%.
2016-10-05	thread: remove Mail::Thread dependency
	Introduce our own SearchThread class for threading messages. This should allow us to specialize and optimize away objects in future commits.
2016-10-05	view: remove "subject dummy" references
	We will not care for inexact threading by subject or pruning.
2016-09-13	help: document new search prefixes
	Support (and document) 'a:' after all, as "mairix -h" uses it, so this should reduce the learning curve for mairix users.
2016-09-09	nntp: cleanup: move use statements out of sub scope
	This clarifies the code somewhat, and we don't care to lazy-load in NNTP.pm anyways since this is only used for a long-lived daemon.
2016-09-09	TODO: updates for done items
	The existing string -> number date range Xapian query is good enough, and having too much flexibility is probably bad for caching (as well as increasing our attack surface, because parsing queries is tricky). Tags-as-skiplists are probably not worth the effort given Xapian, and we may have to import old messages after-the-fact, anyways, and message delivery for mirrors is never orderly. Other items are all done and need to be maintained (like the search engine docs for the mairix-compatibility features that just got pushed out)
2016-09-09	t/httpd-unix: warn about connection failure
	Output $! for diagnostic purposes since I've noticed this on two slow machines, today (and seemingly, never prior).
2016-09-09	search: index attachment filenames
	And while we're at it, ensure searching inside displayable attachment bodies works.
2016-09-09	search: match the behavior of WWW for indexing text
	The basic rule is that if it is displayable via our WWW interface, it should be indexable text for Xapian search.
2016-09-09	search: avoid mindlessly calling body_set
	It's not worth entering a complex codepath in Email::MIME to save some (probably immeasurable amount of) memory, here. We've already stopped doing this in our WWW code a while back, too. If we really cared enough about it, we'd prioritize work on a streaming replacement for Email::MIME.
2016-09-09	search: fix compatibility with Debian wheezy
	Specifying the "d:" field only worked for NumberValueRangeProcessor in older versions of Xapian, such as the one in Debian wheezy (libsearch-xapian-perl=1.2.10.0-1). This slipped through since I rarely use wheezy anymore, and perhaps nobody else does, either. Perhaps wheezy support may be dropped soon. Unfortunately, this requires a schema version bump.
2016-09-09	search: increase term positions for each quoted hunk
	We pay a storage cost for storing positional information in Xapian, make good use of it by attempting to preserve it for (hopefully) better search results.
2016-09-09	search: match quote detection behavior of view
	This is stricter than the mutt quote_regexp default ("^([ \t]*[\|>:}#])+" on Debian jessie), but matches what we have in View.pm. I prefer the stricter quote detection since it is less ambiguous and less likely to hide/obscure important details.
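To illustrate how a stricter rule changes what counts as quoted text, here is a small Python sketch. The mutt pattern is the one quoted above; the `STRICT_QUOTE` pattern is only an assumption for illustration (lines starting with '>'), since the log does not show the actual expression used in View.pm.

```python
import re

# mutt's quote_regexp default as cited above (Debian jessie).
MUTT_QUOTE = re.compile(r'^([ \t]*[\|>:}#])+')

# ASSUMPTION: a stricter rule accepting only a leading '>'.
# The real View.pm pattern is not shown in this log.
STRICT_QUOTE = re.compile(r'^>')


def is_quoted(line, pattern):
    """Return True if pattern treats line as quoted text."""
    return pattern.match(line) is not None


if __name__ == "__main__":
    for line in ["> quoted reply", "# shell comment", ": note", "plain text"]:
        print(repr(line),
              "mutt:", is_quoted(line, MUTT_QUOTE),
              "strict:", is_quoted(line, STRICT_QUOTE))
```

Under the looser pattern, a line such as a shell comment would be classified as quoted, which is the kind of ambiguity the stricter detection avoids.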
2016-09-09	search: fix space regressions from recent changes
	As of Xapian 1.0.4 (from 2007), it is possible to use Search::Xapian::QueryParser::add_prefix multiple times with the same user field name but different term prefixes. This brings my current git@vger mirror from 6.5GB to 2.1GB (both sizes are after xapian-compact).
2016-09-09	search: more granular message body searching
	"bs:" and "b:" are adapted from mairix(1). We will also support searching explicitly for quoted vs non-quoted text via "q:" and "nq:" prefixes since sometimes readers will not care for quoted text. In the future, we will support parsing diffs (perhaps when repobrowse integration is complete). Note: this roughly doubles the size of the Xapian database due to the additional information; so this change may not be worth it.
2016-09-09	search: drop longer subject: prefix for search
	We only document the "s:" anyways. While the long name is more descriptive, the ambiguity makes agnostic caching (by Varnish or similar) slightly harder and longer URLs are more likely to be accidentally truncated when shared.
2016-09-09	search: allow searching user fields (To/Cc/From)
	Sometimes it can be useful to search based on who the message was sent to, sent by, or Cc:-ed. Of course, headers can be faked, but they usually are not... Anyways this mostly matches the behavior of mairix(1).
2016-09-08	import: run "git gc --auto" when done
	We need to prevent excessive repository growth for public-inbox-watch and public-inbox-mda users.
2016-09-08	import: hoist out common run_die subroutine
	We will be reusing this in the next commit, too.
2016-09-08	doc: document PERL_INLINE_DIRECTORY usage
	For now, we will document this since it allows better performance without the burden of extensions. Perhaps one day far in the future Perl can natively support vfork(2) AND that version of Perl will be widely available, but I suspect that day is at least a decade away, if not two: https://rt.perl.org/Ticket/Display.html?id=128227
2016-09-08	import: hoist out _check_path function
	This reduces duplication, slightly. We may be using it yet again in a to-be-introduced function (or we may not introduce it).
2016-09-08	view: handle missing Content-Type in message
	Email::MIME internally assumes "text/plain" for messages missing a Content-Type, but does not expose that in the Email::MIME::content_type API method. We must assume it ourselves to avoid uninitialized value warnings for the rare (nowadays) MUAs which do not set it.
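The fallback can be modeled with a short Python sketch. This is not the Email::MIME API; the function name `content_type_or_default` is hypothetical, and it only shows the idea of substituting "text/plain" when the header is absent instead of handing an undefined value to callers.

```python
from email import message_from_string


def content_type_or_default(raw, default="text/plain"):
    """Return the message's media type, or default if no Content-Type is set.

    Hypothetical helper for illustration only.
    """
    msg = message_from_string(raw)
    header = msg.get("Content-Type")  # None when the header is missing
    if header is None:
        return default
    # Drop parameters such as "; charset=utf-8" and normalize case.
    return header.split(";", 1)[0].strip().lower()


if __name__ == "__main__":
    print(content_type_or_default("Subject: hi\n\nbody\n"))
    print(content_type_or_default(
        "Content-Type: text/html; charset=utf-8\n\n<p>hi</p>\n"))
```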
2016-09-07	doc: flesh out public-inbox-index documentation
	And include it into the build + website
2016-09-07	doc: new docs for user-level commands
	Hopefully more folks can download and run public-inbox, nowadays.
2016-09-02	config: use "publicinboxlimiter" prefix
	Just having "limiter" in the prefix may confuse it with something else. Use the full prefix to avoid this confusion.
2016-09-02	init: enable pack bitmaps by default
	We want to encourage users to serve repositories. So enable bitmaps by default so performance suffers less with smart HTTP.
2016-09-01	watch: use "publicinboxwatch" namespace
	We'll keep supporting "publicinboxlearn" indefinitely, but "publicinboxwatch" is probably more appropriate at the moment. Noticed while writing documentation.