public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2019-06-09	www: wire up /$INBOX/manifest.js.gz, too
	I can imagine myself just wanting to clone a single v2 inbox and all its epochs without thinking about include/exclude rules in a grokmirror config file.
2019-06-09	wwwlisting: generate grokmirror-compatible manifest.js.gz
	Support on-demand generation of "/manifest.js.gz" for inboxes. By default, this matches inboxes with URLs matching the request hostname. This makes it easier to create full mirrors of several inboxes without needing to configure static file serving. cf. https://git.kernel.org/pub/scm/utils/grokmirror/grokmirror.git
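	As a rough illustration of the on-demand generation described above, the following Python sketch builds a gzip-compressed JSON manifest in memory. The field names ("owner", "description", "reference", "modified") and the repository path are assumptions made for this example, not taken from the commit; consult the grokmirror documentation for the authoritative format.

```python
import gzip
import json


def build_manifest(repos):
    """Return gzip-compressed JSON mapping each repo path to its metadata.

    `repos` maps a repository path to a dict with at least "modified"
    (a Unix timestamp). The key names are assumptions for illustration.
    """
    manifest = {}
    for path, meta in repos.items():
        manifest[path] = {
            "owner": meta.get("owner"),
            "description": meta.get("description", ""),
            "reference": None,
            "modified": int(meta["modified"]),
        }
    raw = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return gzip.compress(raw)


# Hypothetical single-epoch inbox; the path and timestamp are made up.
blob = build_manifest(
    {"/meta/git/0.git": {"description": "example inbox", "modified": 1560038400}}
)
decoded = json.loads(gzip.decompress(blob))
```

	Generating the payload per request, as sketched here, avoids having to keep a static file in sync with the set of inboxes.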
2019-06-04	www: require ASCII word characters for CSS filenames
	Allowing admins to set non-ASCII CSS filenames could cause unnecessary problems for clients and proxies.
2019-06-04	www: require ASCII range for mbox downloads
	We do not support many mboxrd download range specifications at the moment; but parsing non-ASCII characters isn't planned. This makes no difference aside from being able to return 404 slightly earlier than we would've in the past.
2019-06-04	www: require ASCII digit for git epoch
	Don't inadvertently serve git repos containing non-ASCII digit characters.
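	The ASCII-only checks in this series guard against character classes that silently match Unicode. The Python sketch below shows the same pitfall in a different language: without the ASCII flag, "\d" also matches non-ASCII digits. The "N.git" epoch naming pattern used here is an assumption for illustration.

```python
import re

# Without re.ASCII, \d matches any Unicode decimal digit.
UNICODE_EPOCH = re.compile(r"\d+\.git")
# With re.ASCII, \d is restricted to [0-9].
ASCII_EPOCH = re.compile(r"\d+\.git", re.ASCII)


def is_ascii_epoch(name):
    """Return True only if `name` looks like "<ASCII digits>.git"."""
    return ASCII_EPOCH.fullmatch(name) is not None


# U+0663 is ARABIC-INDIC DIGIT THREE, a non-ASCII decimal digit.
suspicious = "\u0663.git"
```

	Here is_ascii_epoch("0.git") is accepted while is_ascii_epoch(suspicious) is rejected, even though the permissive pattern matches both.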
2019-06-04	www: require ASCII filenames in git blob downloads
	Our Hval::to_filename sub has always been strict about emitting ASCII-only characters for ViewVCS "raw" links. However, somebody could manually generate a filename with non-ASCII words for somebody else to download (we have no cheap and fast way of mapping filenames back to blobs for validation).
2019-06-04	www: only emit ASCII chars in attachment filenames
	We don't want to emit funky URLs which can be lost in translation or cause problems with non-Unicode-aware clients. Then, don't accept non-ASCII filenames in URLs, since a manually-generated URL/filename in attachment downloads could be used for Unicode homographs to confuse folks who download the attachment.
2019-05-21	Merge remote-tracking branch 'origin/xap-optional' into master
	* origin/xap-optional:
	  admin: improve warnings and errors for missing modules
	  searchidx: do not create empty Xapian partitions for basic
	  lazy load Xapian and make it optional for v2
	  www: use Inbox->over where appropriate
	  nntp: use Inbox->over directly
	  inbox: add ->over method to ease access
2019-05-16	www: unescape '+' => ' ' before general URI unescape
	This allows searching for terms with "+" in them properly.
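	The ordering matters because a literal plus sign arrives percent-encoded as "%2B". A minimal Python sketch of the idea (the function name is hypothetical):

```python
from urllib.parse import unquote


def unescape_query_value(raw):
    """Decode a query-string value: '+' becomes a space, then %XX is decoded.

    Converting '+' first means a literal plus, sent as %2B, is only
    turned back into '+' by the second step and therefore survives.
    """
    return unquote(raw.replace("+", " "))
```

	For example, unescape_query_value("c%2B%2B+templates") yields "c++ templates". Doing the steps in the opposite order would turn the decoded plus signs into spaces. Python's urllib.parse.unquote_plus applies the same ordering.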
2019-05-15	lazy load Xapian and make it optional for v2
	More tests work without Search::Xapian now. Usability issues still need to be fixed.
2019-05-15	www: use Inbox->over where appropriate
	We don't need to rely on Xapian search functionality for the majority of the WWW code, even. subject_normalized is moved to SearchMsg, where it (probably) makes more sense, anyways.
2019-04-19	www: support listing of inboxes
	We will still return a 404 by default to '/' for compatibility with users of Plack::App::Cascade or similar. Inboxes are sorted by modification times to help users detect activity (similar to the /$INBOX/ topic view). New configuration options:
	* publicinbox.wwwlisting - configure the listing type
	* publicinbox.<name>.hide - hide a particular inbox from the listing
	See changes to public-inbox-config.pod for full descriptions of the new options.
	Requested-by: Leah Neukirchen <leah@vuxu.org>
	https://public-inbox.org/meta/871sdfzy80.fsf@gmail.com/
2019-04-19	start depending on Perl 5.10.1+
	I mainly want to start using the '//' (defined-or) operator to simplify code, and Perl 5.10.1 is roughly a decade old at this point. "given/when" would've been nice, but its future is in doubt AFAIK. I also started using the 'parent' module in WwwHighlight, and 'autodie' in UserContent.pm, both of which were only distributed with Perl since 5.10.1; and testing with ancient versions/distros is time-consuming. Anyways, I think this is a small-enough jump to not break any existing installations, given we already depend on fairly recent versions of git and Xapian. Maybe we can use more newish Perl features in the future...
2019-04-16	cleanup: use '$ibx' consistently when referring to Inbox refs
	'$inbox' is more human-readable, so it is reserved for the human-readable inbox name in most cases. Making our variable naming more consistent should make the code easier to review and harder to screw up.
2019-04-16	www: remove unnecessary Git object reference
	We access the Git object via the Inbox object nowadays, so there's no point in having a shortcut to it, anymore.
2019-04-04	www: fix missing cgit fallback after legacy redirects
	We need to instate our cgit handler everywhere we use NewsWWW to catch wildcard requests which our normal endpoints do not handle.
2019-04-04	www: wire up cgit as a 404 handler if cgitrc is configured
	Requests intended for cgit are unlikely to conflict with requests to inboxes. So we can safely hand those requests off to cgit.cgi.
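	The fallback described here can be pictured as a cascade: try the inbox handler first and hand the request to the next handler only when it answers 404. The sketch below uses Python's WSGI conventions rather than PSGI, and the function name is hypothetical; it illustrates the pattern, not the project's implementation.

```python
def cascade(*apps):
    """Build a WSGI app that tries each app in order until one is not a 404.

    The last app is always the final fallback and answers directly.
    """
    def app(environ, start_response):
        for candidate in apps[:-1]:
            captured = {}

            def capture(status, headers, exc_info=None):
                # Record the response line instead of sending it yet.
                captured["status"] = status
                captured["headers"] = headers

            body = candidate(environ, capture)
            if not captured["status"].startswith("404"):
                start_response(captured["status"], captured["headers"])
                return body
            # Discard the 404 response and fall through to the next app.
            if hasattr(body, "close"):
                body.close()
        return apps[-1](environ, start_response)

    return app
```

	Because inbox and cgit URLs are unlikely to overlap, the order of the handlers rarely changes the outcome; it mainly decides which handler pays the cost of the first lookup.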
2019-02-23	www: prevent '!important' in BOFH-specified CSS
	CSS specified by the BOFH must never take precedence over what a user sets in userContent.css.
2019-02-13	ensure bytes::length is available to callers
	We were relying on Danga::Socket using the "bytes" pragma, previously. Nowadays, the "bytes" pragma is not recommended in general, but bytes::length remains acceptable for getting the byte-size of a scalar.
2019-01-20	$INBOX/_/text/color/ and sample user-side CSS
	Since we now support more CSS classes for coloring, give this feature more visibility.
2019-01-20	www: admin-configurable CSS via "publicinbox.css"
	Maybe we'll default to a dark theme to promote energy savings... See contrib/css/README for details
2019-01-20	view: enforce trailing slash for /$INBOX/$OID/s/ endpoints
	As with our use of the trailing slash in $MESSAGE_ID/T/ and '$MESSAGE_ID/t/' endpoints, this is for 'wget -r --mirror' compatibility as well as allowing sysadmins to quickly stand up a static directory with "index.html" in it to reduce load.
2019-01-19	view: wire up diff and vcs viewers with solver

2019-01-15	config: inbox name checking matches git.git more closely
	Actually, it turns out git.git/remote.c::valid_remote_nick rules alone are insufficient. More checking is performed as part of the refname in the git.git/refs.c::check_refname_component. I also considered rejecting URL-unfriendly inbox names entirely, but realized some users may intentionally configure names not handled by our WWW endpoint for archives they don't want accessible over HTTP.
2018-06-26	www: use undecoded paths for Message-ID extraction
	In PSGI, PATH_INFO contains URI-decoded paths which cause problems when Message-IDs contain ambiguous characters used for routing. Instead, extract the undecoded path from REQUEST_URI and use that.
	Reported-by: Leah Neukirchen <leah@vuxu.org>
	https://public-inbox.org/meta/8736xsb5s5.fsf@vuxu.org/
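	To see why the decoded path is ambiguous, consider a Message-ID containing an escaped slash. The Python sketch below splits the still-encoded path first and decodes only the extracted component; the function name and the "/$INBOX/$MESSAGE_ID/..." layout are simplifications assumed for this example.

```python
from urllib.parse import unquote


def extract_message_id(request_uri, inbox):
    """Return the decoded Message-ID from an undecoded request URI, or None."""
    path = request_uri.split("?", 1)[0]  # drop the query string
    prefix = "/" + inbox + "/"
    if not path.startswith(prefix):
        return None
    # Split on '/' while the path is still encoded, so an escaped
    # slash (%2F) inside the Message-ID is not mistaken for a separator.
    first = path[len(prefix):].split("/", 1)[0]
    return unquote(first) if first else None


# Decoding before splitting loses the distinction between the two slashes.
decoded_first = unquote("/meta/a%2Fb@example.com/raw").split("/")
```

	With the raw URI, extract_message_id("/meta/a%2Fb@example.com/raw", "meta") recovers "a/b@example.com", whereas decoded_first breaks the Message-ID into two path components.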
2018-03-29	www: cleanup expensive fallback for legacy URLs
	Back in the day, we compressed long Message-IDs to SHA-1 hexdigests for the URL. This now redirects with a 301 in the hopes we can remove these checks some day to reduce overhead.
2018-03-27	www: support cloning individual v2 git partitions
	This will require multiple client invocations, but should reduce load on the server and make it easier for readers to only clone the latest data. Unfortunately, supporting a cloneurl file for externally-hosted repos will be more difficult as we cannot easily know if the clones use v1 or v2 repositories, or how many git partitions they have.
2018-03-27	view: depend on SearchMsg for Message-ID
	Since we need to handle messages with multiple and duplicate Message-ID headers, our thread skeleton display must account for that. Since we have a "preferred" Message-ID in case of conflicts, use it as the UUID in an Atom feed so readers do not get confused by conflicts.
2018-03-27	www: get rid of unnecessary 'inbox' name reference
	We use the actual Inbox object everywhere else and don't need the name of the inbox separated from the object.
2018-03-27	view: permalink (per-message) view shows multiple messages
	This needs tests and further refinement, but current tests pass.
2018-03-23	www: $MESSAGE_ID/raw endpoint supports "duplicates"
	Since v2 supports duplicate messages, we need to support looking up different messages with the same Message-Id. Fortunately, our "raw" endpoint has always been mboxrd, so users won't need to change their parsing tools.
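	For readers unfamiliar with mboxrd: messages are concatenated in one stream, and any body line that starts with zero or more ">" followed by "From " gets one extra ">" so it cannot be confused with a message separator. This is why several messages sharing a Message-ID can be returned in one response. A minimal Python sketch of the quoting rule (function names are hypothetical):

```python
import re

# A line consisting of optional '>' characters followed by "From ".
_FROM_LINE = re.compile(rb"^(>*From )", re.MULTILINE)
_QUOTED_FROM_LINE = re.compile(rb"^>(>*From )", re.MULTILINE)


def mboxrd_escape(body):
    """Prefix every matching line with one extra '>' (reversible quoting)."""
    return _FROM_LINE.sub(rb">\1", body)


def mboxrd_unescape(body):
    """Strip exactly one leading '>' from every quoted From line."""
    return _QUOTED_FROM_LINE.sub(rb"\1", body)
```

	Because the quoting always adds exactly one ">" and the reverse removes exactly one, already-quoted lines such as ">From " round-trip without loss.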
2018-02-16	www: stop assuming mainrepo == git_dir
	It won't be in v2
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib.
	While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-12-08	search: force large mbox result downloads to POST
	This should prevent crawlers (including most robots.txt ignoring ones) from burning our CPU time without severely compromising usability for humans.
2017-05-23	www: do not mangle characters from search queries
	Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> https://public-inbox.org/meta/CACBZZX5Gnow08r=0A1J_kt3a=zpGyMfvsqu8nAN7kacNnDm+dg@mail.gmail.com/
2017-05-09	www: avoid undefined warnings for query string parsing
	Sometimes bots generate malformed queries with sequential "&" and ";" characters.
2017-02-14	www: do not unescape PATH_INFO twice
	PSGI specs already require PATH_INFO to be unescaped; so our tests were wrong, too.
2017-01-10	introduce PublicInbox::MIME wrapper class
	This should fix problems with multipart messages where text/plain parts lack a header. cf. git clone --mirror https://github.com/rjbs/Email-MIME.git refs/pull/28/head
	In the future, we may still introduce a streaming interface to reduce memory usage on large emails.
2016-10-05	thread: remove Mail::Thread dependency
	Introduce our own SearchThread class for threading messages. This should allow us to specialize and optimize away objects in future commits.
2016-08-18	www: implement generic help text
	Begin documenting some basic help functionality. I may tweak the anchor names of the various HTML endpoints to be more consistent with each other (old ones will be supported for a short while), so I'm not documenting those, for now. This may become part of a builtin key-value store for basic texts, but this probably shouldn't become a wiki engine, either.
2016-08-14	www: do not unnecessarily escape some chars in paths
	Based on reading RFC 3986, it seems '@', ':', '!', '$', '&', "'", '(', ')', '*', '+', ',', ';', '=' are all allowed in path-absolute where we have the Message-ID. In any case, it seems '@' is fairly common in path components nowadays and too common in Message-IDs.
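	The character list above can be expressed directly as the "safe" set of a percent-encoder. A small Python sketch, with a hypothetical function name, that escapes a Message-ID for use as a single path component:

```python
from urllib.parse import quote

# Characters RFC 3986 permits unescaped in a path segment, beyond the
# unreserved set (letters, digits, '-', '.', '_', '~').
PATH_SAFE = "@:!$&'()*+,;="


def escape_message_id(mid):
    """Percent-encode a Message-ID, leaving path-legal characters alone.

    '/' is deliberately not in the safe set, so it is always escaped.
    """
    return quote(mid, safe=PATH_SAFE)
```

	A typical Message-ID such as "a+b@example.com" passes through unchanged, while a slash or space is still escaped ("a/b c@x" becomes "a%2Fb%20c@x").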
2016-08-09	www: avoid misinterpreting '&' and ';' in query parameters
	Oops, we must unescape each key=value pair in a QUERY_STRING individually; otherwise we cannot interpret '&' or ';' in query parameter values.
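	The fix amounts to splitting first and unescaping second. A minimal Python sketch of that order of operations (the function name is hypothetical, and repeated keys simply overwrite each other here):

```python
import re
from urllib.parse import unquote


def parse_query(query_string):
    """Parse a query string whose pairs are separated by '&' or ';'."""
    params = {}
    for pair in re.split(r"[&;]", query_string):
        if not pair:
            continue  # skip empty pairs from sequences such as "&&" or "&;"
        key, _, value = pair.partition("=")
        # Unescape each key and value only after splitting, so an
        # encoded '&' (%26) or ';' (%3B) stays inside its value.
        params[unquote(key.replace("+", " "))] = unquote(value.replace("+", " "))
    return params
```

	For instance, parse_query("q=a%26b&&;x=1%3B2") gives {"q": "a&b", "x": "1;2"}. Unescaping the whole string up front would have split "a&b" into two parameters.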
2016-07-09	www: cleanup parameter passing
	Reduce the size of hashes a bit and drop some unneeded hash lookups for uncommon paths.
2016-07-09	www: drop unused constants
	We no longer generate our footer, here. We are not currently advertising ssoma, here.
2016-07-07	www: remove old footer generation code and normalize new.html
	We now generate all of our HTML using WwwStream which forces us to have consistent headers and footers in the HTML itself. This also makes the search-capable vs search-less installs go to the new.html endpoint to maintain consistency (in case an admin decides to enable Xapian).
2016-07-07	inbox: cleanup and consolidate object weakening
	This fixes some layering violations and consolidates the cleanup into the inbox object itself. Keep in mind that weakening does not work at all without our PSGI server.
2016-07-02	www: remove Plack::Request dependency entirely
	Lighter and ever-so-slightly faster! Most importantly, this won't do non-obvious stuff behind our backs like trying to parse a POST request body for a query string param.
2016-07-02	www: use PSGI env directly
	More work on the Plack::Request/CGI.pm removal front. No need to access the PSGI env through an extra hash lookup.
2016-07-02	inbox: base_url method takes PSGI env hashref instead
	This is lighter and we can work further towards eliminating our Plack::Request dependency entirely.
2016-06-30	www_stream: add response wrapper sub
	This encapsulates an entire PSGI response array, hopefully making it easier to generate responses and avoid typos when setting the Content-Type.