path: root/lib/PublicInbox/WWW.pm
Date Commit message
2016-05-30 www: remove a few more Plack::Request dependencies
Still a work in progress, but SearchView no longer depends on Plack::Request at all and Feed is getting there. We now parse all query parameters up front, but we may do that lazily again in the future.
2016-05-30 www: remove gratuitous use of Plack::Request methods
Accessing $env directly is faster and we will eventually remove all Plack::Request dependencies.
2016-05-30 git-http-backend: remove dependency on Plack::Request
Plack::Request is unnecessary overhead for this given the strictness of git-http-backend. Furthermore, having to make commit 311c2adc8c63 ("avoid Plack::Request parsing body") to avoid tempfiles should not have been necessary.
2016-05-28 remove redundant NewsGroup class
Most of its functionality is in the PublicInbox::Inbox class. While we're at it, we no longer auto-create newsgroup names based on the inbox name, since newsgroup names probably deserve some thought when it comes to hierarchy.
2016-05-28 config: remove try_cat
It's moved into the Inbox module and we no longer use it in WWW.
2016-05-28 www: remove footer_html support
I haven't used it in a while and the existing "description" is probably good enough. If we support it again, it should be plain-text + auto-linkified for ease-of-maintenance and consistency.
2016-05-19 www: tighten up allowable filenames for attachments
Having a filename start with '.' or '-' can be confusing for users, so do not allow it.
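The rule described above can be sketched in a few lines. This is a minimal Python illustration, not the project's Perl code: the function name `is_safe_attachment_name` and the exact set of allowed characters are assumptions made for the example; only the ban on a leading '.' or '-' comes from the commit message.

```python
import re

# Assumption: the allowed character set is illustrative only. The one rule
# taken from the commit message is that a name must not start with '.' or '-'.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.\-]*")


def is_safe_attachment_name(name: str) -> bool:
    """Return True if `name` is acceptable as a download filename."""
    # fullmatch() anchors at both ends, so '/' or whitespace anywhere
    # in the name causes a rejection.
    return _SAFE_NAME.fullmatch(name) is not None
```

A leading '-' is worth rejecting because such a name can be mistaken for a command-line option once the file is saved, and a leading '.' hides the file in most directory listings.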
2016-05-19 www: validate and check filenames in URLs
We shall ensure links continue working for this.
2016-05-19 www: support downloading attachments
This can be useful for lists where the convention is to attach patches rather than inline them in the message body.
2016-05-17 http: release resources when idle
This lets us release old git processes so unlinked packs (leftover from repacking) can be released. This may also be helpful for Xapian as indices get rebuilt for tuning. For SQLite (msgmap), there may be no benefit besides reducing FD pressure. Followup changes will unify the Inbox and NewsGroup classes and allow better code-sharing between NNTP and HTTP classes (as well as the planned POP3 class).
2016-05-16 www: fix for running under mount paths
We try to avoid issues like these by using relative URLs in hrefs, but we can't avoid the problem with Location: for redirects and Atom feeds which are likely to be rehosted elsewhere. We also reorder some of the code to work around a weird issue on the psgi-plack mailing list: <20160516073750.GA11931@dcvr.yhbt.net> (Somewhere on https://groups.google.com/group/psgi-plack but it's probably not bookmarkable)
2016-05-16 declare Inbox object for reusability
From the beginning, we've avoided objects here in favor of faster startup time; but it may not be worth it since a persistent httpd/nntpd is faster and -mda isn't hit as often.
2016-05-15 mbox: support /$INBOX/all.mbox.gz endpoint
Allows easily downloading the entire archive without special tools. In any case, it's not yet advertised via HTML until we can test it better. It'll also support range queries in the future to avoid wasting bandwidth.
2016-05-14 rename most instances of "list" to "inbox"
A public-inbox is NOT necessarily a mailing list, but it could serve as an input point for zero, one, or infinite mailing lists :D
2016-04-15 www: redirect /$MESSAGE_ID/f/ endpoints
Quote-folding was a major design mistake pre-1.0. Since this project is still in its infancy and unlikely to be in wide use at the moment, redirect the /f/ endpoints back to the plain message.
2016-04-02 www: more explicit "git clone" usage
Little harm in having the entire command-line for users and avoiding the cognitive overhead of figuring out $URL.
2016-03-03 use raw header for Message-ID
Message-IDs should not be MIME encoded, but in case they are, use the raw form for compatibility with ssoma and possibly other tools. This prevents a potential problem where a malicious client could confuse our storage layer into indexing incorrect contents.
2016-02-29 fixup Plack-related requires
We do not need to load Plack::Request outside of WWW anymore.
2016-02-29 distinguish error messages intended for users vs developers
For error messages intended to show user error (e.g. giving invalid options), we add a newline ("\n") at the end to avoid polluting the output with location information. However, for diagnosing non-user-triggered errors, we should show the location where the error occurred.
2016-02-26 www: add News* wrappers to preload
We want to preload as much as possible in -httpd when forking to save memory via CoW.
2016-02-26 www: workaround for malformed NNTP links
Some linkifiers create invalid HTTP links when they see a link intended for NNTP services. This means we may see links to news.public-inbox.org/inbox.comp.mail.public-inbox.meta point to "http://" on port 80 instead of 119. Try to redirect users to http://public-inbox.org/meta/ in this case.
2016-02-25 hval: implement common UI for protocol-relative URLs
This allows users to avoid HTTPS -> HTTP downgrade warnings, but we will also avoid encouraging them towards HTTPS, for now. IMHO: the CA system gives a false sense of security, TLS libraries (e.g. OpenSSL) can introduce new bugs and problems (even to attack clients), and TLS libraries also eat memory on cheap servers.
2016-02-25 www: make interface more OO
This allows multiple instances of the WWW app to run within the same process space.
2016-02-25 remove direct CGI.pm support
Relying on Plack::Handler::CGI is much easier for long-term maintenance and development. Nowadays, we even include our own httpd implementation to facilitate easier deployment with PSGI/Plack.
2016-02-24 www: support $MESSAGE_ID/R/ endpoint for replies
Setting the "In-Reply-To:" header via mailto: links is not well-supported and should probably not be encouraged unless the client situation improves. So instead, teach users more widely-supported ways of setting the In-Reply-To: header to ensure proper threading of replies.
2016-02-13 www: document "git clone --mirror" usage
Not everybody is willing to install or run ssoma; but at least document "git clone --mirror" usage to promote data distribution. It's wasteful to clone without "--mirror", so we'll suggest that to avoid wasting disk space and inodes.
2016-02-13 www: advertise clone-ability over http/https
All public-inbox instances shall be clone-able.
2016-02-07 support smart HTTP cloning
This requires POST and (small file) upload support from the PSGI/Plack web server. CGI.pm is currently not supported with this feature. We'll serve everything git can handle by default for performance in the general case. To avoid introducing cognitive overhead for sysadmins managing existing HTTP backends, we do not introduce new configuration directives. Thus, setting http.uploadpack=false in the relevant git config file for each public-inbox (ssoma) git repo will disable smart HTTP for CPU/memory-constrained systems. Technically we could support http.receivepack to allow posting messages to a public-inbox over HTTP(S), but that breaks the public-inbox model of encouraging users to Cc: everyone. Again, we encourage users to Cc: everyone to reduce the chance of a public-inbox becoming a centralized point of failure/censorship.
2016-02-02 www: support git cloning via dumb HTTP
This is enabled by default, for now. Smart HTTP cloning support will be added later, but it will be optional since it can be highly CPU and memory intensive.
2016-01-09 www: fix redirection loops
Sometimes users forget trailing slashes; but we should not punish them with infinite loops.
2016-01-03 www: comments for denoting Plack::Request vs CGI
We'll probably want to continue supporting CGI for mod_perl compatibility.
2016-01-02 www: redirect with query string
We use query strings for search and index pages, so we should not drop them if somebody types a URL by hand and omits the trailing slash.
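The behavior described above might look roughly like the following. This is a Python sketch over a WSGI-style environment dictionary, not the project's PSGI code: the function name `redirect_location` is hypothetical, and returning a path-only Location value (rather than an absolute URL) is a simplification made for the example.

```python
def redirect_location(environ: dict) -> str:
    """Build a redirect target that adds the missing '/' and keeps the query."""
    # SCRIPT_NAME is the mount prefix; PATH_INFO is the path below it.
    path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
    if not path.endswith("/"):
        path += "/"
    # Re-attach the query string so search parameters survive the redirect.
    query = environ.get("QUERY_STRING", "")
    return path + ("?" + query if query else "")
```

With `PATH_INFO` set to `/meta` and `QUERY_STRING` set to `q=foo`, the function returns `/meta/?q=foo`; without a query string it returns `/meta/`.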
2015-12-22 rename 'GitCatFile' package to 'Git'
We'll be using it for more than just cat-file. Adding a `popen' API for internal use allows us to save a bunch of code in other places.
2015-12-22 config: hoist out common functions
These will be reused elsewhere.
2015-11-20 various internal documentation updates
Hopefully this gives new hackers a better overview of how the components relate to each other.
2015-10-04 mbox: generate Archived-At, List-Post, List-Archive headers
Downloaded mboxen can be archived/stored indefinitely; try to make it easy for future archaeologists to find the online archive location.
2015-09-12 searchview: support displaying entire threads
This hopefully makes it easy to perform queries to display an entire thread. Raise the limit in the threaded view to display more results and hopefully improve the output of thread display.
2015-09-06 update copyright headers and email addresses
In the future, it should be possible to use this:
  git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
    UPDATE_COPYRIGHT_USE_INTERVALS=2 \
    xargs /path/to/gnulib/build-aux/update-copyright
2015-09-05 extmsg: fall back to partial Message-ID matching
In case a URL gets truncated (as is common with long URLs), we can rely on Xapian for partial matches and bring the user to their destination.
2015-09-05 view: preliminary HTML search interface
This hopefully makes it easier to find things without resorting to proprietary external services.
2015-09-04 www: extra redirects for the '/'-challenged
Omitting a slash should not be fatal if unambiguous. Add fallbacks so users who expect a directory structure-like experience can have it at the cost of one extra HTTP request/response pair. This matches behavior of static sites.
2015-09-03 www: move fallback after legacy matches
We do not want to get legacy URLs swallowed up by our workaround for weird and wonky servers that attempt to unescape PATH_INFO before the app sees it.
2015-09-03 www: attempt to handle Message-IDs with slashes
Unfortunately, some HTTP servers will try to be clever with %2F and unescape it to '/', making life difficult for us. Fortunately, not many Message-IDs have slashes in them.
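The problem can be reproduced with the Python standard library alone. In this sketch, the helper name `mid_to_path_segment`, the sample Message-ID, and the choice to leave '@' unescaped are assumptions made for illustration; they are not taken from the project.

```python
from urllib.parse import quote, unquote


def mid_to_path_segment(message_id: str) -> str:
    """Percent-encode a Message-ID for use as a single URL path segment."""
    # '/' is deliberately left out of `safe`, so it becomes %2F.
    return quote(message_id, safe="@")


segment = mid_to_path_segment("a/b@example.com")
# A server that decodes the path before the app sees it turns one
# segment back into two, which is the ambiguity described above.
decoded_path = unquote("/inbox/" + segment + "/")
```

Here `segment` is `a%2Fb@example.com`, while `decoded_path` is `/inbox/a/b@example.com/`, which an application can no longer distinguish from a URL with an extra path component.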
2015-09-03 get rid of Message-ID compression entirely
Provide a fallback for legacy SHA-1 messages, but do not advertise shorter URLs anymore for data portability concerns. This fixes a regression introduced in commit 81a9c1b476987d845b340ab9013d26cf4487cb9a ("search: disable Message-ID compression in Xapian") which ended up breaking thread-related endpoints for large Message-IDs, as lookups on the SHA-1 message no longer worked.
2015-09-02 implement external Message-ID finder
Currently, this looks at other public-inbox configurations served in the same process. In the future, it will generate links to other Message-ID lookup endpoints.
2015-09-02 view: optional flat view for recent messages
For still-active threads, it will likely be easier to follow them chronologically, especially if we have links to parent messages.
2015-09-01 completely revamp URL structure to shorten permalinks
This allows common /m/ links to be used without a prefix, saving 2 precious bytes for permalinks and raw messages. Old URLs continue to redirect.
2015-09-01 www: root atom feed is "new.atom" and not "atom.xml"
The MIME type entry for Atom feed relies on "atom", so allow properly-configured static file servers to serve it with the correct Content-Type header.
2015-09-01 www: compile mbox regexp only once
No need for the 'x' modifier to span more lines, though.
2015-09-01 implement per-thread Atom feeds
This allows users to subscribe to only a single thread with their feed reader without subscribing to the rest of the list. Update our endpoint notes while we're at it.