path: root/lib/PublicInbox/WWW.pm
Date Commit message
2016-03-03 use raw header for Message-ID
Message-IDs should not be MIME encoded, but in case they are, use the raw form for compatibility with ssoma and possibly other tools. This prevents a potential problem where a malicious client could confuse our storage layer into indexing incorrect contents.
2016-02-29 fixup Plack-related requires
We do not need to load Plack::Request outside of WWW anymore.
2016-02-29 distinguish error messages intended for users vs developers
For error messages intended to show user error (e.g. giving invalid options), we add a newline ("\n") at the end to avoid polluting the output with location information. However, for diagnosing non-user-triggered errors, we should show the location where the error occurred.
2016-02-26 www: add News* wrappers to preload
We want to preload as much as possible in -httpd when forking to save memory via CoW.
2016-02-26 www: workaround for malformed NNTP links
Some linkifiers create invalid HTTP links when they see a link intended for NNTP services. This means we may see links to news.public-inbox.org/inbox.comp.mail.public-inbox.meta point to "http://" on port 80 instead of 119. Try to redirect users to http://public-inbox.org/meta/ in this case.
2016-02-25 hval: implement common UI for protocol-relative URLs
This allows users to avoid HTTPS -> HTTP downgrade warnings, but we will also avoid encouraging them towards HTTPS, for now. IMHO: the CA system gives a false sense of security, TLS libraries (e.g. OpenSSL) can introduce new bugs and problems (even to attack clients), and TLS libraries also eat memory on cheap servers.
2016-02-25 www: make interface more OO
This allows multiple instances of the WWW app to run within the same process space.
2016-02-25 remove direct CGI.pm support
Relying on Plack::Handler::CGI is much easier for long-term maintenance and development. Nowadays, we even include our own httpd implementation to facilitate easier deployment with PSGI/Plack.
2016-02-24 www: support $MESSAGE_ID/R/ endpoint for replies
Setting the "In-Reply-To:" header via mailto: links is not well-supported and should probably not be encouraged unless the client situation improves. So instead, teach users more widely-supported ways of setting the In-Reply-To: header to ensure proper threading of replies.
2016-02-13 www: document "git clone --mirror" usage
Not everybody is willing to install or run ssoma; but at least document "git clone --mirror" usage to promote data distribution. It's wasteful to clone without "--mirror", so we'll suggest that to avoid wasting disk space and inodes.
2016-02-13 www: advertise clone-ability over http/https
All public-inbox instances shall be clone-able.
2016-02-07 support smart HTTP cloning
This requires POST and (small file) upload support from the PSGI/Plack web server. CGI.pm is currently not supported with this feature. We'll serve everything git can handle by default for performance in the general case. To avoid introducing cognitive overhead for sysadmins managing existing HTTP backends, we do not introduce new configuration directives. Thus, setting http.uploadpack=false in the relevant git config file for each public-inbox (ssoma) git repo will disable smart HTTP for CPU/memory-constrained systems. Technically we could support http.receivepack to allow posting messages to a public-inbox over HTTP(S), but that breaks the public-inbox model of encouraging users to Cc: everyone. Again, we encourage users to Cc: everyone to reduce the chance of a public-inbox becoming a centralized point of failure/censorship.
2016-02-02 www: support git cloning via dumb HTTP
This is enabled by default, for now. Smart HTTP cloning support will be added later, but it will be optional since it can be highly CPU and memory intensive.
2016-01-09 www: fix redirection loops
Sometimes users forget trailing slashes; but we should not punish them with infinite loops.
2016-01-03 www: comments for denoting Plack::Request vs CGI
We'll probably want to continue supporting CGI for mod_perl compatibility.
2016-01-02 www: redirect with query string
We use query strings for search and index pages, so we should not drop them if somebody types a URL by hand and omits the trailing slash.
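A minimal sketch of the redirect described above, written in Python for illustration rather than in the project's Perl. The function name `redirect_target` is hypothetical and does not come from WWW.pm; it only shows appending the missing trailing slash while carrying the query string over, and returning nothing for URLs that already end in a slash so that no loop can occur.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url):
    """Return the URL to redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.path.endswith("/"):
        # Already has the trailing slash; redirecting again would loop.
        return None
    # Append the slash to the path only; the query string is preserved.
    return urlunsplit(parts._replace(path=parts.path + "/"))
```

With this sketch, a hand-typed URL such as `http://example.com/meta?q=foo` would be redirected to `http://example.com/meta/?q=foo` instead of losing the search terms.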
2015-12-22 rename 'GitCatFile' package to 'Git'
We'll be using it for more than just cat-file. Adding a `popen' API for internal use allows us to save a bunch of code in other places.
2015-12-22 config: hoist out common functions
These will be reused elsewhere.
2015-11-20 various internal documentation updates
Hopefully this gives new hackers a better overview of how the components relate to each other.
2015-10-04 mbox: generate Archived-At, List-Post, List-Archive headers
Downloaded mboxen can be archived/stored indefinitely, so try to make it easy for future archaeologists to find the online archive location.
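As an illustration of the three headers named above, here is a small Python sketch (the project itself is Perl). The function name, the URLs, and the address are placeholders, and the exact values public-inbox emits are not shown in this log; the angle-bracket forms follow the usual RFC 2369 and RFC 5064 conventions.

```python
from email.message import Message


def add_archive_headers(msg, message_url, archive_url, post_addr):
    """Attach headers that point back at the online archive (hypothetical helper)."""
    msg["Archived-At"] = "<%s>" % message_url      # permalink of this message
    msg["List-Archive"] = "<%s>" % archive_url     # root of the archive
    msg["List-Post"] = "<mailto:%s>" % post_addr   # where replies can be sent
    return msg
```

A reader who finds such an mbox years later can follow `Archived-At` to the original message without knowing where the archive was hosted.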
2015-09-12 searchview: support displaying entire threads
This hopefully makes it easy to perform queries to display an entire thread. Raise the limit in the threaded view to display more results and hopefully improve the output of thread display.
2015-09-06 update copyright headers and email addresses
In the future, it should be possible to use this:
  git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
    UPDATE_COPYRIGHT_USE_INTERVALS=2 \
    xargs /path/to/gnulib/build-aux/update-copyright
2015-09-05 extmsg: fall back to partial Message-ID matching
In case a URL gets truncated (as is common with long URLs), we can rely on Xapian for partial matches and bring the user to their destination.
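The fallback idea can be sketched as follows in Python. This is not how the commit does it: the real lookup relies on Xapian, whereas this sketch swaps in a plain prefix scan over an in-memory collection, and the name `find_by_partial_mid` is hypothetical.

```python
def find_by_partial_mid(known_mids, partial):
    """Return Message-IDs matching a possibly truncated Message-ID."""
    if partial in known_mids:
        return [partial]  # an exact match wins
    # Otherwise treat the input as a truncated prefix of the real ID.
    return sorted(m for m in known_mids if m.startswith(partial))
```

If exactly one candidate comes back, the user can be taken straight to it; several candidates would call for a disambiguation page.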
2015-09-05 view: preliminary HTML search interface
This hopefully makes it easier to find things without resorting to proprietary external services.
2015-09-04 www: extra redirects for the '/'-challenged
Omitting a slash should not be fatal if unambiguous. Add fallbacks so users who expect a directory structure-like experience can have it at the cost of one extra HTTP request/response pair. This matches behavior of static sites.
2015-09-03 www: move fallback after legacy matches
We do not want to get legacy URLs swallowed up by our workaround for weird and wonky servers that attempt to unescape PATH_INFO before the app sees it.
2015-09-03 www: attempt to handle Message-IDs with slashes
Unfortunately, some HTTP servers will try to be clever with %2F and unescape it to '/', making life difficult for us. Fortunately, not many Message-IDs have slashes in them.
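To see why the slash matters, here is a short Python sketch (illustrative only, not the project's Perl code). The helper name `escape_mid` is hypothetical; it percent-encodes every reserved character so that a Message-ID occupies a single path segment.

```python
from urllib.parse import quote, unquote


def escape_mid(mid):
    """Percent-encode a Message-ID, including any '/' it contains."""
    return quote(mid, safe="")


escaped = escape_mid("a/b@example.com")
# As sent by the client, the ID is one path segment.
segments_as_sent = escaped.split("/")
# A server that decodes the path too early sees two segments instead.
segments_after_early_decode = unquote(escaped).split("/")
```

The application then has to guess that the extra segment belongs to the Message-ID, which is the difficulty the commit works around.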
2015-09-03 get rid of Message-ID compression entirely
Provide a fallback for legacy SHA-1 messages, but do not advertise shorter URLs anymore for data portability concerns. This fixes a regression introduced in commit 81a9c1b476987d845b340ab9013d26cf4487cb9a ("search: disable Message-ID compression in Xapian") which ended up breaking thread-related endpoints for large Message-IDs, as lookups on the SHA-1 message no longer worked.
2015-09-02 implement external Message-ID finder
Currently, this looks at other public-inbox configurations served in the same process. In the future, it will generate links to other Message-ID lookup endpoints.
2015-09-02 view: optional flat view for recent messages
For still-active threads, it will likely be easier to follow them chronologically, especially if we have links to parent messages.
2015-09-01 completely revamp URL structure to shorten permalinks
This allows common /m/ links to be used without a prefix, saving 2 precious bytes for permalinks and raw messages. Old URLs continue to redirect.
2015-09-01 www: root atom feed is "new.atom" and not "atom.xml"
The MIME type entry for Atom feed relies on "atom", so allow properly-configured static file servers to serve it with the correct Content-Type header.
2015-09-01 www: compile mbox regexp only once
No need for 'x' modifier to span more lines, though
2015-09-01 implement per-thread Atom feeds
This allows users to subscribe to only a single thread with their feed reader without subscribing to the rest of the list. Update our endpoint notes while we're at it.
2015-08-30 www: avoid BEGIN block for config loading
It fails the syntax check if a user does not have ~/.public-inbox/config set up. Anyways we can safely use ||= on a global since we do not support threads.
2015-08-29 avoid length in boolean context
Perl does not currently optimize for this. ref (from p5p): http://mid.gmane.org/D5C27970-9176-4C7A-8B99-7D78360E67A2@pobox.com
2015-08-27 implement legacy redirects for old URLs
We should not break existing URLs. Redirect them to the newer, less-ambiguous URLs to improve cache hit ratios.
2015-08-27 wire up to display non-suffixed Message-ID links
These URLs are preferable in case somebody decides to get cute and use a suffix we would've used to prevent others from linking to their message. The common /m/$MESSAGE_ID/ URLs are now 4 characters shorter so may fit better on terminals.
2015-08-27 wire up shorter, less ambiguous URLs
We will prefer URLs without suffixes for now to avoid ambiguity in case a Message-ID ends with ".html", ".txt", ".mbox.gz" or any other suffix we may use. Static file compatibility is preserved by using a trailing slash as most servers can/will fall back to an index.html file in this case. For raw text files, we will follow gmane's lead with "/raw"
2015-08-27 www: minor cleanups to shorten code
Less scrolling is more efficient.
2015-08-27 www: reduce unused arguments in internal API
Less code is easier-to-manage, although we make a few extra hash insertions now.
2015-08-24 view: refactor $state as a hash
Using hash means we no longer have to document and remember what every field does. The original array form was insane premature optimization and crazy. Who wrote that? Oh wait, I was on drugs :<
2015-08-23 .txt links return an mbox instead
This improves compatibility and allows individual messages to be concatenated into an existing mbox without further modifications. "git format-patch" does something similar (but does not do "From " line escaping(!))
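The "From " line escaping mentioned above can be sketched like this in Python. The log does not say which mbox variant is used, so this assumes the mboxrd convention, where any body line that starts with optional '>' characters followed by "From " gains one more '>'. The function name is hypothetical.

```python
import re

# Body lines that would be mistaken for an mbox message separator.
_FROM_LINE = re.compile(rb"^(>*From )", re.MULTILINE)


def escape_from_lines(body):
    """Prefix matching body lines with '>' (mboxrd-style, assumed here)."""
    return _FROM_LINE.sub(rb">\1", body)
```

Without this step, a message whose body contains a line beginning with "From " could be split in two by an mbox reader once it is concatenated into a larger file.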
2015-08-22 mbox: support uncompressed mbox
Some folks may want to view the mbox inline as a string of raw text, when guessing URLs. Let them do this...
2015-08-22 remove XML::Atom::SimpleFeed dependency
We will attempt to generate Atom feeds "by hand" as the XML::Atom::SimpleFeed API does not support streaming output. Since email is large and servers are small, this should prevent wasting memory when we generate larger feeds. Of course, we hope clients use SAX parsers capable of handling large streams without slurping.
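A rough Python sketch of generating a feed "by hand" in a streaming fashion (the project does this in Perl; none of the names below come from it). Each piece is yielded as soon as it is ready, so the whole feed never sits in memory at once. For brevity the sketch omits elements a complete Atom feed requires, such as `updated` and the feed-level `id`.

```python
from xml.sax.saxutils import escape


def atom_feed(title, entries):
    """Yield an Atom document piece by piece; entries is (title, id) pairs."""
    yield '<?xml version="1.0" encoding="utf-8"?>\n'
    yield '<feed xmlns="http://www.w3.org/2005/Atom">\n'
    yield "<title>%s</title>\n" % escape(title)
    for entry_title, entry_id in entries:
        # Only one entry is held in memory at a time.
        yield "<entry><title>%s</title><id>%s</id></entry>\n" % (
            escape(entry_title), escape(entry_id))
    yield "</feed>\n"
```

Because `entries` can itself be a generator reading messages one at a time, memory use stays roughly constant regardless of feed size.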
2015-08-22 www: enable and expand preload from mod_perl2
Hopefully this saves us some memory with CoW on *nix.
2015-08-22 stream HTML views as much as possible
This should allow progressive rendering on the client and reduce memory usage on the server. Unfortunately XML::Atom::SimpleFeed does not yet support streaming, so we may not use it in the future.
2015-08-21 switch to gzipped mboxes
Mboxes may be huge, so only support downloading gzipped mboxes to save bandwidth and to get free checksumming. Streaming output means we should not be wasting too much memory on this unless the chosen server sucks.
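Streaming gzip output can be sketched in Python as below (illustrative only; the project's implementation is Perl and is not shown in this log). The generator name is hypothetical. Compressed bytes are yielded as they become available, and the gzip trailer written on flush carries the CRC-32 that provides the checksumming mentioned above.

```python
import zlib


def gzip_stream(chunks):
    """Compress an iterable of byte chunks into a gzip stream, lazily."""
    # Adding 16 to the window bits selects the gzip container format.
    gz = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                          zlib.MAX_WBITS | 16)
    for chunk in chunks:
        out = gz.compress(chunk)
        if out:  # the compressor may buffer and return nothing yet
            yield out
    yield gz.flush()  # remaining data plus the CRC-32 and length trailer
```

Feeding it one message at a time keeps memory bounded by the compressor's internal buffers rather than by the size of the mbox.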
2015-08-21 support dumping thread as an mbox
Some folks may not want to download and install Perl code like ssoma, so allow downloading an mbox containing the entire thread.
2015-08-20 dead code cleanup
We may not be using subject_path after all.