path: root/lib/PublicInbox/WwwText.pm
Date Commit message
2023-11-29 www: start working on a repo listing
The HTML is still extremely rough, but links seem to be mostly working...
2023-11-29 www: load and use cindex join data
This is a major step in solving the problem of having to manually associate hundreds/thousands of coderepos with hundreds/thousands of public-inboxes to power solver (and more).
2023-09-15 pop3: limit default mailbox to 1K messages
This is probably friendlier to webmail providers which support importing mail from POP3.
2023-01-05 www: make coderepo URL generation more consistent
WwwStream and WwwText basically show the same thing, except the latter relies on Linkify to create links.
2022-09-10 www_text: reduce parameter passing for response header
This is a tiny step in making the code slightly less confusing by reusing common field names and reducing dependencies on argument ordering.
2022-08-29 www: provide text/help/#search anchor
This allows jumping to the appropriate section of the "help" from under the dfblob textarea search.
2022-08-29 www: allow html_oneshot to take an array arg
Another step towards making our internal APIs more writev-like and reducing the copies needed for `join' or `.=' concatenation.
2022-08-20 www: use absolute URLs for coderepo URLs
Showing "../../foo.git" looks awkward and isn't conducive to users who want to "git clone" a URL.
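A minimal sketch of the idea, in Python rather than the project's Perl: a relative coderepo href is resolved against the page URL so the result can be pasted straight into "git clone". The function name and the example.com URLs are hypothetical and not taken from the public-inbox code.

```python
from urllib.parse import urljoin

def absolute_repo_url(page_url: str, repo_href: str) -> str:
    """Resolve a possibly-relative coderepo href against the page URL."""
    # urljoin leaves hrefs that are already absolute untouched
    return urljoin(page_url, repo_href)

# Hypothetical inbox page and relative coderepo link
print(absolute_repo_url("https://example.com/inbox/foo/", "../../foo.git"))
```

An href that is already absolute passes through unchanged, so the same helper can be applied to every configured coderepo URL.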
2022-08-11 www_text: fix #nntp anchor for there's a single NNTP server
We use "Newsgroup" (singular) when there's only one NNTP server address configured.
2022-08-10 www_text: add AUTH=ANONYMOUS to IMAP URLs
While the ';' requires escaping on the command-line, the presence of ";AUTH=ANONYMOUS" communicates clearly that anonymous access is supported in accordance with RFC 4505.
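To illustrate the escaping concern, here is a small Python sketch (not the project's Perl) that builds an IMAP URL carrying ";AUTH=ANONYMOUS" and quotes it for a POSIX shell. The helper name, host, and mailbox are hypothetical, and the exact URL layout the help text emits is an assumption here.

```python
import shlex

def anonymous_imap_url(host: str, mailbox: str) -> str:
    """Build an IMAP URL that advertises anonymous access (assumed layout)."""
    return f"imap://;AUTH=ANONYMOUS@{host}/{mailbox}"

# Hypothetical server and mailbox names
url = anonymous_imap_url("mail.example.com", "inbox.test.0")

# An unquoted ';' would end the shell command early, so quote the URL
print(shlex.quote(url))
```

Because ';' is a shell metacharacter, shlex.quote wraps the whole URL in single quotes.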
2022-08-10 www_text: clarify the password+username is for POP3
NNTP and IMAP can also exist in the same area, so clarify that the username + password is only for POP3.
2022-08-10 www_text: add #nntp, #pop3, and #imap anchors to help HTML
This will make it easier to link to these sections in 3rd-party documentation.
2022-08-03 www: simplify GzipFilter->zflush callers
->zflush can take a buffer arg, so there's no need to make a separate call to ->translate in some cases.
2022-07-30 doc|www: flesh out POP3 documentation for servers and users
Hopefully it makes sense to new users deploying or using POP3...
2021-10-26 www: mirror: fix rendering of NNTP URLs
As of commit 738c4a65, the code for reporting NNTP information in _/text/mirror/ incorrectly uses ->imap_url rather than ->nntp_url. Fixes: 738c4a65719e6278 ("www: various help text updates")
2021-10-15 www: various help text updates
`dt:' documentation is redundant with `d:' approxidate support; so drop `dt:' since mairix uses `d:'. We'll also document `rt:' since there are legit messages from senders with broken clocks. Reduce indentation level of help texts to be in 2-space increments to avoid using too much horizontal space. We'll always place IMAP ahead of NNTP since it's alphabetical and there's likely more IMAP clients out there. Add "--ng NEWSGROUP" to -init instructions if configured. There are also some minor wording changes throughout.
2021-10-12 www: _/text/config/raw Last-Modified: is mm->created_at
This allows IMAP mirrors to keep UIDVALIDITY synchronized (and "LIST ACTIVE.TIMES" in NNTP). "lei add-external --mirror" will automatically set it, as will the combination of public-inbox-clone + public-inbox-index. This avoids the need for extra endpoints or config entries, at least...
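As a rough sketch in Python (not the project's Perl), a creation timestamp can be carried in a Last-Modified header and recovered by a mirror. Both function names are hypothetical, and how mirrors actually store the value is not shown here.

```python
from email.utils import formatdate, parsedate_to_datetime

def last_modified_header(created_at: int) -> str:
    """Format an epoch timestamp as an HTTP-date (always GMT)."""
    return formatdate(created_at, usegmt=True)

def created_at_from_header(value: str) -> int:
    """Recover the epoch timestamp from a Last-Modified value."""
    return int(parsedate_to_datetime(value).timestamp())

header = last_modified_header(0)
print("Last-Modified:", header)
print(created_at_from_header(header))
```

The round trip is lossless at one-second resolution, which is all an HTTP-date can express.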
2021-09-16 www: support publicinbox.imapserver
This allows PublicInbox::WWW hosts to advertise the existence of IMAP servers in addition to NNTP servers.
2021-09-16 inbox: streamline ->nntp_url
We no longer waste a precious hash slot for a per-Inbox {nntpserver} if it's only configured globally for all inboxes.
2021-08-31 www_text/mirror: spell out "external index" and "public inbox"
"extindex" and "public-inbox" are project-specific terms which are probably unsuitable for folks who are seeing this for the first time. Use "public inbox" when referring to actual public inboxes, since "public-inbox" is merely the name for this particular implementation and others have adopted the same concept (IMHO the concept is more important than any particular implementation).
2021-08-30 www: move mirror instructions to /text/
This makes the mirroring and code retrieval instructions less obstructive. Relying on WwwText means we only use our Linkify module to make hrefs of full URLs; making relative and shortened hrefs off-limits; hopefully this isn't too much of a problem. coderepo information remains duplicated on every page since (IMHO) coderepos are an important feature; but nobody besides me has ever bothered to configure coderepos, so I suppose it's fine... Suggested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Link: https://public-inbox.org/meta/20210826132747.6gxuwnhftyf7c6hp@nitro.local/
2021-08-28 www_text: add coderepo config support for extindex
At least manually configured coderepos "just work" for extindex, though it probably could be automatic and inherited from the publicinbox configs.
2021-08-28 www_text: fix example config snippet for extindex
extindex doesn't use the same config stuff as normal "publicinbox" entries, so we'll need a separate function for them.
2021-08-28 get rid of unnecessary bytes::length usage
The only place where we could return wide characters with -httpd was the raw $INBOX_DIR/description text, which is now converted to octets. All daemon (HTTP/NNTP/IMAP) sockets are opened in binary mode, so length() and bytes::length() are equivalent on reads. For socket writes, any non-octet data would warn about wide characters and we are strict in warnings with test_httpd. All gzipped buffers are also octets, as is PublicInbox::Eml->body, and anything from PerlIO objects ("git cat-file --batch" output, filesystems), so bytes::length was unnecessary in all those places.
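The distinction can be shown with a Python analogy (the commit itself concerns Perl's length() and bytes::length()): character count and octet count differ only while data is still wide-character text, and agree once it has been encoded. The sample string is made up for illustration.

```python
# Hypothetical description text containing one non-ASCII character
text = "caf\u00e9 inbox"
octets = text.encode("utf-8")

# Characters vs. octets differ before encoding
print(len(text), len(octets))

def content_length(body: bytes) -> int:
    """Once a body is octets, a plain length is already the byte length."""
    return len(body)

print(content_length(octets))
```

This is why a separate byte-length call becomes redundant when every buffer reaching the socket is already octets.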
2021-03-17 config: lazy-load coderepos, support extindex
Extsearch objects are duck-types of Inbox objects, and are capable of supporting code repos all the same.
2021-02-04 www: call curl with -d '' in the altid instructions
Nginx doesn't appear to be happy with just -XPOST, so use -d '' to avoid potential confusion about why the instructions aren't working. cf. commit 533e1234bc03a1ca8754d249aa8c2ce157e26780 (lei_xsearch: use curl -d '' for nginx compatibility, 2021-01-24)
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-12-09 rename {pi_config} fields to {pi_cfg}
{pi_config} may be confused with the documented `PI_CONFIG' environment variable, and we'll favor vowel-removal to be consistent with our usage of object references. The `pi_' prefix may stay in some places, for now; since a separate namespace may come into this codebase for local/private client-tooling. For InboxIdle, we'll also remove an invalid comment about holding a reference to the PublicInbox::Config object, too.
2020-12-09 treewide: replace {-inbox} with {ibx} for consistency
{ibx} is shorter and is the most prevalent abbreviation in indexing and IMAP code, and the `$ibx' local variable is already prevalent throughout. In general, the codebase favors removal of vowels in variable and field names to denote non-references (because references are "lighter" than non-references). So update WWW and Filter users to use the same code since it reduces confusion and may allow easier code sharing.
2020-12-05 isearch: emulate per-inbox search with ->ALL
Using "eidx_key:" boolean prefix to limit results to a given inbox, we can use ->ALL to emulate and replace per-Inbox xap15/[0-9] search indices. With this change, the presence of "extindex.all.topdir" in the $PI_CONFIG will cause the WWW code to use that extindex and ignore per-inbox Xapian DBs in xap15/[0-9]. Unfortunately IMAP search still requires old per-inbox indices, for now. Mapping extindex Xapian docids to per-Inbox UIDs and vice-versa is proving tricky. Fortunately, IMAP search is rarely used and optional. The RFCs don't specify expensive phrase search, either, so `indexlevel=medium' can be used in per-inbox Xapian indices to save space. For primarily WWW (and future JMAP) users; this should result in significant disk space, FD, and page cache footprint savings for large instances with many inboxes and many cross-posted messages.
2020-09-16 wwwtext: link to public-inbox.org/meta archives
Since we're advertising our address at meta@public-inbox.org, we should advertise the archives, too.
2020-09-10 wwwtext: config comment improvements
Use the full URL of the inbox being mirrored to reduce ambiguity (instead of just the inbox name). Using asymmetric quotes (e.g `foo') improves readability for me in that it's more obvious when a quote begins and ends. It also lights up fewer pixels and reduces visual noise compared to double-quotes. We'll also reflow the `mainrepo' vs `inboxdir' comment slightly to emphasize the word `instead'.
2020-09-10 wwwtext: don't blindly quote "git clone" destination
Save screen space and light up fewer pixels to reduce visual noise.
2020-09-10 wwwtext: describe the use of `coderepo' entries
The `solver' feature is not very obvious, give potential users a hint about it.
2020-07-06 wwwtext: simplify gzf_maybe use
gzf_maybe always returns a GzipFilter object, even if it uses CompressNoop. We can also use ->zflush instead of ->translate(undef) here for the final bit.
2020-07-06 remove unused/redundant zlib-related imports
Z_FINISH is the default for Compress::Raw::Zlib::Deflate->flush, anyways, so there's no reason to import it. And none of C::R::Z is needed in WwwText now that gzf_maybe handles it all.
2020-07-06 wwwtext: switch to html_oneshot
No point in streaming a tiny response via ->getline, but we may stream to a gzipped buffer, later.
2020-07-06 wwwtext: gzip text/plain responses, as well
Most of our plain-text responses are config files big enough to warrant compression.
2020-04-20 watchmaildir: support multiple watchheader values
The watchheader key supports only a single value. Supporting multiple watchheader values was mentioned in discussion [1] of 8d3e3bd8 (doc: explain publicinbox.<name>.watchheader, 2019-10-09), and it wasn't clear if there was a need. One scenario in which matching multiple headers would be convenient is when someone wants to set up public-inbox archives for some small projects but does _not_ want to run mailing lists for them, instead allowing others to follow the project by any of the pull mechanisms. Using a common underlying address, an address alias for each project is configured via a third-party email provider, with messages for each alias being exposed as a separate public-inbox archive. In this setup, messages for an inbox cannot be selected by a List-ID header but can be identified by the inbox's address in either the To or Cc header. To support such a use case, update the watchheader handling to consider multiple values, accepting a message if it matches any value. While selecting a message based on matching _any_ rather than _all_ values is motivated by the above scenario, it's worth noting that the "any" behavior is consistent with how multiple listid config values are handled. [1] https://public-inbox.org/meta/20191010085118.r3amey4cayazfycb@dcvr/
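The "any" semantics described above can be sketched as follows, in Python rather than the project's Perl. The "Header:value" spec format and the substring comparison are assumptions made for illustration, and both function names and all addresses are hypothetical.

```python
from email import message_from_string

def parse_watchheader(spec: str):
    """Split an assumed 'Header:value' spec into (name, value)."""
    name, _, value = spec.partition(":")
    return name.strip(), value.strip()

def matches_any(msg, specs) -> bool:
    """Accept the message if ANY configured header value matches."""
    for spec in specs:
        name, value = parse_watchheader(spec)
        # A header may appear more than once, so check every instance
        for hdr in msg.get_all(name, []):
            if value in hdr:
                return True
    return False

# One alias address may show up in either To or Cc
specs = ["To:proj-a@example.com", "Cc:proj-a@example.com"]
msg = message_from_string(
    "From: dev@example.org\nCc: proj-a@example.com\nSubject: hi\n\nbody\n")
print(matches_any(msg, specs))
```

Requiring all values to match would reject this message, since the alias appears only in Cc; matching any value accepts it.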
2020-03-26 wwwtext: show altid instructions in config
Exposing altid dumps will help and ensure total reproducibility of existing instances. AFAIK, sqlite3(1) can't execute arbitrary code, so it's not quite as fashionable as the "curl | bash" stuff the cool people are doing, these days :P
2020-03-25 wwwtext: show thread endpoint w/ indexlevel=basic
And show contact info when there's no indexing, at all. Installations where Xapian is too expensive can still support threading since it only depends on SQLite, so we need to inform users of what's available.
2020-02-06 treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-02-01 config: assume multiple cgit URLs, too
Since we support inboxes with multiple URLs and multiple infourls to reduce reliance on SPOFs, we'll do the same with cgit URLs.
2020-02-01 wwwtext: give "url" examples in sample config
inbox.$NAME.url is a common parameter and set by public-inbox-init(1), so ensure we have lines for it and emphasize it can be multi-value for .onion hidden services or otherwise mirrored and available under multiple URLs.
2020-02-01 wwwtext: show multiple infourl values properly
This is now an array, so ensure it's shown properly in the sample config, instead of "ARRAY(0xI8BADBEEF)" or similar. Fixes: 1988d730c0088e8b "config: support multi-value inbox.*.*url"
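A sketch of the fix's general shape, in Python rather than the project's Perl: when a config value may be either a scalar or a list, emit one line per value instead of stringifying the container. The function name and URLs are hypothetical.

```python
def config_lines(key: str, value) -> list:
    """Render one git-config style line per value, scalar or list."""
    values = value if isinstance(value, (list, tuple)) else [value]
    return [f"\t{key} = {v}" for v in values]

# Hypothetical multi-value infourl setting
for line in config_lines("infourl",
                         ["https://a.example/", "https://b.example/"]):
    print(line)
```

git-config(1) represents a multi-value key as repeated lines with the same name, so this output remains valid config.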
2019-12-27 wwwtext: avoid anonymous sub in response
We can pass arbitrary local variables via WWW $ctx, so just pass that into the one-off _do_linkify sub which already exists.
2019-10-16 config: support "inboxdir" in addition to "mainrepo"
"mainrepo" was a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-10-15 wwwtext: show listid config directive(s)
We want to share this piece for potential mirror-ers just like watchheader.
2019-09-27 wwwtext: support $INBOX_URL/_/text/config/raw
This returns a git-config(1)-compatible file to make it easier to get started on mirroring an existing public-inbox. Omitting the "raw" from the URL works, as well, but I'm not sure if it's very useful.
2019-09-09 run update-copyrights from gnulib for 2019