path: root/examples
Date Commit message
2024-04-16doc: note MALLOC_MMAP_THRESHOLD_ as a potential workaround
Large string processing + concurrency + caching/memoization really brings out the worst in glibc malloc :<
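A minimal sketch of applying the workaround when launching a process, written in Python for illustration. The value 131072 (128 KiB, glibc's documented default for M_MMAP_THRESHOLD) is an assumption and not taken from the commit; setting the variable explicitly also turns off glibc's dynamic threshold adjustment. The child command is a stand-in that only echoes the variable back.

```python
import os
import subprocess
import sys


def env_with_mmap_threshold(threshold_bytes=131072, base_env=None):
    """Return a copy of the environment with MALLOC_MMAP_THRESHOLD_ set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MALLOC_MMAP_THRESHOLD_"] = str(threshold_bytes)
    return env


def show_child_setting(threshold_bytes=131072):
    """Start a stand-in child and report the value it inherited."""
    # A real deployment would exec the daemon here instead of this one-liner.
    code = "import os; print(os.environ['MALLOC_MMAP_THRESHOLD_'])"
    out = subprocess.run(
        [sys.executable, "-c", code],
        env=env_with_mmap_threshold(threshold_bytes),
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()
```

With systemd, the same effect would normally come from an Environment= line in the unit file; the right threshold depends on the workload.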
2024-03-16Fix some typos and language nits in docs and comments
2024-03-12doc: tuning: note reduced fragmentation w/ jemalloc
I may be mistaken, but I suspect jemalloc handles long-lived processes better than glibc because its granularity reduction is scaled to larger size classes. This can waste 20% of an individual allocation, but increases the likelihood of reuse (without splitting/consolidating into other sizes). In other words, glibc seems to try too hard to make the best fit for initial allocations. This ends up being suboptimal over time as those allocations are freed and similar (but not identical) allocations come in. jemalloc sacrifices the best initial fit for better fits over a long process lifetime.
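The trade-off can be sketched with a simplified size-class model in Python. This is not jemalloc's actual size-class table: the four classes per doubling and the 16-byte minimum are assumptions chosen to reproduce the roughly 20% worst-case waste mentioned above.

```python
def size_class(nbytes):
    """Round a request up to the next size class (simplified model)."""
    smallest = 16  # assumed minimum class
    if nbytes <= smallest:
        return smallest
    base = 1 << ((nbytes - 1).bit_length() - 1)  # largest power of two < nbytes
    step = base // 4                             # assumed: 4 classes per doubling
    return base + step * ((nbytes - base + step - 1) // step)


def wasted_fraction(nbytes):
    """Fraction of the rounded-up allocation that the request does not use."""
    cls = size_class(nbytes)
    return (cls - nbytes) / cls


# A freed 1100-byte block and a later 1200-byte request share one class,
# so the block can be reused without splitting or coalescing.
same_class = size_class(1100) == size_class(1200)
```

A best-fit allocator would instead carve out exactly 1100 bytes, and the later 1200-byte request could not reuse that hole.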
2024-01-17examples/unsubscribe-milter@.service: use KillMode=process
This can be a multi-process daemon, but systemd should only kill the top-level process. Also finish a comment about the User having access to the shared private key.
2023-11-25examples/unsubscribe.milter: limit scope of munging
We don't want the milter to munge List-Unsubscribe headers from external (incoming) mlmmj lists, only lists hosted on the server running unsubscribe.milter. Adding support for an allow_domains file should've been enough, but this further restricts the milter to only operating on Postfix connections from localhost.
2023-11-11doc: update README.unsubscribe
The whitelist was only used in the early days of its development and hasn't existed for a while. I've largely forgotten this thing exists since it's been working well...
2023-10-28examples/logrotate: only SIGUSR1 main process
There's no need to send SIGUSR1 to auxiliary processes since they don't know what to do with it.
2023-10-28examples/*.service: avoid `nobody' user on systemd
systemd complains about `User=nobody' since `nobody' has access to all files which can't be mapped to a valid UID. We'll also switch to `Group=ssl-cert' since that ought to be able to read TLS certificates.
2023-08-28Fix some typos/grammar/errors in docs and comments
2023-02-22examples: remove `Standard{Error,Output} = syslog' lines
systemd (247.3-7+deb11u1 on Debian 11.x) considers them "obsolete" and emits the following to my syslog: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether. So we'll remove it altogether, as I'm sticking with rsyslog for now.
2022-11-26examples/nginx_proxy: recommend `proxy_buffering off'
public-inbox-httpd has always been designed to handle slow clients efficiently via non-blocking sockets and epoll|kqueue. Thus the proxy buffering capabilities of nginx were a needless waste of memory and filesystem traffic, and increased response latency. nginx does provide an HTTPS-capable reverse proxy to talk to varnish; however, any other HTTPS-capable reverse proxy works, too.
2022-10-24treewide: replace /^I: / prefix with /^# /
This is likely more familiar to readers of TAP (Test Anything Protocol) output, as well as shell and Perl scripters, who also use `#' for comments. AFAIK, nobody is parsing our stderr, and I'm not sure how standardized the `I:' prefix is (nor the `W:' and `E:' prefixes). It's already the prevailing style in Lei* code, too, so things have been moving in that direction for a bit.
2022-08-11examples: add systemd files for -netd
It's important to show that a single systemd service and socket file can replace all other read-only daemons for ease-of-management.
2022-08-11examples: consolidate systemd socket examples
systemd.socket(5) files can actually contain multiple listen sockets, so shave down inode overhead and simplify config file management by consolidating all applicable ports into a single file for each daemon.
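From the daemon's side, several listeners in one socket unit simply arrive as consecutive file descriptors. A minimal sketch of the socket-activation handshake described in sd_listen_fds(3), in Python; this illustrates the protocol only and is not the project's Perl code.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def inherited_listen_fds(environ=None, pid=None):
    """Return the descriptors systemd passed to this process, if any."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # LISTEN_PID must name this process, otherwise the variables were
    # inherited from a parent and do not apply here.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

A socket unit with three Listen* lines would yield descriptors 3, 4 and 5, in the order they appear in the file.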
2022-08-11doc: drop ancient Apache and WEBrick examples
Having old, unmaintained docs for other HTTP servers is likely harmful at this point. public-inbox-httpd is specifically designed to handle git repos on slow storage and stream giant mbox.gz files fairly to slow clients.
2022-04-02examples/unsubscribe.milter: RFC 8058 (List-Unsubscribe=One-Click)
This allows unambiguous signaling to some MUAs and webmail clients that the List-Unsubscribe header contains an instantaneous unsubscribe option.
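A minimal sketch of the RFC 8058 signal, in Python rather than the milter's Perl. The List-Unsubscribe-Post value is the one defined by the RFC; the function name and the URL passed to it are hypothetical.

```python
from email.message import EmailMessage


def add_one_click_unsubscribe(msg, https_url):
    """Replace List-Unsubscribe and add the RFC 8058 one-click marker."""
    # Deleting a missing header is a no-op, so this works on any message.
    del msg["List-Unsubscribe"]
    del msg["List-Unsubscribe-Post"]
    msg["List-Unsubscribe"] = f"<{https_url}>"
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    return msg
```

A client that understands RFC 8058 can then send a POST to the HTTPS URL with the body `List-Unsubscribe=One-Click`, without showing a confirmation page.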
2022-04-02examples/unsubscribe.milter: use IO::Socket, again
Sendmail::PMilter requires an IO::Socket object, not a GLOB. Fixes: e901a56b3b30b22f (treewide: favor open(..., '+<&=', $fd), 2021-05-21)
2021-09-17search: fix rt: w/ approxidate when TZ != UTC
While git respects a user's local timezone and returns seconds-since-the-Epoch, we were unnecessarily and incorrectly calling gmtime+strftime on its result. So skip calling gmtime+strftime when the strftime format is "%s", and just feed the output time from git directly to Xapian. This is mainly for lei, which will likely run in a variety of timezones. While we're at it, add a recommendation to use TZ=UTC in public-inbox-httpd, in case there are (misguided :P) sysadmins who set a non-UTC TZ.
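The class of bug can be reproduced in Python, assuming (as with glibc) that a "%s" conversion reads the broken-down time as local time; `time.mktime` is used here as the equivalent step. The TZ value "EST5" (five hours west of UTC, no DST) is only an example, and `time.tzset` is Unix-only.

```python
import os
import time


def epoch_via_gmtime(epoch):
    """Break an epoch down as UTC, then re-read the result as local time."""
    return int(time.mktime(time.gmtime(epoch)))


def roundtrip_error(tz, epoch=1600000000):
    """Seconds by which the gmtime round trip shifts `epoch` under TZ=tz."""
    old = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()
    try:
        return epoch_via_gmtime(epoch) - epoch
    finally:
        # Restore the caller's timezone setting.
        if old is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = old
        time.tzset()
```

The round trip is harmless under UTC and off by the UTC offset elsewhere, which is why passing the epoch value through untouched is the safer choice.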
2021-05-23treewide: favor open(..., '+<&=', $fd)
Cut down on unnecessary imports of IO::Handle and method lookup + dispatch overhead.
2021-03-19examples: cgit-commit-filter: drop <tt> HTML tag, use title=
<tt> doesn't seem necessary and it's deprecated in HTML, nowadays. In any case, dillo's CSS support seems to show it as fixed-width even without <tt>. Use the title= attribute to highlight that it goes to the mail thread, too. In the future, we'll probably link to something like "lei p2q" (patch-to-query) to include OIDs in the search.
2021-03-13examples/varnish-4: http => httpd
Our HTTP daemon is `public-inbox-httpd', not `public-inbox-http'.
2021-02-27examples/cgit-commit-filter: improve quoted text handling
With an example such as:

    something before "quoted phrase" something after

Xapian will now see:

    [ "something before", "quoted phrase", "something after" ]

whereas before it would see:

    [ "something before", "quoted", "phrase", "something after" ]

which should improve search result accuracy when looking up commits by commit title (subject).
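One way to get this splitting, sketched in Python. The actual filter is written in Lua and may differ; the function name and the regular expression are illustrative assumptions.

```python
import re

# Either a double-quoted run (captured without the quotes) or a run
# of characters containing no double quote.
_PHRASE_RE = re.compile(r'"([^"]*)"|([^"]+)')


def split_subject(subject):
    """Split a commit subject into phrases, keeping quoted text together."""
    phrases = []
    for m in _PHRASE_RE.finditer(subject):
        text = m.group(1) if m.group(1) is not None else m.group(2)
        text = text.strip()
        if text:  # skip empty quotes and whitespace-only gaps
            phrases.append(text)
    return phrases
```

An unbalanced quote is skipped by the regular expression, so the remaining text still becomes a phrase.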
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-12-09rename {pi_config} fields to {pi_cfg}
{pi_config} may be confused with the documented `PI_CONFIG' environment variable, and we'll favor vowel-removal to be consistent with our usage of object references. The `pi_' prefix may stay in some places, for now; since a separate namespace may come into this codebase for local/private client-tooling. For InboxIdle, we'll also remove an invalid comment about holding a reference to the PublicInbox::Config object, too.
2020-08-26grok-pull.post_update_hook: flock(2) before SQLite check
Unlike DBD::SQLite, the sqlite3(1) CLI does not have a default busy timeout enabled, so it easily times out while acquiring a SHARED lock for read-only queries. We can avoid battery-wasting polling from the SQLite timeout handler by relying on flock(2) as we do in our Perl code. Furthermore, this avoids triggering some locking problems[1] from a long "SELECT COUNT(*) ..." query and reindex. While there may be other SQLite-related parallelism issues[1], this works around one of them by relying on flock(2). [1] https://public-inbox.org/meta/20200825001204.GA840@dcvr/
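The locking order can be sketched in Python. The lock file path, table name, and the choice of an exclusive lock are assumptions made for illustration; the real hook is a shell script using flock(1) and sqlite3(1).

```python
import fcntl
import os
import sqlite3
import tempfile


def count_rows_locked(lock_path, db_path, table):
    """Take flock(2) on lock_path, then run a read-only COUNT(*) query."""
    with open(lock_path, "a") as lock:
        # Blocks inside the kernel until the lock is free, so there is no
        # sleep-and-retry polling as with SQLite's busy handler.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            con = sqlite3.connect(db_path)
            try:
                row = con.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
                return row[0]
            finally:
                con.close()
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

This only helps if every writer takes the same lock before touching the database.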
2020-08-25examples: add imapd systemd examples
We've got examples for all the other daemons, too!
2020-08-14grok-pull.post_update_hook: favor --sequential-shard for HDD
--sequential-shard offers better performance on HDD than -j0 since the on-disk active set can be kept small (with -j $HIGH_NUM). --batch-size can also be helpful for systems with much RAM.
2020-07-29examples/grok-pull.post_update_hook: fix description URL
I finally noticed descriptions weren't showing up in my mirrors :x
2020-07-17doc: add some recommendations around slow HDDs
grok-pull is still painful with serialization on an old USB 2.0 HDD, but at least it can finish with flock(1) and disabling parallelization. While parallel "git fetch" doesn't seem so bad, slow seeks are exacerbated by parallel reads in Xapian. That means some updates can take days instead of hours. The same updates take only seconds or minutes on an SSD.
2020-07-06stop auto-loading Plack::Middleware::Deflater
Instead of gzipping some (mbox.gz, manifest.js.gz) responses and leaving P::M::D to do the rest, we gzip everything ourselves, now, so P::M::D is redundant.
2020-04-06examples/grok-pull.post_update_hook: move url_base to the top
Users are encouraged to edit this script, anyways, so make it easy for them to swap out and use whatever URL they need.
2020-04-06examples/grok-pull.post_update_hook: capture infourl
The values of infourl parameters are shared in the config, so include them in the mirror.
2020-04-06examples/grok-pull.post_update_hook: fetch mirror description
The $INBOX_URL/description endpoint is available since v1.3.0, so use it in mirrors.
2020-03-21examples/*.psgi: add examples for -httpd
public-inbox-httpd should work with any PSGI files, so make it more apparent to people reading .psgi examples.
2020-02-24examples/nginx_proxy: convert CRLF to LF
It was the only file in our tree which had CRLF line endings, so make it consistent with the rest.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-01-12examples/unsubscribe.milter: support unique mailto:
Instead of providing a generic "mailto:foo+unsubscribe@example.com" address in List-Unsubscribe which requires confirmation, replace it with a mailto: header with a unique subject which contains the same unique ID we put in the https:// URL. This makes it easier for some MUAs without https:// support to unsubscribe with a single action via the List-Unsubscribe header.
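A sketch of building such a header value, in Python. The address, URL layout, and function name are hypothetical; only the idea of carrying the same unique ID in both the mailto: subject and the https:// URL comes from the commit message.

```python
from urllib.parse import quote


def list_unsubscribe_value(unique_id, mailto_addr, https_base):
    """Build a List-Unsubscribe value carrying unique_id in both URIs."""
    uid = quote(unique_id, safe="")  # percent-encode anything unsafe
    mailto = f"mailto:{mailto_addr}?subject={uid}"
    https = f"{https_base.rstrip('/')}/{uid}"
    return f"<{mailto}>, <{https}>"
```

An MUA that only handles mailto: can then unsubscribe by sending the pre-filled message, with no confirmation round trip.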
2020-01-12examples/unsubscribe.milter: skip gmane-mx
Mail to gmane is being delivered to gmane-mx.org, nowadays, and we don't want ordinary readers to be able to trigger unconfirmed unsubscription off any mailing lists which go through our unsubscribe.milter. https://lars.ingebrigtsen.no/2020/01/06/whatever-happened-to-news-gmane-org/
2020-01-03examples: add empty "lib" dir to placate plackup
This is necessary for Filesys::Notify::Simple 0.13 using Linux::Inotify2, since 0.13 started croaking on inotify_add_watch failures.
2019-10-18examples/grok-pull.post_update_hook: fix config detection
We need to account for both the old ("mainrepo") and new ("inboxdir") names. But "dir" was just a search+replace error and we don't use that outside of "coderepo.dir".
2019-10-16config: support "inboxdir" in addition to "mainrepo"
"mainrepo" was a bad name and an artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-10-16examples/grok-pull.post_update_hook: use "inbox_dir"
Move away from using "mainrepo" since it's confusing to new users, especially with v2.
2019-10-07examples: add grok-pull post_update_hook example
This requires the latest (to be in 1.2) -init changes for synchronization and has no dependencies on GNU or bash-isms so it should run on *BSD systems without GNU tools. It does attempt to use curl on <$INBOX_URL/_/text/config/raw>, but curl is fairly standard nowadays, and the script falls back to using an invalid address to initialize.
2019-09-14doc: update nntpd with NNTPS and STARTTLS examples
NNTPS and STARTTLS seem to have been working for several months without incident on news.public-inbox.org, so consider it a success and maybe others can try using it. HTTPS technically works, too, but isn't documented at the moment since I can't recommend production deployments without varnish protecting it.
2019-09-09run update-copyrights from gnulib for 2019
2019-06-30examples/*@.service: sockets MUST be NonBlocking
For users running multiple (-nntpd@1, -nntpd@2) instances of either -httpd or -nntpd via systemd to implement zero-downtime restarts, it's possible for a listen socket to become blocking for a moment during an accept syscall and cause a daemon to get stuck in a blocking accept() during PublicInbox::Listener::event_step (event_read in previous versions). Since O_NONBLOCK is a file description flag, systemd clearing O_NONBLOCK momentarily (before PublicInbox::Listener::new re-enables it) creates a window for another instance of our daemon to get stuck in accept(). cf. systemd.service(5)
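That O_NONBLOCK belongs to the open file description, not to an individual descriptor, can be checked with a short Python sketch. A duplicated descriptor stands in here for the socket shared between systemd and several daemon instances; the function name is illustrative.

```python
import os
import socket


def nonblock_is_shared():
    """Show that O_NONBLOCK set via one descriptor is seen via its duplicate."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    dup_fd = os.dup(sock.fileno())  # second descriptor, same file description
    try:
        os.set_blocking(sock.fileno(), False)
        dup_sees_nonblocking = not os.get_blocking(dup_fd)
        # Another holder of the description clears O_NONBLOCK ...
        os.set_blocking(dup_fd, True)
        # ... and the original descriptor is now blocking as well.
        original_now_blocking = os.get_blocking(sock.fileno())
        return dup_sees_nonblocking, original_now_blocking
    finally:
        os.close(dup_fd)
        sock.close()
```

Both values come back true, which is the window described above: any party that clears the flag changes it for everyone sharing the socket.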
2019-06-04examples: add sample nginx configuration
The sample configuration can be used to proxy-pass requests to public-inbox-httpd or to a standalone PSGI/Plack server.
2019-05-14httpd: get rid of Deflater warning
Deflating responses may be done by the reverse proxy (e.g. varnish or nginx), so the warning for it could be invalid.
2019-04-25examples/cgit-commit-filter.lua: some doc updates
It's been a while since I wrote this, and it needs to be kept up-to-date with some advances in our Perl code.
2019-04-25examples: cgit filter for use with WwwHighlight
I'm using this as the cgit about-filter and source-filter in https://80x24.org/public-inbox.git