about summary refs log tree commit homepage
path: root/lib/PublicInbox/Inbox.pm
DateCommit message (Collapse)
2017-01-07searchmsg: favor direct hash access over accessor methods
This is faster, smaller, and more straighforward to me with fewer layers of indirection.
2017-01-07inbox: eliminate weaken usage entirely
We can do a better job initializing the data structure so we no longer need to rely on weak references to cleanup when we ditch the config on reload.
2017-01-07inbox: describe the full key name
Hopefully make this easier for future generations to understand.
2016-12-17feed: support publicinbox.<name>.feedmax
This allows users to customize by using smaller or larger Atom feeds than the default value of 25 entries.
2016-10-05inbox: deal with ghost smsg
smsg will be undef for ghost messages in a subsequent commit
2016-08-12www: allow including links to NNTP sites in HTML footer
Improve the discoverability of NNTP endpoints for users who still know what NNTP is. ==> ~/.public-inbox/config <== ; aliases for the locally-run nntpd can be specified in ; the "publicinbox" section: [publicinbox] nntpserver = nntp://ou63pmih66umazou.onion/ nntpserver = news.public-inbox.org ; NNTPS is not supported natively, yet, ; but one can use haproxy or similar ; nntpserver = nntps://news.public-inbox.invalid/ ; mirrors for specific inboxes may be specified either as full ; NNTP (or NNTPS) URLs, or with the server name only if the ; newsgroup name is specfied for a local NNTP server [publicinbox "git"] ... newsgroup = inbox.a.b.c nntpmirror = nntp://czquwvybam4bgbro.onion/ nntpmirror = hjrcffqmbrq6wope.onion ; there may be a mirror on a different server with a ; different name: nntpmirror = nntp://news.example.com/differently.named.group ; (And I really need to write manpages for all this...)
2016-08-11search: support alt-ID for mapping legacy serial numbers
For some existing mailing list archives, messages are identified by serial number (such as NNTP article numbers in gmane). Those links may become inaccessible (as is the current case for gmane), so ensure users can still search based on old serial numbers. Now, I run the following periodically to get article numbers from gmane (while news.gmane.org remains): NNTPSERVER=news.gmane.org export NNTPSERVER GROUP=gmane.comp.version-control.git perl -I lib scripts/xhdr-num2mid $GROUP --msgmap=/path/to/gmane.sqlite3 (I might integrate this further with public-inbox-* scripts one day). My ~/.public-inbox/config as an added "altid" snippet which now looks like this: [publicinbox "git"] address = git@vger.kernel.org mainrepo = /path/to/git.vger.git newsgroup = inbox.comp.version-control.git ; relative pathnames expand to $mainrepo/public-inbox/$file altid = serial:gmane:file=gmane.sqlite3 And run "public-inbox-index --reindex /path/to/git.vger.git" periodically. This ought to allow searching for "gmane:12345" to work for Xapian-enabled instances. Disclaimer: while public-inbox supports NNTP and stable article serial numbers, use of those for public links is discouraged since it encourages centralization.
2016-08-04searchmsg: add git object ID to doc_data
Doing git tree lookups based on the SHA-1 of the Message-ID is expensive as trees get larger, instead, use the SHA-1 object ID directly. This drastically reduces the amount of time spent in the "git cat-file --batch" process for fetching the /$INBOX/all.mbox.gz endpoint on the ~800MB git@vger.kernel.org mirror This retains backwards compatibility and allows existing indices to be transparently upgraded without performance degradation.
2016-07-27localize $/ when using chomp
Callers may have localized $/ to something else, so make sure we chomp the expected character(s) when calling chomp.
2016-07-09www: add configurable limiters
Currently only for git-http-backend use, this allows limiting the number of spawned processes per-inbox or by group, if there are multiple large inboxes amidst a sea of small ones. For example, a "big" repo limiter could be used for big inboxes: which would be shared between multiple repos: [limiter "big"] max = 4 [publicinbox "git"] address = git@vger.kernel.org mainrepo = /path/to/git.git ; shared limiter with giant: httpbackendmax = big [publicinbox "giant"] address = giant@project.org mainrepo = /path/to/giant.git ; shared limiter with git: httpbackendmax = big ; This is a tiny inbox, use the default limiter with 32 slots: [publicinbox "meta"] address = meta@public-inbox.org mainrepo = /path/to/meta.git
2016-07-07inbox: cleanup and consolidate object weakening
This fixes some layering violations and consolidates the cleanup into the inbox object itself. Keeping in mind weakening does not work at all without our PSGI server.
2016-07-02extmsg: rework to use Inbox objects
This is less code and hopefully easier-to-understand.
2016-07-02inbox: base_url method takes PSGI env hashref instead
This is lighter and we can work further towards eliminating our Plack::Request dependency entirely.
2016-06-26inbox: avoid trying s// on undef
Oops, I guess I'm trigger-happy today. Fixes: 31a6ff1221fe ("inbox: ensure we do not show leading "From " lines")
2016-06-26inbox: ensure we do not show leading "From " lines
Some messages will be misimported due to an old bug, clean them up and ensure we do not propagate the mistake. Followup-to: a0c07cba0e5d ("mda: drop leading "From " lines again")
2016-06-25inbox: do not weaken already-weak refs
This quiets a (hopefully harmless) warning when a ref remains alive through several expiry timeouts.
2016-06-20inbox: move field population logic to initializer
Inboxes are normally created by Config, but having the population logic in Inbox should make it easier to mock for testing.
2016-06-20feed: various object-orientation cleanups
Favor Inbox objects as our primary source of truth to simplify our code. This increases our coupling with PSGI to make it easier to write tests in the future. A lot of this code was originally designed to be usable standalone without PSGI or CGI at all; but that might increase development effort.
2016-06-15inbox: allow undef return value for base_url
It should be possible to serve the contents of a public-inbox over NNTP but not HTTP.
2016-05-29inbox: drop references ASAP for search and msgmap
We can't leave them lingering in the parent process at all due to the risk of corruption with multiple processes.
2016-05-28remove redundant NewsGroup class
Most of its functionality is in the PublicInbox::Inbox class. While we're at it, we no longer auto-create newsgroup names based on the inbox name, since newsgroup names probably deserve some thought when it comes to hierarchy.
2016-05-28www: remove footer_html support
I haven't used it in a while and the existing "description" is probably good enough. If we support it again, it should be plain-text + auto-linkified for ease-of-maintenance and consistency.
2016-05-16www: fix for running under mount paths
We try to avoid issues like these by using relative URLs in hrefs, but we can't avoid the problem with Location: for redirects and Atom feeds which are likely to be rehosted elsewhere. We also reorder some of the code to work around a weird issue on the psgi-plack mailing list: <20160516073750.GA11931@dcvr.yhbt.net> (Somewhere on https://groups.google.com/group/psgi-plack but it's probably not bookmarkable)
2016-05-16declare Inbox object for reusability
From the beginning, we've avoided objects here in favor of faster startup time; but it may not be worth it since a persistent httpd/nntpd is faster and -mda isn't hit as often.