about summary refs log tree commit homepage
path: root/lib/PublicInbox/NNTP.pm
DateCommit message (Collapse)
2019-01-16nntp: header responses use CRLF consistently
Alpine is apparently stricter than other clients I've tried w.r.t. using CRLF for headers. So do the same thing we do for bodies to ensure we only emit CRLFs and no bare LFs. Reported-by: Wang Kang <i@scateu.me> https://public-inbox.org/meta/alpine.DEB.2.21.99.1901161043430.29788@la.scateu.me/
2019-01-08nntp: fix uninitialized variable in event_read
do_write must return 0 or 1.
2018-12-06nntp: prevent event_read from firing twice in a row
When a client starts pipelining requests to us which trigger long responses, we need to keep socket readiness checks disabled and only enable them when our socket rbuf is drained. Failure to do this caused aborted clients with "BUG: nested long response" when Danga::Socket calls event_read for read-readiness after our "next_tick" sub fires in the same event loop iteration. Reported-by: Jonathan Corbet <corbet@lwn.net> cf. https://public-inbox.org/meta/20181013124658.23b9f9d2@lwn.net/
2018-10-16Add Xrefs to over/xover lines
Putting the Xref field into xover lines allows newsreaders to mark cross-posted messages read when catching up a group. That, in turn, massively improves the life of crazy people who try to follow dozens of kernel lists, where emails are often heavily cross-posted.
2018-10-16Put the NNTP server name into Xref lines
RFC 5536 sec 3.2.14 says that the server-name in an Xref line is "which news server generated the header field"; indeed, that is necessary for newsreaders like gnus to handle references properly. So pick up the server name from the config if available (the first name if there's more than one), from the host name otherwise, and use it rather than the domain name of the list server. Tests have been adjusted to match the new behavior.
2018-04-18Merge remote-tracking branch 'origin/master' into v2
* origin/master: nntp: allow and ignore empty commands mbox: do not barf on queries which return no results nntp: fix NEWNEWS command searchview: fix non-numeric comparison Allow specification of the number of search results to return githttpbackend: avoid infinite loop on generic PSGI servers http: fix modification of read-only value extmsg: use news.gmane.org for Message-ID lookups extmsg: rework partial MID matching to favor current inbox Update the installation instructions with Fedora package names nntp: do not drain rbuf if there is a command pending nntp: improve fairness during XOVER and similar commands searchidx: do not modify Xapian DB while iterating Don't use LIMIT in UPDATE statements
2018-04-18nntp: allow and ignore empty commands
Somebody hitting "\n" into telnet shouldn't hold a client up indefinitely and prevent shutdown.
2018-04-07store less data in the Xapian document
Since we only query the SQLite over DB for OVER/XOVER; do not need to waste space storing fields To/Cc/:bytes/:lines or the XNUM term. We only use From/Subject/References/Message-ID/:blob in various places of the PSGI code. For reindexing, we will take advantage of docid stability in "xapian-compact --no-renumber" to ensure duplicates do not show up in search results. Since the PSGI interface is the only consumer of Xapian at the moment, it has no need to search based on NNTP article number.
2018-04-06nntp: set Xref across multiple inboxes
Noted by Jonathan Corbet in https://lwn.net/Articles/748184/
2018-04-03nntp: simplify the long_response API
We we worked around the default range/termination conditions of long_response in many cases to reduce calls to SQLite or Xapian. So continue that trend and become more like the PSGI API which doesn't force callers to specify an article range or work inside a loop.
2018-04-03msgmap: replace id_batch with ids_after
id_batch had a an overly complicated interface, replace it with id_batch which is simpler and takes advantage of selectcol_arrayref in DBI. This allows simplification of callers and the diffstat agrees with me.
2018-04-03nntp: make XOVER, XHDR, OVER, HDR and NEWNEWS faster
While SQLite is faster than Xapian for some queries we use, it sucks at handling OFFSET. Fortunately, we do not need offsets when retrieving sorted results and can bake it into the query. For inbox.comp.version-control.git (v1 Xapian), XOVER and XHDR are over 20x faster.
2018-04-03nntp: fix NEWNEWS command
I guess nobody uses this command (slrnpull does not), and the breakage was not noticed until I started writing new tests for multi-MID handling. Fixes: 3fc411c772a21d8f ("search: drop pointless range processors for Unix timestamp")
2018-04-02replace Xapian skeleton with SQLite overview DB
This ought to provide better performance and scalability which is less dependent on inbox size. Xapian does not seem optimized for some queries used by the WWW homepage, Atom feeds, XOVER and NEWNEWS NNTP commands. This can actually make Xapian optional for NNTP usage, and allow more functionality to work without Xapian installed. Indexing performance was extremely bad at first, but DBI::Profile helped me optimize away problematic queries.
2018-03-14nntp: do not drain rbuf if there is a command pending
Some clients pipeline requests aggressively (enough to match LINE_MAX) and we should not read from the client socket until we know there's no pending command in our read buffer. Reported-and-tested-by: Sergey Organov <sorganov@gmail.com>
2018-03-07nntp: improve fairness during XOVER and similar commands
For other commands generating long responses, we generally want to yield to another client after emitting 100 . However, XOVER-based responses already query 200 lines worth of responses at a time, so we were sending 20000 lines before yielding to other clients. This may help avoid timeouts for some clients.
2018-03-03nntp: fix NEWNEWS command
I guess nobody uses this command (slrnpull does not), and the breakage was not noticed until I started writing new tests for multi-MID handling. Fixes: 3fc411c772a21d8f ("search: drop pointless range processors for Unix timestamp")
2018-03-03nntp: use NNTP article numbers for lookups
Since Message-IDs are no longer unique within Xapian (but are within the SQLite Msgmap); favor NNTP article numbers for internal lookups. This will prevent us from finding the "wrong" internal Message-ID.
2018-02-07update copyrights for 2018
Using update-copyrights from gnulib While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2016-12-13nntp: avoid useless use of strftime
There's no need to use strftime if we'll be converting the date by hand, anyways.
2016-09-09nntp: cleanup: move use statements out of sub scope
This clarifies the code somewhat, and we don't care to lazy-load in NNTP.pm anyways since this is only used for a long-lived daemon.
2016-08-14www: do not unecessarily escape some chars in paths
Based on reading RFC 3986, it seems '@', ':', '!', '$', '&', "'", '; '(', ')', '*', '+', ',', ';', '=' are all allowed in path-absolute where we have the Message-ID. In any case, it seems '@' is fairly common in path components nowadays and too common in Message-IDs.
2016-07-27localize $/ when using chomp
Callers may have localized $/ to something else, so make sure we chomp the expected character(s) when calling chomp.
2016-07-09nntp: return if a client drops on us
Danga::Socket::write will set the closed flag on a socket, automatically, and we do not need to bring down an entire server when one client breaks the connection :P
2016-07-07inbox: cleanup and consolidate object weakening
This fixes some layering violations and consolidates the cleanup into the inbox object itself. Keeping in mind weakening does not work at all without our PSGI server.
2016-07-02nntp: respect 3 minute idle time for shutdown
This avoids breaking clients on graceful shutdown since NNTP responses should usually be quick.
2016-07-02nntp: simplify update_idle_time
This ought to make things easier when we add TLS support.
2016-06-20nntp: use lookup_mail instead of lookup_message
lookup_mail is safer since it won't inadvertently load ghosts.
2016-06-20feed: various object-orientation cleanups
Favor Inbox objects as our primary source of truth to simplify our code. This increases our coupling with PSGI to make it easier to write tests in the future. A lot of this code was originally designed to be usable standalone without PSGI or CGI at all; but that might increase development effort.
2016-06-14nntp: do not double-encode UTF-8 body
Or whatever the appropriate Perl terminology, is... And we will need to do something appropriate for other encodings, too. I still barely understand Perl Unicode despite attempting to understand the docs over the years..
2016-05-30use utf8::{encode,decode} for in-place transforms
No need to duplicate the string when transforming it; learned from studying SpamAssassin 3.4.1
2016-05-29nntp: fix for missing articles/bodies/heads
Oops, we totally forgot to automate testing for this :x
2016-05-28remove redundant NewsGroup class
Most of its functionality is in the PublicInbox::Inbox class. While we're at it, we no longer auto-create newsgroup names based on the inbox name, since newsgroup names probably deserve some thought when it comes to hierarchy.
2016-05-24standardize timer-related event-loop code
Standardize the code we have in place to avoid creating too many timer objects. We do not need exact timers for things that don't need to be run ASAP, so we can play things fast and loose to avoid wasting power with unnecessary wakeups. We only need two classes of timers: * asap - run this on the next loop tick, after operating on @Danga::Socket::ToClose to close remaining sockets * later - run at some point in the future. It could be as soon as immediately (like "asap"), and as late as 60s into the future. In the future, we support an "emergency" switch to fire "later" timers immediately.
2016-05-18nntpd: reject control characters entirely
There's no place for them in the commands and we don't take messages; potentially printing them into a log opened in a terminal is too dangerous. Hoist out read_til_dot in the test while we're at it.
2016-05-14nntp: use "newsgroup" instead of "name"
This reduces the cognitive overhead for mapping names of configuration values to internal field names of our classes. Further changes along these lines coming...
2016-05-13nntp: fixup "Wide character" warnings
We need Perl to believe everything we send is UTF-8, make it so, even if it may not be. Fixes: 265e79ff82ce 'Revert "nntp: proper UTF-8 support (hopefully?)"'
2016-05-13Revert "nntp: proper UTF-8 support (hopefully?)"
This reverts commit f81ad477cb013d05b9b11fa051a9ebc5983a5be6. The raw, undecoded body is probably what should be sent over the wire anyways for clients to deal with. We'll need this to avoid deprecation warnings with Perl 5.24+ since we use send()/recv()/sysread().
2016-05-02nntp: append Archived-At and List-Archive headers
For readers using NNTP, we should do our best to advertise the clonable HTTP/HTTPS URLs and the message permalink URL for ease-of-referencing messages, since we don't want the NNTP server and it's sequential article numbers to be relied on.
2016-05-01daemon: reduce timer-related allocations
We can reduce the allocation and overhead needed for Danga::Socket timers for immediately-executed responses by combining identical timers and reducing anonymous sub creation.
2016-04-25nntp: reduce timers for weakening
Danga::Socket timers are not cheap, so avoid creating up to 3 timers per-newsgroup by batching resource weakening. This lets us reduce resource consumption for scheduing additional resource consumption reduction :)
2016-04-25nntp: remove unused hdr_val subroutine
hdr_val has not been used since commit 1d236e649df1 ("nntp: implement OVER/XOVER summary in search document")
2016-02-29favor procedural calls for most private functions
This makes for better compile-time checking and also helps document which calls are private for HTTP and NNTP. While we're at it, use IO::Handle::* functions procedurally, too, since we know we're working with native glob handles.
2016-02-29distinguish error messages intended for users vs developers
For error messages intended to show user error (e.g. giving invalid options), we add a newline ("\n") at the end to polluting the output with location information. However, for diagnosing non-user-triggered errors, we should show the location of where the error occured.
2016-02-28http: support graceful shutdown like nntp
HTTP responses may be long-running or requests may be slow or pipelined. Ensure we don't kill them off prematurely.
2015-12-22rename 'GitCatFile' package to 'Git'
We'll be using it for more than just cat-file. Adding a `popen' API for internal use allows us to save a bunch of code in other places.
2015-11-20various internal documentation updates
Hopefully this gives new hackers a better overview of how the components relate to each other.
2015-11-18nntp: fix printf warnings
Error messages and request lines may contain '%' which would throw off Perl printf.
2015-10-01nntp: better delimit error message
It may be hard to tell what command triggered an error, otherwise.
2015-10-01nntp: remove reference to non-existent function
Oops