user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
* [PATCH 0/12] another round of NNTP updates
@ 2015-09-30 21:00 Eric Wong
  2015-09-30 21:00 ` [PATCH 01/12] search: remove get_subject_path Eric Wong
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

This is probably performant enough for practical use :)

Eric Wong (12):
      search: remove get_subject_path
      nntp: HDR returns 225, not 224
      nntp: reduce syscalls for LIST OVERVIEW.FMT
      remove unnecessary fields usage
      daemon: always autoflush stdout/stderr
      nntpd: avoid lazy require
      INSTALL: document Danga::Socket dependency for nntpd
      nntp: MODE READER denies posting
      nntp: implement LIST HEADERS
      nntp: implement OVER/XOVER summary in search document
      t/nntpd.t: simplify condition for response termination
      t/nntpd.t: additional tests for XHDR/HDR

 INSTALL                       |   2 +
 lib/PublicInbox/Daemon.pm     |   3 +
 lib/PublicInbox/GitCatFile.pm |   5 +-
 lib/PublicInbox/Hval.pm       |   9 +--
 lib/PublicInbox/Mbox.pm       |  11 ++-
 lib/PublicInbox/Msgmap.pm     |   4 +-
 lib/PublicInbox/NNTP.pm       | 175 ++++++++++++++++++++----------------------
 lib/PublicInbox/NewsGroup.pm  |  25 +++---
 lib/PublicInbox/Search.pm     |  43 ++++++++---
 lib/PublicInbox/SearchIdx.pm  |  59 ++++++++++----
 lib/PublicInbox/SearchMsg.pm  |  86 +++++++++++++--------
 lib/PublicInbox/SearchView.pm |  19 ++---
 public-inbox-nntpd            |  19 +++--
 t/nntpd.t                     |  46 ++++++++++-
 t/search.t                    |   2 +-
 15 files changed, 305 insertions(+), 203 deletions(-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 01/12] search: remove get_subject_path
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 02/12] nntp: HDR returns 225, not 224 Eric Wong
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

We probably won't be supporting this in the public API
---
 lib/PublicInbox/Search.pm | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 810eab8..695d56b 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -91,12 +91,6 @@ sub query {
 	$self->do_enquire($query, $opts);
 }
 
-sub get_subject_path {
-	my ($self, $path, $opts) = @_;
-	my $q = Search::Xapian::Query->new(xpfx("path").mid_compress($path));
-	$self->do_enquire($q, $opts);
-}
-
 sub get_thread {
 	my ($self, $mid, $opts) = @_;
 	my $smsg = eval { $self->lookup_message($mid) };
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 02/12] nntp: HDR returns 225, not 224
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
  2015-09-30 21:00 ` [PATCH 01/12] search: remove get_subject_path Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 03/12] nntp: reduce syscalls for LIST OVERVIEW.FMT Eric Wong
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

This is stipulated by RFC 3977 8.5.1, but apparently I misread it.
---
 lib/PublicInbox/NNTP.pm | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index 95aa4af..faa7563 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -16,6 +16,7 @@ use constant {
 	r501 => '501 command syntax error',
 	r221 => '221 Header follows',
 	r224 => '224 Overview information follows (multi-line)',
+	r225 =>	'225 Headers follow (multi-line)',
 	r430 => '430 No article with that message-id',
 	long_response_limit => 0xffffffff,
 };
@@ -554,7 +555,7 @@ sub hdr_message_id ($$$) { # optimize XHDR Message-ID [range] for slrnpull.
 		return $r unless ref $r;
 		my $mm = $self->{ng}->mm;
 		my ($beg, $end) = @$r;
-		more($self, '221 Header follows');
+		more($self, $xhdr ? r221 : r225);
 		$self->long_response($beg, $end, sub {
 			my ($i) = @_;
 			my $mid = $mm->mid_for($$i);
@@ -597,7 +598,7 @@ sub hdr_xref ($$$) { # optimize XHDR Xref [range] for rtin
 		my $ng = $self->{ng};
 		my $mm = $ng->mm;
 		my ($beg, $end) = @$r;
-		more($self, '221 Header follows');
+		more($self, $xhdr ? r221 : r225);
 		$self->long_response($beg, $end, sub {
 			my ($i) = @_;
 			my $mid = $mm->mid_for($$i);
@@ -644,7 +645,7 @@ sub hdr_searchmsg ($$$$) {
 		my $r = get_range($self, $range);
 		return $r unless ref $r;
 		my ($beg, $end) = @$r;
-		more($self, '221 Header follows');
+		more($self, $xhdr ? r221 : r225);
 		$self->long_response($beg, $end, sub {
 			my ($i) = @_;
 			my $mid = $mm->mid_for($$i) or return;
@@ -702,7 +703,7 @@ sub hdr_mid_response ($$$$$$) {
 		$res .= r221 . "\r\n";
 		$res .= "$mid $v\r\n" if defined $v;
 	} else {
-		$res .= r224 . "\r\n";
+		$res .= r225 . "\r\n";
 		if (defined $v) {
 			my $pfx = hdr_mid_prefix($self, $xhdr, $ng, $n, $mid);
 			$res .= "$pfx $v\r\n";
@@ -726,7 +727,7 @@ sub hdr_slow ($$$$) {
 		my $r = get_range($self, $range);
 		return $r unless ref $r;
 		my ($beg, $end) = @$r;
-		more($self, $xhdr ? r221 : r224);
+		more($self, $xhdr ? r221 : r225);
 		$self->long_response($beg, $end, sub {
 			my ($i) = @_;
 			$r = $self->art_lookup($$i, 2);
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 03/12] nntp: reduce syscalls for LIST OVERVIEW.FMT
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
  2015-09-30 21:00 ` [PATCH 01/12] search: remove get_subject_path Eric Wong
  2015-09-30 21:00 ` [PATCH 02/12] nntp: HDR returns 225, not 224 Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 04/12] remove unnecessary fields usage Eric Wong
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

No point in sending such a short, bounded response with
multiple syscalls.
---
 lib/PublicInbox/NNTP.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index faa7563..dc4227e 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -24,7 +24,7 @@ use constant {
 sub now () { clock_gettime(CLOCK_MONOTONIC) };
 
 my @OVERVIEW = qw(Subject From Date Message-ID References Bytes Lines);
-my %OVERVIEW = map { $_ => 1 } @OVERVIEW;
+my $OVERVIEW_FMT = join(":\r\n", @OVERVIEW) . ":\r\n";
 
 # disable commands with easy DoS potential:
 # LISTGROUP could get pretty bad, too...
@@ -90,9 +90,9 @@ sub cmd_xgtitle ($;$) {
 	'.'
 }
 
-sub list_overview_fmt ($$) {
+sub list_overview_fmt ($) {
 	my ($self) = @_;
-	more($self, $_ . ':') foreach @OVERVIEW;
+	do_more($self, $OVERVIEW_FMT);
 }
 
 sub list_active ($;$) {
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 04/12] remove unnecessary fields usage
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (2 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 03/12] nntp: reduce syscalls for LIST OVERVIEW.FMT Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 05/12] daemon: always autoflush stdout/stderr Eric Wong
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

It doesn't actually give performance improvements unless we
use types with "my", but we don't do that.  We'll only continue
using fields with Danga::Socket-derived classes where they're
required.
---
 lib/PublicInbox/GitCatFile.pm |  5 +----
 lib/PublicInbox/Hval.pm       |  9 ++++-----
 lib/PublicInbox/Mbox.pm       | 11 +++++------
 lib/PublicInbox/Msgmap.pm     |  4 +---
 lib/PublicInbox/NewsGroup.pm  | 10 +++++-----
 lib/PublicInbox/SearchView.pm | 19 ++++++++-----------
 public-inbox-nntpd            | 13 ++++++-------
 7 files changed, 30 insertions(+), 41 deletions(-)

diff --git a/lib/PublicInbox/GitCatFile.pm b/lib/PublicInbox/GitCatFile.pm
index 5403696..629d23e 100644
--- a/lib/PublicInbox/GitCatFile.pm
+++ b/lib/PublicInbox/GitCatFile.pm
@@ -8,13 +8,10 @@ use strict;
 use warnings;
 use POSIX qw(dup2);
 require IO::Handle;
-use fields qw(git_dir pid in out);
 
 sub new {
 	my ($class, $git_dir) = @_;
-	my $self = fields::new($class);
-	$self->{git_dir} = $git_dir;
-	$self;
+	bless { git_dir => $git_dir }, $class
 }
 
 sub _cat_file_begin {
diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index e0ec630..9fbe616 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -5,7 +5,6 @@
 package PublicInbox::Hval;
 use strict;
 use warnings;
-use fields qw(raw href);
 use Encode qw(find_encoding);
 use URI::Escape qw(uri_escape_utf8);
 use PublicInbox::MID qw/mid_clean/;
@@ -14,14 +13,14 @@ my $enc_ascii = find_encoding('us-ascii');
 
 sub new {
 	my ($class, $raw, $href) = @_;
-	my $self = fields::new($class);
 
 	# we never care about leading/trailing whitespace
 	$raw =~ s/\A\s*//;
 	$raw =~ s/\s*\z//;
-	$self->{raw} = $raw;
-	$self->{href} = defined $href ? $href : $raw;
-	$self;
+	bless {
+		raw => $raw,
+		href => defined $href ? $href : $raw,
+	}, $class;
 }
 
 sub new_msgid {
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index c92d444..8bb8dc8 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -110,16 +110,15 @@ EOF
 package PublicInbox::MboxGz;
 use strict;
 use warnings;
-use fields qw(gz fh buf);
 
 sub new {
 	my ($class, $fh) = @_;
-	my $self = fields::new($class);
 	my $buf;
-	$self->{buf} = \$buf;
-	$self->{gz} = IO::Compress::Gzip->new(\$buf);
-	$self->{fh} = $fh;
-	$self;
+	bless {
+		buf => \$buf,
+		gz => IO::Compress::Gzip->new(\$buf),
+		fh => $fh,
+	}, $class;
 }
 
 sub _flush_buf {
diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index f285790..8a34e7e 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -4,7 +4,6 @@
 package PublicInbox::Msgmap;
 use strict;
 use warnings;
-use fields qw(dbh mid_insert mid_for num_for num_minmax);
 use DBI;
 use DBD::SQLite;
 
@@ -23,8 +22,7 @@ sub new {
 		sqlite_use_immediate_transaction => 1,
 	});
 	$dbh->do('PRAGMA case_sensitive_like = ON');
-	my $self = fields::new($class);
-	$self->{dbh} = $dbh;
+	my $self = bless { dbh => $dbh }, $class;
 
 	if ($writable) {
 		create_tables($dbh);
diff --git a/lib/PublicInbox/NewsGroup.pm b/lib/PublicInbox/NewsGroup.pm
index 0c7051d..1250b0d 100644
--- a/lib/PublicInbox/NewsGroup.pm
+++ b/lib/PublicInbox/NewsGroup.pm
@@ -3,7 +3,6 @@
 package PublicInbox::NewsGroup;
 use strict;
 use warnings;
-use fields qw(name git_dir address domain mm gcf search);
 use Scalar::Util qw(weaken);
 require Danga::Socket;
 require PublicInbox::Msgmap;
@@ -11,12 +10,13 @@ require PublicInbox::GitCatFile;
 
 sub new {
 	my ($class, $name, $git_dir, $address) = @_;
-	my $self = fields::new($class);
-	$self->{name} = $name;
 	$address = $address->[0] if ref($address);
+	my $self = bless {
+		name => $name,
+		git_dir => $git_dir,
+		address => $address,
+	}, $class;
 	$self->{domain} = ($address =~ /\@(\S+)\z/) ? $1 : 'localhost';
-	$self->{git_dir} = $git_dir;
-	$self->{address} = $address;
 	$self;
 }
 
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index 6bc66ce..cfc650f 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -257,27 +257,24 @@ sub adump {
 package PublicInbox::SearchQuery;
 use strict;
 use warnings;
-use fields qw(q o t x r);
 use PublicInbox::Hval;
 
 sub new {
 	my ($class, $cgi) = @_;
-	my $self = fields::new($class);
-	$self->{q} = $cgi->param('q');
-	$self->{x} = $cgi->param('x') || '';
-	$self->{o} = int($cgi->param('o') || 0) || 0;
-	my $r = $cgi->param('r');
-	$self->{r} = (defined $r && $r ne '0');
-
-	$self;
+	my $r => $cgi->param('r'),
+	bless {
+		q => $cgi->param('q'),
+		x => $cgi->param('x') || '',
+		o => int($cgi->param('o') || 0) || 0,
+		r => (defined $r && $r ne '0'),
+	}, $class;
 }
 
 sub qs_html {
 	my ($self, %over) = @_;
 
 	if (keys %over) {
-		my $tmp = fields::new(ref($self));
-		%$tmp = %$self;
+		my $tmp = bless { %$self }, ref($self);
 		foreach my $k (keys %over) {
 			$tmp->{$k} = $over{$k};
 		}
diff --git a/public-inbox-nntpd b/public-inbox-nntpd
index f6042c2..79161fb 100644
--- a/public-inbox-nntpd
+++ b/public-inbox-nntpd
@@ -15,16 +15,15 @@ daemon_run('0.0.0.0:119',
 package PublicInbox::NNTPD;
 use strict;
 use warnings;
-use fields qw(groups grouplist err out);
 
 sub new {
 	my ($class) = @_;
-	my $self = fields::new($class);
-	$self->{groups} = {};
-	$self->{err} = \*STDERR;
-	$self->{out} = \*STDOUT;
-	$self->{grouplist} = [];
-	$self;
+	bless {
+		groups => {},
+		err => \*STDERR,
+		out => \*STDOUT,
+		grouplist => [],
+	}, $class;
 }
 
 sub refresh_groups () {
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 05/12] daemon: always autoflush stdout/stderr
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (3 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 04/12] remove unnecessary fields usage Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 06/12] nntpd: avoid lazy require Eric Wong
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

Users may log output to a pipe, so ensure these outputs are
unbuffered in userspace and go to the operating system ASAP
for other processes to pick up.
---
 lib/PublicInbox/Daemon.pm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index 9e177f2..02d2dc5 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -8,6 +8,9 @@ package main;
 use strict;
 use warnings;
 use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
+use IO::Handle;
+STDOUT->autoflush(1);
+STDERR->autoflush(1);
 require Danga::Socket;
 require POSIX;
 require PublicInbox::Listener;
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 06/12] nntpd: avoid lazy require
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (4 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 05/12] daemon: always autoflush stdout/stderr Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 07/12] INSTALL: document Danga::Socket dependency for nntpd Eric Wong
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

We cannot avoid requiring ::Config, so do not hide it.
---
 public-inbox-nntpd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/public-inbox-nntpd b/public-inbox-nntpd
index 79161fb..82d6838 100644
--- a/public-inbox-nntpd
+++ b/public-inbox-nntpd
@@ -6,6 +6,7 @@ use warnings;
 require PublicInbox::Daemon;
 require PublicInbox::NewsGroup;
 require PublicInbox::NNTP;
+require PublicInbox::Config;
 my $nntpd = PublicInbox::NNTPD->new;
 daemon_run('0.0.0.0:119',
 	sub { $nntpd->refresh_groups },
@@ -28,7 +29,6 @@ sub new {
 
 sub refresh_groups () {
 	my ($self) = @_;
-	require PublicInbox::Config;
 	my $pi_config = PublicInbox::Config->new;
 	my $new = {};
 	my @list;
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 07/12] INSTALL: document Danga::Socket dependency for nntpd
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (5 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 06/12] nntpd: avoid lazy require Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 08/12] nntp: MODE READER denies posting Eric Wong
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

We don't have something like CGI or Plack to build an NNTP
server on top on, so we implemented one using Danga::Socket
for epoll/kqueue abstraction.
---
 INSTALL | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/INSTALL b/INSTALL
index 0e7203d..a2d9fdc 100644
--- a/INSTALL
+++ b/INSTALL
@@ -43,12 +43,14 @@ Optional modules:
   - IO::Compress::Gzip[3]      libio-compress-perl
   - DBI[3]                     libdbi-perl
   - DBD::SQLite[3]             libdbd-sqlite3-perl
+  - Danga::Socket[4]           libdanga-socket-perl
 
 [1] - Only required for serving/generating Atom and HTML pages.
 [2] - Keep in mind this will be split into a separate Debian package
       when CGI.pm is dropped from the Perl standard library.
       Plack/PSGI and mod_perl2 are both supported.
 [3] - Optional for HTML web interface and NNTP server
+[4] - Optional for NNTP server
 
 Copyright
 ---------
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 08/12] nntp: MODE READER denies posting
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (6 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 07/12] INSTALL: document Danga::Socket dependency for nntpd Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 09/12] nntp: implement LIST HEADERS Eric Wong
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

Redundantly confirm to clients we do not accept posting with the
MODE READER command.

ref: RFC 3977 5.3.1
---
 lib/PublicInbox/NNTP.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index dc4227e..142bee3 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -78,7 +78,7 @@ sub cmd_mode ($$) {
 	my ($self, $arg) = @_;
 	$arg = uc $arg;
 	return r501 unless $arg eq 'READER';
-	'200 reader status acknowledged';
+	'201 Posting prohibited';
 }
 
 sub cmd_slave ($) { '202 slave status noted' }
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 09/12] nntp: implement LIST HEADERS
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (7 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 08/12] nntp: MODE READER denies posting Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 10/12] nntp: implement OVER/XOVER summary in search document Eric Wong
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

We shall remove slow, unoptimized headers in XHDR/HDR to avoid
becoming an easy DoS target.
---
 lib/PublicInbox/NNTP.pm | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index 142bee3..1276039 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -25,6 +25,8 @@ sub now () { clock_gettime(CLOCK_MONOTONIC) };
 
 my @OVERVIEW = qw(Subject From Date Message-ID References Bytes Lines);
 my $OVERVIEW_FMT = join(":\r\n", @OVERVIEW) . ":\r\n";
+my $LIST_HEADERS = join("\r\n", qw(Subject From Date Message-ID References
+				  :bytes :lines Xref To Cc)) . "\r\n";
 
 # disable commands with easy DoS potential:
 # LISTGROUP could get pretty bad, too...
@@ -95,6 +97,11 @@ sub list_overview_fmt ($) {
 	do_more($self, $OVERVIEW_FMT);
 }
 
+sub list_headers ($;$) {
+	my ($self) = @_;
+	do_more($self, $LIST_HEADERS);
+}
+
 sub list_active ($;$) {
 	my ($self, $wildmat) = @_;
 	wildmat2re($wildmat);
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 10/12] nntp: implement OVER/XOVER summary in search document
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (8 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 09/12] nntp: implement LIST HEADERS Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 11/12] t/nntpd.t: simplify condition for response termination Eric Wong
  2015-09-30 21:00 ` [PATCH 12/12] t/nntpd.t: additional tests for XHDR/HDR Eric Wong
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

The document data of a search message already contains a good chunk
of the information needed to respond to OVER/XOVER commands quickly.
Expand on that and use the document data to implement OVER/XOVER
quickly.

This adds a dependency on Xapian being available for nntpd usage,
but is probably alright since nntpd is esoteric enough that anybody
willing to run nntpd will also want search functionality offered
by Xapian.

This also speeds up XHDR/HDR with the To: and Cc: headers and
:bytes/:lines article metadata used by some clients for header
displays and marking messages as read/unread.
---
 lib/PublicInbox/NNTP.pm      | 157 +++++++++++++++++++------------------------
 lib/PublicInbox/NewsGroup.pm |  15 +++--
 lib/PublicInbox/Search.pm    |  37 +++++++++-
 lib/PublicInbox/SearchIdx.pm |  59 +++++++++++-----
 lib/PublicInbox/SearchMsg.pm |  86 ++++++++++++++----------
 public-inbox-nntpd           |   4 +-
 t/nntpd.t                    |  25 ++++++-
 t/search.t                   |   2 +-
 8 files changed, 236 insertions(+), 149 deletions(-)

diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index 1276039..7fe7f2f 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -5,6 +5,7 @@ use strict;
 use warnings;
 use base qw(Danga::Socket);
 use fields qw(nntpd article rbuf ng long_res);
+use PublicInbox::Search;
 use PublicInbox::Msgmap;
 use PublicInbox::GitCatFile;
 use PublicInbox::MID qw(mid2path);
@@ -23,10 +24,10 @@ use constant {
 
 sub now () { clock_gettime(CLOCK_MONOTONIC) };
 
-my @OVERVIEW = qw(Subject From Date Message-ID References Bytes Lines);
-my $OVERVIEW_FMT = join(":\r\n", @OVERVIEW) . ":\r\n";
-my $LIST_HEADERS = join("\r\n", qw(Subject From Date Message-ID References
-				  :bytes :lines Xref To Cc)) . "\r\n";
+my @OVERVIEW = qw(Subject From Date Message-ID References);
+my $OVERVIEW_FMT = join(":\r\n", @OVERVIEW, qw(Bytes Lines)) . ":\r\n";
+my $LIST_HEADERS = join("\r\n", @OVERVIEW,
+			qw(:bytes :lines Xref To Cc)) . "\r\n";
 
 # disable commands with easy DoS potential:
 # LISTGROUP could get pretty bad, too...
@@ -614,53 +615,42 @@ sub hdr_xref ($$$) { # optimize XHDR Xref [range] for rtin
 	}
 }
 
-sub header_obj_for {
-	my ($srch, $mid) = @_;
-	eval {
-		my $smsg = $srch->lookup_message($mid);
-		$smsg = PublicInbox::SearchMsg->load_doc($smsg->{doc});
-		$smsg->mini_mime->header_obj;
-	};
-};
+sub search_header_for {
+	my ($srch, $mid, $field) = @_;
+	my $smsg = $srch->lookup_message($mid) or return;
+	$smsg = PublicInbox::SearchMsg->load_doc($smsg->{doc});
+	$smsg->$field;
+}
 
 sub hdr_searchmsg ($$$$) {
-	my ($self, $xhdr, $hdr, $range) = @_;
-	my $filter;
-	if ($hdr eq 'date') {
-		$hdr = 'X-PI-TS';
-		$filter = sub ($) {
-			strftime('%a, %d %b %Y %T %z', gmtime($_[0]));
-		};
-	}
-
+	my ($self, $xhdr, $field, $range) = @_;
 	if (defined $range && $range =~ /\A<(.+)>\z/) { # Message-ID
 		my ($ng, $n) = mid_lookup($self, $1);
 		return r430 unless $n;
-		if (my $srch = $ng->search) {
-			my $m = header_obj_for($srch, $range);
-			my $v = $m->header($hdr);
-			$v = $filter->($v) if defined $v && $filter;
-			hdr_mid_response($self, $xhdr, $ng, $n, $range, $v);
-		} else {
-			hdr_slow($self, $xhdr, $hdr, $range);
-		}
+		my $v = search_header_for($ng->search, $range, $field);
+		hdr_mid_response($self, $xhdr, $ng, $n, $range, $v);
 	} else { # numeric range
 		$range = $self->{article} unless defined $range;
-		my $srch = $self->{ng}->search or
-				return hdr_slow($self, $xhdr, $hdr, $range);
+		my $srch = $self->{ng}->search;
 		my $mm = $self->{ng}->mm;
 		my $r = get_range($self, $range);
 		return $r unless ref $r;
 		my ($beg, $end) = @$r;
 		more($self, $xhdr ? r221 : r225);
+		my $off = 0;
 		$self->long_response($beg, $end, sub {
 			my ($i) = @_;
-			my $mid = $mm->mid_for($$i) or return;
-			my $m = header_obj_for($srch, $mid) or return;
-			my $v = $m->header($hdr);
-			defined $v or return;
-			$v = $filter->($v) if $filter;
-			more($self, "$$i $v");
+			my $res = $srch->query_xover($beg, $end, $off);
+			my $msgs = $res->{msgs};
+			my $nr = scalar @$msgs or return;
+			$off += $nr;
+			my $tmp = '';
+			foreach my $s (@$msgs) {
+				$tmp .= $s->num . ' ' . $s->$field . "\r\n";
+			}
+			do_more($self, $tmp);
+			# -1 to adjust for implicit increment in long_response
+			$$i = $nr ? $$i + $nr - 1 : long_response_limit;
 		});
 	}
 }
@@ -672,10 +662,13 @@ sub do_hdr ($$$;$) {
 		hdr_message_id($self, $xhdr, $range);
 	} elsif ($sub eq 'xref') {
 		hdr_xref($self, $xhdr, $range);
-	} elsif ($sub =~ /\A(subject|references|date)\z/) {
+	} elsif ($sub =~ /\A(?:subject|references|date|from|to|cc|
+				bytes|lines)\z/x) {
 		hdr_searchmsg($self, $xhdr, $sub, $range);
+	} elsif ($sub =~ /\A:(bytes|lines)\z/) {
+		hdr_searchmsg($self, $xhdr, $1, $range);
 	} else {
-		hdr_slow($self, $xhdr, $header, $range);
+		$xhdr ? (r221 . "\r\n.") : "503 HDR not permitted on $header";
 	}
 }
 
@@ -708,43 +701,16 @@ sub hdr_mid_response ($$$$$$) {
 	my $res = '';
 	if ($xhdr) {
 		$res .= r221 . "\r\n";
-		$res .= "$mid $v\r\n" if defined $v;
+		$res .= "$mid $v\r\n";
 	} else {
 		$res .= r225 . "\r\n";
-		if (defined $v) {
-			my $pfx = hdr_mid_prefix($self, $xhdr, $ng, $n, $mid);
-			$res .= "$pfx $v\r\n";
-		}
+		my $pfx = hdr_mid_prefix($self, $xhdr, $ng, $n, $mid);
+		$res .= "$pfx $v\r\n";
 	}
 	res($self, $res .= '.');
 	undef;
 }
 
-sub hdr_slow ($$$$) {
-	my ($self, $xhdr, $header, $range) = @_;
-
-	if (defined $range && $range =~ /\A<.+>\z/) { # Message-ID
-		my $r = $self->art_lookup($range, 2);
-		return $r unless ref $r;
-		my ($n, $ng) = ($r->[0], $r->[5]);
-		my $v = hdr_val($r, $header);
-		hdr_mid_response($self, $xhdr, $ng, $n, $range, $v);
-	} else { # numeric range
-		$range = $self->{article} unless defined $range;
-		my $r = get_range($self, $range);
-		return $r unless ref $r;
-		my ($beg, $end) = @$r;
-		more($self, $xhdr ? r221 : r225);
-		$self->long_response($beg, $end, sub {
-			my ($i) = @_;
-			$r = $self->art_lookup($$i, 2);
-			return unless ref $r;
-			defined($r = hdr_val($r, $header)) or return;
-			more($self, "$$i $r");
-		});
-	}
-}
-
 sub cmd_xrover ($;$) {
 	my ($self, $range) = @_;
 	my $ng = $self->{ng} or return '412 no newsgroup selected';
@@ -761,32 +727,38 @@ sub cmd_xrover ($;$) {
 	$self->long_response($beg, $end, sub {
 		my ($i) = @_;
 		my $mid = $mm->mid_for($$i) or return;
-		my $m = header_obj_for($srch, $mid) or return;
-		my $h = $m->header('references');
-		more($self, "$$i $h") if defined $h;
+		my $h = search_header_for($srch, $mid, 'references');
+		more($self, "$$i $h");
 	});
 }
 
 sub over_line ($$) {
-	my ($self, $r) = @_;
-
-	more($self, join("\t", $r->[0], map {
-				my $h = hdr_val($r, $_);
-				defined $h ? $h : '';
-			} @OVERVIEW ));
+	my ($num, $smsg) = @_;
+	# n.b. field access and procedural calls can be
+	# 10%-15% faster than OO method calls:
+	join("\t", $num,
+		$smsg->{subject},
+		$smsg->{from},
+		PublicInbox::SearchMsg::date($smsg),
+		'<'.PublicInbox::SearchMsg::mid($smsg).'>',
+		$smsg->{references},
+		PublicInbox::SearchMsg::bytes($smsg),
+		PublicInbox::SearchMsg::lines($smsg));
 }
 
 sub cmd_over ($;$) {
 	my ($self, $range) = @_;
-	if ($range && $range =~ /\A<.+>\z/) {
-		my $r = $self->art_lookup($range, 2);
-		return '430 No article with that message-id' unless ref $r;
+	if ($range && $range =~ /\A<(.+)>\z/) {
+		my ($ng, $n) = mid_lookup($self, $1);
+		my $smsg = $ng->search->lookup_message($range) or
+			return '430 No article with that message-id';
 		more($self, '224 Overview information follows (multi-line)');
+		$smsg = PublicInbox::SearchMsg->load_doc($smsg->{doc});
 
 		# Only set article number column if it's the current group
-		my $ng = $self->{ng};
-		$r->[0] = 0 if (!$ng || $ng ne $r->[5]);
-		over_line($self, $r);
+		my $self_ng = $self->{ng};
+		$n = 0 if (!$self_ng || $self_ng ne $ng);
+		more($self, over_line($n, $smsg));
 		'.';
 	} else {
 		cmd_xover($self, $range);
@@ -800,11 +772,22 @@ sub cmd_xover ($;$) {
 	return $r unless ref $r;
 	my ($beg, $end) = @$r;
 	more($self, "224 Overview information follows for $beg to $end");
+	my $srch = $self->{ng}->search;
+	my $off = 0;
 	$self->long_response($beg, $end, sub {
 		my ($i) = @_;
-		my $r = $self->art_lookup($$i, 2);
-		return unless ref $r;
-		over_line($self, $r);
+		my $res = $srch->query_xover($beg, $end, $off);
+		my $msgs = $res->{msgs};
+		my $nr = scalar @$msgs or return;
+		$off += $nr;
+
+		# OVERVIEW.FMT
+		more($self, join("\r\n", map {
+			over_line(PublicInbox::SearchMsg::num($_), $_);
+			} @$msgs));
+
+		# -1 to adjust for implicit increment in long_response
+		$$i = $nr ? $$i + $nr - 1 : long_response_limit;
 	});
 }
 
diff --git a/lib/PublicInbox/NewsGroup.pm b/lib/PublicInbox/NewsGroup.pm
index 1250b0d..02e9011 100644
--- a/lib/PublicInbox/NewsGroup.pm
+++ b/lib/PublicInbox/NewsGroup.pm
@@ -6,6 +6,7 @@ use warnings;
 use Scalar::Util qw(weaken);
 require Danga::Socket;
 require PublicInbox::Msgmap;
+require PublicInbox::Search;
 require PublicInbox::GitCatFile;
 
 sub new {
@@ -36,11 +37,16 @@ sub gcf {
 	};
 }
 
+sub usable {
+	my ($self) = @_;
+	eval {
+		PublicInbox::Msgmap->new($self->{git_dir});
+		PublicInbox::Search->new($self->{git_dir});
+	};
+}
+
 sub mm {
-	my ($self, $check_only) = @_;
-	if ($check_only) {
-		return eval { PublicInbox::Msgmap->new($self->{git_dir}) };
-	}
+	my ($self) = @_;
 	$self->{mm} ||= eval {
 		my $mm = PublicInbox::Msgmap->new($self->{git_dir});
 
@@ -53,7 +59,6 @@ sub mm {
 sub search {
 	my ($self) = @_;
 	$self->{search} ||= eval {
-		require PublicInbox::Search;
 		my $search = PublicInbox::Search->new($self->{git_dir});
 
 		# may be needed if we run low on handles
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 695d56b..1d13f4b 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -4,7 +4,13 @@
 package PublicInbox::Search;
 use strict;
 use warnings;
-use constant TS => 0;
+
+# values for searching
+use constant TS => 0; # timestamp
+use constant NUM => 1; # NNTP article number
+use constant BYTES => 2; # :bytes as defined in RFC 3977
+use constant LINES => 3; # :lines as defined in RFC 3977
+
 use Search::Xapian qw/:standard/;
 use PublicInbox::SearchMsg;
 use Email::MIME;
@@ -26,8 +32,9 @@ use constant {
 	# 6 - preserve References: order in document data
 	# 7 - remove references and inreplyto terms
 	# 8 - remove redundant/unneeded document data
-	# 9 - disable Message-ID compression
-	SCHEMA_VERSION => 9,
+	# 9 - disable Message-ID compression (SHA-1)
+	# 10 - optimize doc for NNTP overviews
+	SCHEMA_VERSION => 10,
 
 	# n.b. FLAG_PURE_NOT is expensive not suitable for a public website
 	# as it could become a denial-of-service vector
@@ -168,6 +175,30 @@ sub date_range_processor {
 	$_[0]->{drp} ||= Search::Xapian::DateValueRangeProcessor->new(TS);
 }
 
+sub num_range_processor {
+	$_[0]->{nrp} ||= Search::Xapian::NumberValueRangeProcessor->new(NUM);
+}
+
+# only used for NNTP server
+sub query_xover {
+	my ($self, $beg, $end, $offset) = @_;
+	my $enquire = $self->enquire;
+	my $qp = Search::Xapian::QueryParser->new;
+	$qp->set_database($self->{xdb});
+	$qp->add_valuerangeprocessor($self->num_range_processor);
+	my $query = $qp->parse_query("$beg..$end", QP_FLAGS);
+	$query = Search::Xapian::Query->new(OP_AND, $mail_query, $query);
+	$enquire->set_query($query);
+	$enquire->set_sort_by_value(NUM, 0);
+	my $limit = 200;
+	my $mset = $enquire->get_mset($offset, $limit);
+	my @msgs = map {
+		PublicInbox::SearchMsg->load_doc($_->get_document);
+	} $mset->items;
+
+	{ total => $mset->get_matches_estimated, msgs => \@msgs }
+}
+
 sub lookup_message {
 	my ($self, $mid) = @_;
 	$mid = mid_clean($mid);
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 8724326..4b43369 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -36,8 +36,14 @@ sub new {
 	$self;
 }
 
+sub add_val {
+	my ($doc, $col, $num) = @_;
+	$num = Search::Xapian::sortable_serialise($num);
+	$doc->add_value($col, $num);
+}
+
 sub add_message {
-	my ($self, $mime) = @_; # mime = Email::MIME object
+	my ($self, $mime, $bytes, $num) = @_; # mime = Email::MIME object
 	my $db = $self->{xdb};
 
 	my $doc_id;
@@ -80,8 +86,16 @@ sub add_message {
 			$doc->add_term(xpfx('path') . mid_compress($path));
 		}
 
-		my $ts = Search::Xapian::sortable_serialise($smsg->ts);
-		$doc->add_value(PublicInbox::Search::TS, $ts);
+		add_val($doc, &PublicInbox::Search::TS, $smsg->ts);
+
+		defined($num) and
+			add_val($doc, &PublicInbox::Search::NUM, $num);
+
+		defined($bytes) and
+			add_val($doc, &PublicInbox::Search::BYTES, $bytes);
+
+		add_val($doc, &PublicInbox::Search::LINES,
+				$mime->body_raw =~ tr!\n!\n!);
 
 		my $tg = $self->term_generator;
 
@@ -91,7 +105,7 @@ sub add_message {
 		$tg->index_text($subj) if $subj;
 		$tg->increase_termpos;
 
-		$tg->index_text($smsg->from->format);
+		$tg->index_text($smsg->from);
 		$tg->increase_termpos;
 
 		$mime->walk_parts(sub {
@@ -224,7 +238,7 @@ sub link_message_to_parents {
 		}
 	}
 	if (@refs) {
-		$smsg->{references_sorted} = '<'.join('><', @refs).'>';
+		$smsg->{references} = '<'.join('> <', @refs).'>';
 
 		# first ref *should* be the thread root,
 		# but we can never trust clients to do the right thing
@@ -245,8 +259,8 @@ sub link_message_to_parents {
 }
 
 sub index_blob {
-	my ($self, $git, $mime) = @_;
-	$self->add_message($mime);
+	my ($self, $git, $mime, $bytes, $num) = @_;
+	$self->add_message($mime, $bytes, $num);
 }
 
 sub unindex_blob {
@@ -265,10 +279,22 @@ sub unindex_mm {
 	$self->{mm}->mid_delete(mid_clean($mime->header('Message-ID')));
 }
 
-sub index_both {
+sub index_mm2 {
+	my ($self, $git, $mime, $bytes) = @_;
+	my $num = $self->{mm}->num_for(mid_clean($mime->header('Message-ID')));
+	index_blob($self, $git, $mime, $bytes, $num);
+}
+
+sub unindex_mm2 {
 	my ($self, $git, $mime) = @_;
-	index_blob($self, $git, $mime);
-	index_mm($self, $git, $mime);
+	$self->{mm}->mid_delete(mid_clean($mime->header('Message-ID')));
+	unindex_blob($self, $git, $mime);
+}
+
+sub index_both {
+	my ($self, $git, $mime, $bytes) = @_;
+	my $num = index_mm($self, $git, $mime);
+	index_blob($self, $git, $mime, $bytes, $num);
 }
 
 sub unindex_both {
@@ -278,9 +304,9 @@ sub unindex_both {
 }
 
 sub do_cat_mail {
-	my ($git, $blob) = @_;
+	my ($git, $blob, $sizeref) = @_;
 	my $mime = eval {
-		my $str = $git->cat_file($blob);
+		my $str = $git->cat_file($blob, $sizeref);
 		Email::MIME->new($str);
 	};
 	$@ ? undef : $mime;
@@ -304,12 +330,13 @@ sub rlog {
 		    qw/--reverse --no-notes --no-color --raw -r --no-abbrev/,
 		    $range);
 	my $latest;
+	my $bytes;
 	my $pid = open(my $log, '-|', @cmd) or
 		die('open` '.join(' ', @cmd) . " pipe failed: $!\n");
 	while (my $line = <$log>) {
 		if ($line =~ /$addmsg/o) {
-			my $mime = do_cat_mail($git, $1) or next;
-			$add_cb->($self, $git, $mime);
+			my $mime = do_cat_mail($git, $1, \$bytes) or next;
+			$add_cb->($self, $git, $mime, $bytes);
 		} elsif ($line =~ /$delmsg/o) {
 			my $mime = do_cat_mail($git, $1) or next;
 			$del_cb->($self, $git, $mime);
@@ -354,11 +381,11 @@ sub _index_sync {
 			$mm->{dbh}->commit;
 			$mm->last_commit($lm) if defined $lm;
 
-			goto xapian_only;
+			$lx = $self->rlog($range, *index_mm2, *unindex_mm2);
+			$db->set_metadata('last_commit', $lx) if defined $lx;
 		}
 	} else {
 		# user didn't install DBD::SQLite and DBI
-xapian_only:
 		$lx = $self->rlog($range, *index_blob, *unindex_blob);
 		$db->set_metadata('last_commit', $lx) if defined $lx;
 	}
diff --git a/lib/PublicInbox/SearchMsg.pm b/lib/PublicInbox/SearchMsg.pm
index 8c55c92..8d49ee2 100644
--- a/lib/PublicInbox/SearchMsg.pm
+++ b/lib/PublicInbox/SearchMsg.pm
@@ -14,6 +14,7 @@ use Encode qw/find_encoding/;
 my $enc_utf8 = find_encoding('UTF-8');
 our $PFX2TERM_RE = undef;
 use constant EPOCH_822 => 'Thu, 01 Jan 1970 00:00:00 +0000';
+use POSIX qw(strftime);
 
 sub new {
 	my ($class, $mime) = @_;
@@ -28,48 +29,67 @@ sub wrap {
 	bless { doc => $doc, mime => undef, mid => $mid }, $class;
 }
 
+sub get_val ($$) {
+	my ($doc, $col) = @_;
+	Search::Xapian::sortable_unserialise($doc->get_value($col));
+}
+
 sub load_doc {
 	my ($class, $doc) = @_;
 	my $data = $doc->get_data;
-	my $ts = eval {
-		no strict 'subs';
-		$doc->get_value(PublicInbox::Search::TS);
-	};
-	$ts = Search::Xapian::sortable_unserialise($ts);
+	my $ts = get_val($doc, &PublicInbox::Search::TS);
 	$data = $enc_utf8->decode($data);
-	my ($subj, $from, $refs) = split(/\n/, $data);
+	my ($subj, $from, $refs, $to, $cc) = split(/\n/, $data);
 	bless {
 		doc => $doc,
 		subject => $subj,
 		ts => $ts,
-		from_name => $from,
-		references_sorted => $refs,
+		from => $from,
+		references => $refs,
+		to => $to,
+		cc => $cc,
 	}, $class;
 }
 
-sub subject {
+# :bytes and :lines metadata in RFC 3977
+sub bytes ($) { get_val($_[0]->{doc}, &PublicInbox::Search::BYTES) }
+sub lines ($) { get_val($_[0]->{doc}, &PublicInbox::Search::LINES) }
+sub num ($) { get_val($_[0]->{doc}, &PublicInbox::Search::NUM) }
+
+sub __hdr ($$) {
+	my ($self, $field) = @_;
+	my $val = $self->{$field};
+	return $val if defined $val;
+
+	my $mime = $self->{mime} or return;
+	$val = $mime->header($field);
+	$val = '' unless defined $val;
+	$val =~ tr/\t\r\n/ /;
+	$self->{$field} = $val;
+}
+
+sub subject ($) { __hdr($_[0], 'subject') }
+sub to ($) { __hdr($_[0], 'to') }
+sub cc ($) { __hdr($_[0], 'cc') }
+
+sub date ($) {
 	my ($self) = @_;
-	my $subj = $self->{subject};
-	return $subj if defined $subj;
-	$subj = $self->{mime}->header('Subject');
-	$subj = '' unless defined $subj;
-	$subj =~ tr/\n/ /;
-	$self->{subject} = $subj;
+	my $date = __hdr($self, 'date');
+	return $date if defined $date;
+	my $ts = $self->{ts};
+	return unless defined $ts;
+	$self->{date} = strftime('%a, %d %b %Y %T %z', gmtime($ts));
 }
 
-sub from {
+sub from ($) {
 	my ($self) = @_;
-	my $from = $self->mime->header('From') || '';
-	my @from;
-
-	if ($from) {
-		$from =~ tr/\n/ /;
-		@from = Email::Address->parse($from);
-		$self->{from} = $from[0];
-		$from = $from[0]->name;
+	my $from = __hdr($self, 'from');
+	if (defined $from && !defined $self->{from_name}) {
+		$from =~ tr/\t\r\n/ /;
+		my @from = Email::Address->parse($from);
+		$self->{from_name} = $from[0]->name;
 	}
-	$self->{from_name} = $from;
-	$self->{from};
+	$from;
 }
 
 sub from_name {
@@ -87,14 +107,13 @@ sub ts {
 
 sub to_doc_data {
 	my ($self) = @_;
-	PublicInbox::Search::subject_summary($self->subject) . "\n" .
-	$self->from_name . "\n".
-	$self->references_sorted;
+	join("\n", $self->subject, $self->from, $self->references,
+		$self->to, $self->cc);
 }
 
-sub references_sorted {
+sub references {
 	my ($self) = @_;
-	my $x = $self->{references_sorted};
+	my $x = $self->{references};
 	defined $x ? $x : '';
 }
 
@@ -140,8 +159,7 @@ sub mini_mime {
 		'Message-ID' => "<$self->{mid}>",
 		'X-PI-TS' => $self->ts,
 	);
-	if (my $refs = $self->{references_sorted}) {
-		$refs =~ s/></> </g;
+	if (my $refs = $self->{references}) {
 		push @h, References => $refs;
 	}
 	my $mime = Email::MIME->create(header_str => \@hs, header => \@h);
@@ -156,7 +174,7 @@ sub mini_mime {
 	$mime;
 }
 
-sub mid {
+sub mid ($;$) {
 	my ($self, $mid) = @_;
 
 	if (defined $mid) {
diff --git a/public-inbox-nntpd b/public-inbox-nntpd
index 82d6838..0035637 100644
--- a/public-inbox-nntpd
+++ b/public-inbox-nntpd
@@ -52,8 +52,8 @@ sub refresh_groups () {
 			$ng = $old_ng;
 		}
 
-		# Only valid if Msgmap works
-		if ($ng->mm(1)) {
+		# Only valid if msgmap and search works
+		if ($ng->usable) {
 			$new->{$g} = $ng;
 			push @list, $ng;
 		}
diff --git a/t/nntpd.t b/t/nntpd.t
index ea2c3df..e4c0244 100644
--- a/t/nntpd.t
+++ b/t/nntpd.t
@@ -151,7 +151,30 @@ EOF
 			'<nntp@example.com>',
 			'',
 			'202',
-			'1' ] }, "XOVER works");
+			'1' ] }, "XOVER range works");
+
+	is_deeply($n->xover('1'), {
+		'1' => ['hihi',
+			'Me <me@example.com>',
+			'Thu, 01 Jan 1970 06:06:06 +0000',
+			'<nntp@example.com>',
+			'',
+			'202',
+			'1' ] }, "XOVER by article works");
+
+	{
+		syswrite($s, "OVER $mid\r\n");
+		$buf = '';
+		do {
+			sysread($s, $buf, 4096, length($buf));
+		} until ($buf =~ /^[^2]../ || $buf =~ /\r\n\.\r\n\z/);
+		my @r = split("\r\n", $buf);
+		like($r[0], qr/^224 /, 'got 224 response for OVER');
+		is($r[1], "0\thihi\tMe <me\@example.com>\t" .
+			"Thu, 01 Jan 1970 06:06:06 +0000\t" .
+			"$mid\t\t202\t1", 'OVER by Message-ID works');
+		is($r[2], '.', 'correctly terminated response');
+	}
 
 	ok(kill('TERM', $pid), 'killed nntpd');
 	$pid = undef;
diff --git a/t/search.t b/t/search.t
index b1c7728..cd7048f 100644
--- a/t/search.t
+++ b/t/search.t
@@ -285,7 +285,7 @@ sub filter_mids {
 	ok($doc_id > 0, "doc_id defined with circular reference");
 	my $smsg = $rw->lookup_message('circle@a');
 	$smsg->ensure_metadata;
-	is($smsg->references_sorted, '', "no references created");
+	is($smsg->references, '', "no references created");
 }
 
 done_testing();
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 11/12] t/nntpd.t: simplify condition for response termination
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (9 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 10/12] nntp: implement OVER/XOVER summary in search document Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  2015-09-30 21:00 ` [PATCH 12/12] t/nntpd.t: additional tests for XHDR/HDR Eric Wong
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

Multiline responses must end with "\r\n.\r\n", so we won't
break out early in case the OS doesn't support MSG_MORE.
---
 t/nntpd.t | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/nntpd.t b/t/nntpd.t
index e4c0244..8a721e2 100644
--- a/t/nntpd.t
+++ b/t/nntpd.t
@@ -130,7 +130,7 @@ EOF
 		syswrite($s, "HDR $k $mid\r\n");
 		do {
 			sysread($s, $buf, 4096, length($buf));
-		} until ($buf =~ /^[^2]../ || $buf =~ /\r\n\.\r\n\z/);
+		} until ($buf =~ /\r\n\.\r\n\z/);
 		my @r = split("\r\n", $buf);
 		like($r[0], qr/\A224 /, '224 response for HDR');
 		is($r[1], "0 $v", 'got expected response for HDR');
@@ -167,7 +167,7 @@ EOF
 		$buf = '';
 		do {
 			sysread($s, $buf, 4096, length($buf));
-		} until ($buf =~ /^[^2]../ || $buf =~ /\r\n\.\r\n\z/);
+		} until ($buf =~ /\r\n\.\r\n\z/);
 		my @r = split("\r\n", $buf);
 		like($r[0], qr/^224 /, 'got 224 response for OVER');
 		is($r[1], "0\thihi\tMe <me\@example.com>\t" .
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 12/12] t/nntpd.t: additional tests for XHDR/HDR
  2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
                   ` (10 preceding siblings ...)
  2015-09-30 21:00 ` [PATCH 11/12] t/nntpd.t: simplify condition for response termination Eric Wong
@ 2015-09-30 21:00 ` Eric Wong
  11 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2015-09-30 21:00 UTC (permalink / raw)
  To: meta

More testing is good, especially since clients I use
don't implement all the commands.
---
 t/nntpd.t | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/t/nntpd.t b/t/nntpd.t
index 8a721e2..ecb876f 100644
--- a/t/nntpd.t
+++ b/t/nntpd.t
@@ -176,6 +176,25 @@ EOF
 		is($r[2], '.', 'correctly terminated response');
 	}
 
+	is_deeply($n->xhdr(qw(Cc 1-)), { 1 => 'test-nntpd@example.com' },
+		 'XHDR Cc 1- works');
+	is_deeply($n->xhdr(qw(References 1-)), { 1 => '' },
+		 'XHDR References 1- works (empty string)');
+	is_deeply($n->xhdr(qw(list-id 1-)), {},
+		 'XHDR on invalid header returns empty');
+
+	{
+		syswrite($s, "HDR List-id 1-\r\n");
+		$buf = '';
+		do {
+			sysread($s, $buf, 4096, length($buf));
+		} until ($buf =~ /\r\n\z/);
+		my @r = split("\r\n", $buf);
+		like($r[0], qr/^5\d\d /,
+			'got 5xx response for unoptimized HDR');
+		is(scalar @r, 1, 'only one response line');
+	}
+
 	ok(kill('TERM', $pid), 'killed nntpd');
 	$pid = undef;
 	waitpid(-1, 0);
-- 
EW


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-09-30 21:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-30 21:00 [PATCH 0/12] another round of NNTP updates Eric Wong
2015-09-30 21:00 ` [PATCH 01/12] search: remove get_subject_path Eric Wong
2015-09-30 21:00 ` [PATCH 02/12] nntp: HDR returns 225, not 224 Eric Wong
2015-09-30 21:00 ` [PATCH 03/12] nntp: reduce syscalls for LIST OVERVIEW.FMT Eric Wong
2015-09-30 21:00 ` [PATCH 04/12] remove unnecessary fields usage Eric Wong
2015-09-30 21:00 ` [PATCH 05/12] daemon: always autoflush stdout/stderr Eric Wong
2015-09-30 21:00 ` [PATCH 06/12] nntpd: avoid lazy require Eric Wong
2015-09-30 21:00 ` [PATCH 07/12] INSTALL: document Danga::Socket dependency for nntpd Eric Wong
2015-09-30 21:00 ` [PATCH 08/12] nntp: MODE READER denies posting Eric Wong
2015-09-30 21:00 ` [PATCH 09/12] nntp: implement LIST HEADERS Eric Wong
2015-09-30 21:00 ` [PATCH 10/12] nntp: implement OVER/XOVER summary in search document Eric Wong
2015-09-30 21:00 ` [PATCH 11/12] t/nntpd.t: simplify condition for response termination Eric Wong
2015-09-30 21:00 ` [PATCH 12/12] t/nntpd.t: additional tests for XHDR/HDR Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).