user/dev discussion of public-inbox itself
 help / Atom feed
* [PATCH 0/9] misc cleanups and trimming
@ 2017-01-07  1:44 Eric Wong
  2017-01-07  1:44 ` [PATCH 1/9] qspawn: prepare to support runtime reloading of Limiter Eric Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

Weak reference dependencies are entirely gone.  This should
simplify the internal object structures of Perl a bit and
hopefully make the data structures easier to reason about.

While we're at it, there's several search-related cleanups
to eliminate some dead code.

9 changes:

      qspawn: prepare to support runtime reloading of Limiter
      config: always use namespaced "publicinboxlimiter"
      config: remove unused get() method
      inbox: describe the full key name
      inbox: eliminate weaken usage entirely
      config: allow per-inbox nntpserver
      remove incorrect comment about strftime + locales
      searchmsg: favor direct hash access over accessor methods
      search: remove subject_summary


 lib/PublicInbox/Config.pm        | 14 ++------
 lib/PublicInbox/Inbox.pm         | 73 ++++++++++++++++++----------------------
 lib/PublicInbox/Qspawn.pm        | 11 ++++--
 lib/PublicInbox/Search.pm        | 29 ++--------------
 lib/PublicInbox/SearchIdx.pm     |  4 +--
 lib/PublicInbox/SearchMsg.pm     | 36 +++-----------------
 lib/PublicInbox/WwwAtomStream.pm |  1 -
 t/config.t                       | 25 ++++++++++++--
 t/config_limiter.t               | 11 +++---
 t/search.t                       | 17 ----------
 10 files changed, 78 insertions(+), 143 deletions(-)


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 1/9] qspawn: prepare to support runtime reloading of Limiter
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  1:44 ` [PATCH 2/9] config: always use namespaced "publicinboxlimiter" Eric Wong
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

We may allow the {max} value of a limiter to be changed
in the future, so lets start accounting for it before we
spawn followup processes.
---
 lib/PublicInbox/Qspawn.pm | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 697c55a..4950da2 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -31,14 +31,19 @@ sub _do_spawn {
 sub finish ($) {
 	my ($self) = @_;
 	my $limiter = $self->{limiter};
+	my $running;
 	if (delete $self->{rpipe}) {
 		my $pid = delete $self->{pid};
 		$self->{err} = $pid == waitpid($pid, 0) ? $? :
 				"PID:$pid still running?";
-		$limiter->{running}--;
+		$running = --$limiter->{running};
 	}
-	if (my $next = shift @{$limiter->{run_queue}}) {
-		_do_spawn(@$next);
+
+	# limiter->{max} may change dynamically
+	if (($running || $limiter->{running}) < $limiter->{max}) {
+		if (my $next = shift @{$limiter->{run_queue}}) {
+			_do_spawn(@$next);
+		}
 	}
 	$self->{err};
 }
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 2/9] config: always use namespaced "publicinboxlimiter"
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
  2017-01-07  1:44 ` [PATCH 1/9] qspawn: prepare to support runtime reloading of Limiter Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  1:44 ` [PATCH 3/9] config: remove unused get() method Eric Wong
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

I'm not sure if we'll ever support sharing a config file
with other tools, but maybe we will, and "limiter" is
too generic.
---
 lib/PublicInbox/Config.pm | 6 +-----
 t/config_limiter.t        | 2 +-
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 6e31df7..3e3d79a 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -90,11 +90,7 @@ sub limiter {
 	my ($self, $name) = @_;
 	$self->{-limiters}->{$name} ||= do {
 		require PublicInbox::Qspawn;
-		my $max;
-		# XXX "limiter.<name>.max" was a historical mistake
-		foreach my $pfx (qw(publicinboxlimiter limiter)) {
-			$max ||= $self->{"$pfx.$name.max"};
-		}
+		my $max = $self->{"publicinboxlimiter.$name.max"};
 		PublicInbox::Qspawn::Limiter->new($max);
 	};
 }
diff --git a/t/config_limiter.t b/t/config_limiter.t
index 3c7ec55..486bfbe 100644
--- a/t/config_limiter.t
+++ b/t/config_limiter.t
@@ -25,7 +25,7 @@ my $cfgpfx = "publicinbox.test";
 
 {
 	my $config = PublicInbox::Config->new({
-		'limiter.named.max' => 3,
+		'publicinboxlimiter.named.max' => 3,
 		"$cfgpfx.address" => 'test@example.com',
 		"$cfgpfx.mainrepo" => '/path/to/non/existent',
 		"$cfgpfx.httpbackendmax" => 'named',
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 3/9] config: remove unused get() method
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
  2017-01-07  1:44 ` [PATCH 1/9] qspawn: prepare to support runtime reloading of Limiter Eric Wong
  2017-01-07  1:44 ` [PATCH 2/9] config: always use namespaced "publicinboxlimiter" Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  1:44 ` [PATCH 4/9] inbox: describe the full key name Eric Wong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

This seems like an unnecessary abstraction, or an abstraction
on the wrong level.
---
 lib/PublicInbox/Config.pm | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 3e3d79a..55019e9 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -95,12 +95,6 @@ sub limiter {
 	};
 }
 
-sub get {
-	my ($self, $inbox, $key) = @_;
-
-	$self->{"publicinbox.$inbox.$key"};
-}
-
 sub config_dir { $ENV{PI_DIR} || "$ENV{HOME}/.public-inbox" }
 
 sub default_file {
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 4/9] inbox: describe the full key name
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
                   ` (2 preceding siblings ...)
  2017-01-07  1:44 ` [PATCH 3/9] config: remove unused get() method Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  1:44 ` [PATCH 5/9] inbox: eliminate weaken usage entirely Eric Wong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

Hopefully make this easier for future generations to understand.
---
 lib/PublicInbox/Inbox.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 5503980..008a6cf 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -60,6 +60,7 @@ sub _set_limiter ($$$) {
 	my ($self, $git, $pfx) = @_;
 	my $lkey = "-${pfx}_limiter";
 	$git->{$lkey} = $self->{$lkey} ||= eval {
+		# full key is: publicinbox.$NAME.httpbackendmax
 		my $mkey = $pfx.'max';
 		my $val = $self->{$mkey} or return;
 		my $lim;
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 5/9] inbox: eliminate weaken usage entirely
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
                   ` (3 preceding siblings ...)
  2017-01-07  1:44 ` [PATCH 4/9] inbox: describe the full key name Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  2:12   ` [PATCH 10/9] inbox: properly register cleanup timer for git processes Eric Wong
  2017-01-11 10:21   ` [PATCH 11/9] inbox: reinstate periodic cleanup of Xapian and SQLite objects Eric Wong
  2017-01-07  1:44 ` [PATCH 6/9] config: allow per-inbox nntpserver Eric Wong
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

We can do a better job initializing the data structure
so we no longer need to rely on weak references to cleanup
when we ditch the config on reload.
---
 lib/PublicInbox/Inbox.pm | 69 ++++++++++++++++++++----------------------------
 t/config.t               | 19 +++++++++++--
 t/config_limiter.t       |  9 +++----
 3 files changed, 50 insertions(+), 47 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 008a6cf..f77944f 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -5,28 +5,27 @@
 package PublicInbox::Inbox;
 use strict;
 use warnings;
-use Scalar::Util qw(weaken isweak);
 use PublicInbox::Git;
 use PublicInbox::MID qw(mid2path);
 
-my $weakt;
+my $cleanup_timer;
 eval {
-	$weakt = 'disabled';
+	$cleanup_timer = 'disabled';
 	require PublicInbox::EvCleanup;
-	$weakt = undef; # OK if we get here
+	$cleanup_timer = undef; # OK if we get here
 };
 
-my $WEAKEN = {}; # string(inbox) -> inbox
-sub weaken_task () {
-	$weakt = undef;
-	_weaken_fields($_) for values %$WEAKEN;
-	$WEAKEN = {};
+my $CLEANUP = {}; # string(inbox) -> inbox
+sub cleanup_task () {
+	$cleanup_timer = undef;
+	delete $_->{git} for values %$CLEANUP;
+	$CLEANUP = {};
 }
 
-sub _weaken_later ($) {
+sub _cleanup_later ($) {
 	my ($self) = @_;
-	$weakt ||= PublicInbox::EvCleanup::later(*weaken_task);
-	$WEAKEN->{"$self"} = $self;
+	$cleanup_timer ||= PublicInbox::EvCleanup::later(*cleanup_task);
+	$CLEANUP->{"$self"} = $self;
 }
 
 sub _set_uint ($$$) {
@@ -39,27 +38,10 @@ sub _set_uint ($$$) {
 	$opts->{$field} = $val || $default;
 }
 
-sub new {
-	my ($class, $opts) = @_;
-	my $v = $opts->{address} ||= 'public-inbox@example.com';
-	my $p = $opts->{-primary_address} = ref($v) eq 'ARRAY' ? $v->[0] : $v;
-	$opts->{domain} = ($p =~ /\@(\S+)\z/) ? $1 : 'localhost';
-	_set_uint($opts, 'feedmax', 25);
-	weaken($opts->{-pi_config});
-	bless $opts, $class;
-}
-
-sub _weaken_fields {
-	my ($self) = @_;
-	foreach my $f (qw(git mm search)) {
-		isweak($self->{$f}) or weaken($self->{$f});
-	}
-}
-
 sub _set_limiter ($$$) {
-	my ($self, $git, $pfx) = @_;
+	my ($self, $pi_config, $pfx) = @_;
 	my $lkey = "-${pfx}_limiter";
-	$git->{$lkey} = $self->{$lkey} ||= eval {
+	$self->{$lkey} ||= eval {
 		# full key is: publicinbox.$NAME.httpbackendmax
 		my $mkey = $pfx.'max';
 		my $val = $self->{$mkey} or return;
@@ -68,7 +50,7 @@ sub _set_limiter ($$$) {
 			require PublicInbox::Qspawn;
 			$lim = PublicInbox::Qspawn::Limiter->new($val);
 		} elsif ($val =~ /\A[a-z][a-z0-9]*\z/) {
-			$lim = $self->{-pi_config}->limiter($val);
+			$lim = $pi_config->limiter($val);
 			warn "$mkey limiter=$val not found\n" if !$lim;
 		} else {
 			warn "$mkey limiter=$val not understood\n";
@@ -77,28 +59,35 @@ sub _set_limiter ($$$) {
 	}
 }
 
+sub new {
+	my ($class, $opts) = @_;
+	my $v = $opts->{address} ||= 'public-inbox@example.com';
+	my $p = $opts->{-primary_address} = ref($v) eq 'ARRAY' ? $v->[0] : $v;
+	$opts->{domain} = ($p =~ /\@(\S+)\z/) ? $1 : 'localhost';
+	my $pi_config = delete $opts->{-pi_config};
+	_set_limiter($opts, $pi_config, 'httpbackend');
+	_set_uint($opts, 'feedmax', 25);
+	$opts->{nntpserver} ||= $pi_config->{'publicinbox.nntpserver'};
+	bless $opts, $class;
+}
+
 sub git {
 	my ($self) = @_;
 	$self->{git} ||= eval {
-		_weaken_later($self);
 		my $g = PublicInbox::Git->new($self->{mainrepo});
-		_set_limiter($self, $g, 'httpbackend');
+		$g->{-httpbackend_limiter} = $self->{-httpbackend_limiter};
 		$g;
 	};
 }
 
 sub mm {
 	my ($self) = @_;
-	$self->{mm} ||= eval {
-		_weaken_later($self);
-		PublicInbox::Msgmap->new($self->{mainrepo});
-	};
+	$self->{mm} ||= eval { PublicInbox::Msgmap->new($self->{mainrepo}) };
 }
 
 sub search {
 	my ($self) = @_;
 	$self->{search} ||= eval {
-		_weaken_later($self);
 		PublicInbox::Search->new($self->{mainrepo}, $self->{altid});
 	};
 }
@@ -164,7 +153,7 @@ sub nntp_url {
 	$self->{-nntp_url} ||= do {
 		# no checking for nntp_usable here, we can point entirely
 		# to non-local servers or users run by a different user
-		my $ns = $self->{-pi_config}->{'publicinbox.nntpserver'};
+		my $ns = $self->{nntpserver};
 		my $group = $self->{newsgroup};
 		my @urls;
 		if ($ns && $group) {
diff --git a/t/config.t b/t/config.t
index 7271351..040e9fb 100644
--- a/t/config.t
+++ b/t/config.t
@@ -31,7 +31,8 @@ my $tmpdir = tempdir('pi-config-XXXXXX', TMPDIR => 1, CLEANUP => 1);
 		-primary_address => 'meta@public-inbox.org',
 		'name' => 'meta',
 		feedmax => 25,
-		-pi_config => $cfg,
+		-httpbackend_limiter => undef,
+		nntpserver => undef,
 	}, "lookup matches expected output");
 
 	is($cfg->lookup('blah@example.com'), undef,
@@ -48,7 +49,8 @@ my $tmpdir = tempdir('pi-config-XXXXXX', TMPDIR => 1, CLEANUP => 1);
 		'name' => 'test',
 		feedmax => 25,
 		'url' => 'http://example.com/test',
-		-pi_config => $cfg,
+		-httpbackend_limiter => undef,
+		nntpserver => undef,
 	}, "lookup matches expected output for test");
 }
 
@@ -65,4 +67,17 @@ my $tmpdir = tempdir('pi-config-XXXXXX', TMPDIR => 1, CLEANUP => 1);
 	is_deeply($ibx->{altid}, [ @altid ]);
 }
 
+{
+	my $pfx = "publicinbox.test";
+	my %h = (
+		"$pfx.address" => 'test@example.com',
+		"$pfx.mainrepo" => '/path/to/non/existent',
+		"publicinbox.nntpserver" => 'news.example.com',
+	);
+	my %tmp = %h;
+	my $cfg = PublicInbox::Config->new(\%tmp);
+	my $ibx = $cfg->lookup_name('test');
+	is($ibx->{nntpserver}, 'news.example.com', 'global NNTP server');
+}
+
 done_testing();
diff --git a/t/config_limiter.t b/t/config_limiter.t
index 486bfbe..f0b6528 100644
--- a/t/config_limiter.t
+++ b/t/config_limiter.t
@@ -37,12 +37,11 @@ my $cfgpfx = "publicinbox.test";
 	my $lim = $git->{-httpbackend_limiter};
 	ok($lim, 'Limiter exists');
 	is($lim->{max}, 3, 'limiter has expected slots');
-	$git = undef;
 	$ibx->{git} = undef;
-	PublicInbox::Inbox::weaken_task;
-	$git = $ibx->git;
-	isnt($old, "$git", 'got new Git object');
-	is("$git->{-httpbackend_limiter}", "$lim", 'same limiter');
+	PublicInbox::Inbox::cleanup_task;
+	my $new = $ibx->git;
+	isnt($old, "$new", 'got new Git object');
+	is("$new->{-httpbackend_limiter}", "$lim", 'same limiter');
 	is($lim->{max}, 3, 'limiter has expected slots');
 }
 
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 6/9] config: allow per-inbox nntpserver
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
                   ` (4 preceding siblings ...)
  2017-01-07  1:44 ` [PATCH 5/9] inbox: eliminate weaken usage entirely Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  1:44 ` [PATCH 7/9] remove incorrect comment about strftime + locales Eric Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

This allows certain inboxes to override the global nntpserver
(perhaps under a different domain).
---
 lib/PublicInbox/Config.pm | 2 +-
 t/config.t                | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 55019e9..28b5bdb 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -136,7 +136,7 @@ sub _fill {
 
 	foreach my $k (qw(mainrepo address filter url newsgroup
 			infourl watch watchheader httpbackendmax
-			feedmax)) {
+			feedmax nntpserver)) {
 		my $v = $self->{"$pfx.$k"};
 		$rv->{$k} = $v if defined $v;
 	}
diff --git a/t/config.t b/t/config.t
index 040e9fb..3ba6111 100644
--- a/t/config.t
+++ b/t/config.t
@@ -78,6 +78,12 @@ my $tmpdir = tempdir('pi-config-XXXXXX', TMPDIR => 1, CLEANUP => 1);
 	my $cfg = PublicInbox::Config->new(\%tmp);
 	my $ibx = $cfg->lookup_name('test');
 	is($ibx->{nntpserver}, 'news.example.com', 'global NNTP server');
+
+	delete $h{'publicinbox.nntpserver'};
+	$h{"$pfx.nntpserver"} = 'news.alt.example.com';
+	$cfg = PublicInbox::Config->new(\%h);
+	$ibx = $cfg->lookup_name('test');
+	is($ibx->{nntpserver}, 'news.alt.example.com','per-inbox NNTP server');
 }
 
 done_testing();
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 7/9] remove incorrect comment about strftime + locales
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
                   ` (5 preceding siblings ...)
  2017-01-07  1:44 ` [PATCH 6/9] config: allow per-inbox nntpserver Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  1:44 ` [PATCH 8/9] searchmsg: favor direct hash access over accessor methods Eric Wong
  2017-01-07  1:44 ` [PATCH 9/9] search: remove subject_summary Eric Wong
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

We only need strftime to be locale-independent when generating
dates for email and HTTP headers.  Purely numeric dates can
use strftime for ease-of-readability.
---
 lib/PublicInbox/SearchMsg.pm     | 2 +-
 lib/PublicInbox/WwwAtomStream.pm | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/SearchMsg.pm b/lib/PublicInbox/SearchMsg.pm
index 96406c6..4522eb6 100644
--- a/lib/PublicInbox/SearchMsg.pm
+++ b/lib/PublicInbox/SearchMsg.pm
@@ -69,7 +69,7 @@ sub subject ($) { __hdr($_[0], 'subject') }
 sub to ($) { __hdr($_[0], 'to') }
 sub cc ($) { __hdr($_[0], 'cc') }
 
-# no strftime, that is locale-dependent
+# no strftime, that is locale-dependent and not for RFC822
 my @DoW = qw(Sun Mon Tue Wed Thu Fri Sat);
 my @MoY = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
 
diff --git a/lib/PublicInbox/WwwAtomStream.pm b/lib/PublicInbox/WwwAtomStream.pm
index a6817b3..5a10034 100644
--- a/lib/PublicInbox/WwwAtomStream.pm
+++ b/lib/PublicInbox/WwwAtomStream.pm
@@ -6,7 +6,6 @@ package PublicInbox::WwwAtomStream;
 use strict;
 use warnings;
 
-# FIXME: locale-independence:
 use POSIX qw(strftime);
 use Date::Parse qw(strptime);
 use Digest::SHA qw(sha1_hex);
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 8/9] searchmsg: favor direct hash access over accessor methods
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
                   ` (6 preceding siblings ...)
  2017-01-07  1:44 ` [PATCH 7/9] remove incorrect comment about strftime + locales Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  2017-01-07  1:44 ` [PATCH 9/9] search: remove subject_summary Eric Wong
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

This is faster, smaller, and more straighforward to me with
fewer layers of indirection.
---
 lib/PublicInbox/Inbox.pm     |  3 ++-
 lib/PublicInbox/Search.pm    |  2 +-
 lib/PublicInbox/SearchIdx.pm |  4 ++--
 lib/PublicInbox/SearchMsg.pm | 33 ++-------------------------------
 4 files changed, 7 insertions(+), 35 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index f77944f..aa4e141 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -216,7 +216,8 @@ sub msg_by_smsg ($$;$) {
 
 	# backwards compat to fallback to msg_by_mid
 	# TODO: remove if we bump SCHEMA_VERSION in Search.pm:
-	defined(my $blob = $smsg->blob) or return msg_by_mid($self, $smsg->mid);
+	defined(my $blob = $smsg->{blob}) or
+			return msg_by_mid($self, $smsg->mid);
 
 	my $str = git($self)->cat_file($blob, $ref);
 	$$str =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s if $str;
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index b59430d..86354b5 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -282,7 +282,7 @@ sub lookup_message {
 		# raises on error:
 		my $doc = $self->{xdb}->get_document($doc_id);
 		$smsg = PublicInbox::SearchMsg->wrap($doc, $mid);
-		$smsg->doc_id($doc_id);
+		$smsg->{doc_id} = $doc_id;
 	}
 	$smsg;
 }
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 832d1cb..87ee0d4 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -155,7 +155,7 @@ sub add_message {
 		if ($smsg) {
 			# convert a ghost to a regular message
 			# it will also clobber any existing regular message
-			$doc_id = $smsg->doc_id;
+			$doc_id = $smsg->{doc_id};
 			$old_tid = $smsg->thread_id;
 		}
 		$smsg = PublicInbox::SearchMsg->new($mime);
@@ -289,7 +289,7 @@ sub link_message {
 	my ($self, $smsg, $old_tid) = @_;
 	my $doc = $smsg->{doc};
 	my $mid = $smsg->mid;
-	my $mime = $smsg->mime;
+	my $mime = $smsg->{mime};
 	my $hdr = $mime->header_obj;
 	my $refs = $hdr->header_raw('References');
 	my @refs = $refs ? ($refs =~ /<([^>]+)>/g) : ();
diff --git a/lib/PublicInbox/SearchMsg.pm b/lib/PublicInbox/SearchMsg.pm
index 4522eb6..5bb0077 100644
--- a/lib/PublicInbox/SearchMsg.pm
+++ b/lib/PublicInbox/SearchMsg.pm
@@ -103,7 +103,7 @@ sub from_name {
 
 sub ts {
 	my ($self) = @_;
-	$self->{ts} ||= eval { str2time($self->mime->header('Date')) } || 0;
+	$self->{ts} ||= eval { str2time($self->{mime}->header('Date')) } || 0;
 }
 
 sub to_doc_data {
@@ -146,36 +146,7 @@ sub mid ($;$) {
 	}
 }
 
-sub _extract_mid { mid_clean(mid_mime($_[0]->mime)) }
-
-sub blob {
-	my ($self, $x40) = @_;
-	if (defined $x40) {
-		$self->{blob} = $x40;
-	} else {
-		$self->{blob};
-	}
-}
-
-sub mime {
-	my ($self, $mime) = @_;
-	if (defined $mime) {
-		$self->{mime} = $mime;
-	} else {
-		# TODO load from git
-		$self->{mime};
-	}
-}
-
-sub doc_id {
-	my ($self, $doc_id) = @_;
-	if (defined $doc_id) {
-		$self->{doc_id} = $doc_id;
-	} else {
-		# TODO load from xapian
-		$self->{doc_id};
-	}
-}
+sub _extract_mid { mid_clean(mid_mime($_[0]->{mime})) }
 
 sub thread_id {
 	my ($self) = @_;
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 9/9] search: remove subject_summary
  2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
                   ` (7 preceding siblings ...)
  2017-01-07  1:44 ` [PATCH 8/9] searchmsg: favor direct hash access over accessor methods Eric Wong
@ 2017-01-07  1:44 ` Eric Wong
  8 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  1:44 UTC (permalink / raw)
  To: meta

Apparently it never actually got used, and the world seems
fine without it, so we can drop it.

While we're at it, consider removing our subject_path
usage from existence, too.  We are not using fancy subject-line
based URLs, here.
---
 lib/PublicInbox/Search.pm    | 27 +--------------------------
 lib/PublicInbox/SearchMsg.pm |  1 +
 t/search.t                   | 17 -----------------
 3 files changed, 2 insertions(+), 43 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 86354b5..a1bae41 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -326,6 +326,7 @@ sub find_doc_ids_for_term {
 }
 
 # normalize subjects so they are suitable as pathnames for URLs
+# XXX: consider for removal
 sub subject_path {
 	my $subj = pop;
 	$subj = subject_normalized($subj);
@@ -343,32 +344,6 @@ sub subject_normalized {
 	$subj;
 }
 
-# for doc data
-sub subject_summary {
-	my $subj = pop;
-	my $max = 68;
-	if (length($subj) > $max) {
-		my @subj = split(/\s+/, $subj);
-		$subj = '';
-		my $l;
-
-		while ($l = shift @subj) {
-			my $new = $subj . $l . ' ';
-			last if length($new) >= $max;
-			$subj = $new;
-		}
-		if ($subj ne '') {
-			my $r = scalar @subj ? ' ...' : '';
-			$subj =~ s/ \z/$r/s;
-		} else {
-			# subject has one REALLY long word, and NOT spam? wtf
-			@subj = ($l =~ /\A(.{1,72})/);
-			$subj = $subj[0] . ' ...';
-		}
-	}
-	$subj;
-}
-
 sub enquire {
 	my ($self) = @_;
 	$self->{enquire} ||= Search::Xapian::Enquire->new($self->{xdb});
diff --git a/lib/PublicInbox/SearchMsg.pm b/lib/PublicInbox/SearchMsg.pm
index 5bb0077..b8eee66 100644
--- a/lib/PublicInbox/SearchMsg.pm
+++ b/lib/PublicInbox/SearchMsg.pm
@@ -155,6 +155,7 @@ sub thread_id {
 	$self->{thread} = _get_term_val($self, 'G', qr/\AG/); # *G*roup
 }
 
+# XXX: consider removing this, we can phrase match subject
 sub path {
 	my ($self) = @_;
 	my $path = $self->{path};
diff --git a/t/search.t b/t/search.t
index c16811d..c9c4e34 100644
--- a/t/search.t
+++ b/t/search.t
@@ -15,23 +15,6 @@ is(0, system(qw(git init -q --bare), $git_dir), "git init (main)");
 eval { PublicInbox::Search->new($git_dir) };
 ok($@, "exception raised on non-existent DB");
 
-{
-	my $orig = "FOO " x 30;
-	my $summ = PublicInbox::Search::subject_summary($orig);
-
-	$summ = length($summ);
-	$orig = length($orig);
-	ok($summ < $orig && $summ > 0, "summary shortened ($orig => $summ)");
-
-	$orig = "FOO" x 30;
-	$summ = PublicInbox::Search::subject_summary($orig);
-
-	$summ = length($summ);
-	$orig = length($orig);
-	ok($summ < $orig && $summ > 0,
-	   "summary shortened but not empty: $summ");
-}
-
 my $rw = PublicInbox::SearchIdx->new($git_dir, 1);
 $rw->_xdb_acquire;
 $rw->_xdb_release;
-- 
EW


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 10/9] inbox: properly register cleanup timer for git processes
  2017-01-07  1:44 ` [PATCH 5/9] inbox: eliminate weaken usage entirely Eric Wong
@ 2017-01-07  2:12   ` Eric Wong
  2017-01-11 10:21   ` [PATCH 11/9] inbox: reinstate periodic cleanup of Xapian and SQLite objects Eric Wong
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-07  2:12 UTC (permalink / raw)
  To: meta

We still need to cleanup git processes occasionally, since
"git cat-file --batch" does not release old packs (and
git processes are fairly expensive).

For SQLite and Xapian file handles, they should be capable
of managing themselves without too much trouble, so lets
try keeping them for the lifetime of a process.
---
 lib/PublicInbox/Inbox.pm | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index aa4e141..51ada0b 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -22,12 +22,6 @@ sub cleanup_task () {
 	$CLEANUP = {};
 }
 
-sub _cleanup_later ($) {
-	my ($self) = @_;
-	$cleanup_timer ||= PublicInbox::EvCleanup::later(*cleanup_task);
-	$CLEANUP->{"$self"} = $self;
-}
-
 sub _set_uint ($$$) {
 	my ($opts, $field, $default) = @_;
 	my $val = $opts->{$field};
@@ -76,6 +70,8 @@ sub git {
 	$self->{git} ||= eval {
 		my $g = PublicInbox::Git->new($self->{mainrepo});
 		$g->{-httpbackend_limiter} = $self->{-httpbackend_limiter};
+		$cleanup_timer ||= PublicInbox::EvCleanup::later(*cleanup_task);
+		$CLEANUP->{"$self"} = $self;
 		$g;
 	};
 }
-- 
EW

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH 11/9] inbox: reinstate periodic cleanup of Xapian and SQLite objects
  2017-01-07  1:44 ` [PATCH 5/9] inbox: eliminate weaken usage entirely Eric Wong
  2017-01-07  2:12   ` [PATCH 10/9] inbox: properly register cleanup timer for git processes Eric Wong
@ 2017-01-11 10:21   ` Eric Wong
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Wong @ 2017-01-11 10:21 UTC (permalink / raw)
  To: meta

We may need to do this even more aggressively, since the
Xapian database does not always give the latest results.
This time, we'll do it without relying on weak references,
and instead check refcounts.
---
 lib/PublicInbox/Inbox.pm | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 51ada0b..a0d69f1 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -7,6 +7,7 @@ use strict;
 use warnings;
 use PublicInbox::Git;
 use PublicInbox::MID qw(mid2path);
+use Devel::Peek qw(SvREFCNT);
 
 my $cleanup_timer;
 eval {
@@ -18,10 +19,20 @@ eval {
 my $CLEANUP = {}; # string(inbox) -> inbox
 sub cleanup_task () {
 	$cleanup_timer = undef;
-	delete $_->{git} for values %$CLEANUP;
+	for my $ibx (values %$CLEANUP) {
+		foreach my $f (qw(git mm search)) {
+			delete $ibx->{$f} if SvREFCNT($ibx->{$f}) == 1;
+		}
+	}
 	$CLEANUP = {};
 }
 
+sub _cleanup_later ($) {
+	my ($self) = @_;
+	$cleanup_timer ||= PublicInbox::EvCleanup::later(*cleanup_task);
+	$CLEANUP->{"$self"} = $self;
+}
+
 sub _set_uint ($$$) {
 	my ($opts, $field, $default) = @_;
 	my $val = $opts->{$field};
@@ -70,20 +81,23 @@ sub git {
 	$self->{git} ||= eval {
 		my $g = PublicInbox::Git->new($self->{mainrepo});
 		$g->{-httpbackend_limiter} = $self->{-httpbackend_limiter};
-		$cleanup_timer ||= PublicInbox::EvCleanup::later(*cleanup_task);
-		$CLEANUP->{"$self"} = $self;
+		_cleanup_later($self);
 		$g;
 	};
 }
 
 sub mm {
 	my ($self) = @_;
-	$self->{mm} ||= eval { PublicInbox::Msgmap->new($self->{mainrepo}) };
+	$self->{mm} ||= eval {
+		_cleanup_later($self);
+		PublicInbox::Msgmap->new($self->{mainrepo});
+	};
 }
 
 sub search {
 	my ($self) = @_;
 	$self->{search} ||= eval {
+		_cleanup_later($self);
 		PublicInbox::Search->new($self->{mainrepo}, $self->{altid});
 	};
 }
-- 
EW

^ permalink raw reply	[flat|threaded] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-07  1:44 [PATCH 0/9] misc cleanups and trimming Eric Wong
2017-01-07  1:44 ` [PATCH 1/9] qspawn: prepare to support runtime reloading of Limiter Eric Wong
2017-01-07  1:44 ` [PATCH 2/9] config: always use namespaced "publicinboxlimiter" Eric Wong
2017-01-07  1:44 ` [PATCH 3/9] config: remove unused get() method Eric Wong
2017-01-07  1:44 ` [PATCH 4/9] inbox: describe the full key name Eric Wong
2017-01-07  1:44 ` [PATCH 5/9] inbox: eliminate weaken usage entirely Eric Wong
2017-01-07  2:12   ` [PATCH 10/9] inbox: properly register cleanup timer for git processes Eric Wong
2017-01-11 10:21   ` [PATCH 11/9] inbox: reinstate periodic cleanup of Xapian and SQLite objects Eric Wong
2017-01-07  1:44 ` [PATCH 6/9] config: allow per-inbox nntpserver Eric Wong
2017-01-07  1:44 ` [PATCH 7/9] remove incorrect comment about strftime + locales Eric Wong
2017-01-07  1:44 ` [PATCH 8/9] searchmsg: favor direct hash access over accessor methods Eric Wong
2017-01-07  1:44 ` [PATCH 9/9] search: remove subject_summary Eric Wong

user/dev discussion of public-inbox itself

Archives are clonable:
	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.org/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox