user/dev discussion of public-inbox itself
 help / color / Atom feed
* news.public-inbox.org misbehaving?
@ 2020-06-09 22:13 Kyle Meyer
  2020-06-09 23:24 ` Eric Wong
  0 siblings, 1 reply; 4+ messages in thread
From: Kyle Meyer @ 2020-06-09 22:13 UTC (permalink / raw)
  To: meta

Checking inbox.comp.mail.public-inbox.meta today, I ran into some odd
behavior in my reader (Gnus).  While debugging that, I tried

  $ w3m -m nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta

The tail of that buffer was

  4132 (06/06) Dmitry Alexandro Re: [Patch] Update 24-hour times to use two digits for the hour
  4133 (06/07) Eric Wong        [PATCH] index: v2: parallel by default

That didn't match up with what I saw at https://public-inbox.org/meta/:

  IMAP server notes, maybe JMAP?
   2020-06-09 11:34 UTC  - mbox.gz / Atom
  
  [PATCH] index: v2: parallel by default
   2020-06-07 20:02 UTC  - mbox.gz / Atom

inbox.comp.version-control.git also seems off.  After invoking w3m, I
landed on an empty

  Newsgroup: inbox.comp.version-control.git 399364-399413
  
  [399314-399363]

Following "[399314-399363]" displayed articles.  The last two were

  399326 (06/08) Denton Liu      Re: [PATCH] Recommend "git gc --prune=now" instead of "git prune"
  399327 (06/08) Simon Ser       Re: [PATCH v2] grep: add configuration variables for --heading

Again, that doesn't match what I see at https://public-inbox.org/git/,
which has a good number of messages from today (6/9).  And the empty
landing page/counts make me think that public-inbox-nntpd knows about
the articles that aren't being displayed.

Any ideas what might be going on?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: news.public-inbox.org misbehaving?
  2020-06-09 22:13 news.public-inbox.org misbehaving? Kyle Meyer
@ 2020-06-09 23:24 ` Eric Wong
  2020-06-11  0:57   ` [PATCH] nntpd+imapd: detect replaced over.sqlite3 Eric Wong
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Wong @ 2020-06-09 23:24 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Any ideas what might be going on?

Thanks for pointing that out.  Worked around for now via SIGHUP to
reload -nntpd.

I used -compact (and via `-index -c --reindex'), and -nntpd
didn't pick up the movement of over.sqlite3 automatically.

Will have to think about how -compact needs to interact with
read-only daemons w.r.t. over.sqlite3, and whether moving
over.sqlite3 can be avoided.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] nntpd+imapd: detect replaced over.sqlite3
  2020-06-09 23:24 ` Eric Wong
@ 2020-06-11  0:57   ` Eric Wong
  2020-06-11  1:54     ` Kyle Meyer
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Wong @ 2020-06-11  0:57 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Will have to think about how -compact needs to interact with
> read-only daemons w.r.t. over.sqlite3, and whether moving
> over.sqlite3 can be avoided.

It probably can't be avoided for v1 or I would've done it.
And I've been thinking about making public-inbox-compact support
VACUUM and possibly other expensive optimizations, too.

Anyways, the following patch (which goes on top of the imapd
stuff I posted yesterday) should fix it.

-------8<-------
Subject: [PATCH] nntpd+imapd: detect replaced over.sqlite3

For v1 inboxes (and possibly v2 in the future, for VACUUM),
public-inbox-compact replaces over.sqlite3 with a new file.

This currently doesn't need an extra inotify watch descriptor
(or FD for kevent) at the moment, so it can coexist nicely for
systems w/o IO::KQueue or Linux::Inotify2.
---
 lib/PublicInbox/Inbox.pm | 14 +++++++++++---
 lib/PublicInbox/NNTP.pm  |  4 +++-
 lib/PublicInbox/NNTPD.pm | 12 ++++++++++--
 lib/PublicInbox/Over.pm  | 36 +++++++++++++++++++++++++++++-------
 t/nntpd.t                | 35 +++++++++++++++++++++++++++++++----
 5 files changed, 84 insertions(+), 17 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index b2b0b56fdd3..7d5e048363f 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -31,7 +31,7 @@ sub cleanup_task () {
 	for my $ibx (values %$CLEANUP) {
 		my $again;
 		if ($have_devel_peek) {
-			foreach my $f (qw(mm search over)) {
+			foreach my $f (qw(mm search)) {
 				# we bump refcnt by assigning tmp, here:
 				my $tmp = $ibx->{$f} or next;
 				next if Devel::Peek::SvREFCNT($tmp) > 2;
@@ -45,9 +45,9 @@ sub cleanup_task () {
 				$again = 1 if $git->cleanup;
 			}
 		}
+		check_inodes($ibx);
 		if ($have_devel_peek) {
-			$again ||= !!($ibx->{over} || $ibx->{mm} ||
-			              $ibx->{search});
+			$again ||= !!($ibx->{mm} || $ibx->{search});
 		}
 		$next->{"$ibx"} = $ibx if $again;
 	}
@@ -407,9 +407,17 @@ sub unsubscribe_unlock {
 	delete $self->{unlock_subs}->{$ident};
 }
 
+sub check_inodes ($) {
+	my ($self) = @_;
+	for (qw(over)) { # TODO: search, mm
+		$self->{$_}->check_inodes if $self->{$_};
+	}
+}
+
 # called by inotify
 sub on_unlock {
 	my ($self) = @_;
+	check_inodes($self);
 	my $subs = $self->{unlock_subs} or return;
 	for (values %$subs) {
 		eval { $_->on_inbox_unlock($self) };
diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index ac13c7df8ce..bffd773cf9e 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -334,7 +334,9 @@ sub cmd_newnews ($$$$;$$) {
 sub cmd_group ($$) {
 	my ($self, $group) = @_;
 	my $no_such = '411 no such news group';
-	my $ng = $self->{nntpd}->{groups}->{$group} or return $no_such;
+	my $nntpd = $self->{nntpd};
+	my $ng = $nntpd->{groups}->{$group} or return $no_such;
+	$nntpd->idler_start;
 
 	$self->{ng} = $ng;
 	my ($min, $max) = $ng->mm->minmax;
diff --git a/lib/PublicInbox/NNTPD.pm b/lib/PublicInbox/NNTPD.pm
index ed5cf7cc8c0..6b762d8923b 100644
--- a/lib/PublicInbox/NNTPD.pm
+++ b/lib/PublicInbox/NNTPD.pm
@@ -8,6 +8,7 @@ use strict;
 use warnings;
 use Sys::Hostname;
 use PublicInbox::Config;
+use PublicInbox::InboxIdle;
 
 sub new {
 	my ($class) = @_;
@@ -24,15 +25,17 @@ sub new {
 		err => \*STDERR,
 		out => \*STDOUT,
 		grouplist => [],
+		pi_config => $pi_config,
 		servername => $name,
 		greet => \"201 $name ready - post via email\r\n",
 		# accept_tls => { SSL_server => 1, ..., SSL_reuse_ctx => ... }
+		# idler => PublicInbox::InboxIdle
 	}, $class;
 }
 
 sub refresh_groups {
-	my ($self, $pi_config) = @_;
-	$pi_config //= PublicInbox::Config->new;
+	my ($self, $sig) = @_;
+	my $pi_config = $sig ? PublicInbox::Config->new : $self->{pi_config};
 	my $new = {};
 	my @list;
 	$pi_config->each_inbox(sub {
@@ -59,8 +62,13 @@ sub refresh_groups {
 	});
 	@list =	sort { $a->{newsgroup} cmp $b->{newsgroup} } @list;
 	$self->{grouplist} = \@list;
+	$self->{pi_config} = $pi_config;
 	# this will destroy old groups that got deleted
 	%{$self->{groups}} = %$new;
 }
 
+sub idler_start {
+	$_[0]->{idler} //= PublicInbox::InboxIdle->new($_[0]->{pi_config});
+}
+
 1;
diff --git a/lib/PublicInbox/Over.pm b/lib/PublicInbox/Over.pm
index e32104f0cf8..74c8fb86270 100644
--- a/lib/PublicInbox/Over.pm
+++ b/lib/PublicInbox/Over.pm
@@ -19,13 +19,23 @@ sub dbh_new {
 	if ($rw && !-f $f) { # SQLite defaults mode to 0644, we want 0666
 		open my $fh, '+>>', $f or die "failed to open $f: $!";
 	}
-	my $dbh = DBI->connect("dbi:SQLite:dbname=$f",'','', {
-		AutoCommit => 1,
-		RaiseError => 1,
-		PrintError => 0,
-		ReadOnly => !$rw,
-		sqlite_use_immediate_transaction => 1,
-	});
+	my (@st, $st, $dbh);
+	my $tries = 0;
+	do {
+		@st = stat($f) or die "failed to stat $f: $!";
+		$st = pack('dd', $st[0], $st[1]); # 0: dev, 1: inode
+		$dbh = DBI->connect("dbi:SQLite:dbname=$f",'','', {
+			AutoCommit => 1,
+			RaiseError => 1,
+			PrintError => 0,
+			ReadOnly => !$rw,
+			sqlite_use_immediate_transaction => 1,
+		});
+		$self->{st} = $st;
+		@st = stat($f) or die "failed to stat $f: $!";
+		$st = pack('dd', $st[0], $st[1]);
+	} while ($st ne $self->{st} && $tries++ < 3);
+	warn "W: $f: .st_dev, .st_ino unstable\n" if $st ne $self->{st};
 	$dbh->{sqlite_unicode} = 1;
 	$dbh;
 }
@@ -259,4 +269,16 @@ SELECT MAX(num) FROM over WHERE num > 0
 	($exists, $uidnext, $sth->fetchrow_array // 0);
 }
 
+sub check_inodes {
+	my ($self) = @_;
+	if (my @st = stat($self->{filename})) { # did st_dev, st_ino change?
+		my $st = pack('dd', $st[0], $st[1]);
+
+		# don't actually reopen, just let {dbh} be recreated later
+		delete($self->{dbh}) if ($st ne ($self->{st} // $st));
+	} else {
+		warn "W: stat $self->{filename}: $!\n";
+	}
+}
+
 1;
diff --git a/t/nntpd.t b/t/nntpd.t
index d2f31323115..d6042d18758 100644
--- a/t/nntpd.t
+++ b/t/nntpd.t
@@ -14,8 +14,11 @@ use Net::NNTP;
 use Sys::Hostname;
 
 # FIXME: make easier to test both versions
-my $version = $ENV{PI_TEST_VERSION} || 2;
+my $version = $ENV{PI_TEST_VERSION} || 1;
 require_git('2.6') if $version == 2;
+my $lsof = which('lsof');
+my $fast_idle = eval { require Linux::Inotify2; 1 } //
+		eval { require IO::KQueue; 1 };
 
 my ($tmpdir, $for_destroy) = tmpdir();
 my $home = "$tmpdir/pi-home";
@@ -302,13 +305,13 @@ Date: Fri, 02 Oct 1993 00:00:00 +0000
 		is($rdr, waitpid($rdr, 0), 'reader done');
 		is($? >> 8, 0, 'no errors');
 	}
+	my $noerr = { 2 => \(my $null) };
 	SKIP: {
 		if ($INC{'Search/Xapian.pm'} && ($ENV{TEST_RUN_MODE}//2)) {
 			skip 'Search/Xapian.pm pre-loaded (by t/run.perl?)', 1;
 		}
-		my $lsof = which('lsof') or skip 'lsof missing', 1;
-		my $rdr = { 2 => \(my $null) };
-		my @of = xqx([$lsof, '-p', $td->{pid}], undef, $rdr);
+		$lsof or skip 'lsof missing', 1;
+		my @of = xqx([$lsof, '-p', $td->{pid}], undef, $noerr);
 		skip('lsof broken', 1) if (!scalar(@of) || $?);
 		my @xap = grep m!Search/Xapian!, @of;
 		is_deeply(\@xap, [], 'Xapian not loaded in nntpd');
@@ -330,6 +333,30 @@ Date: Fri, 02 Oct 1993 00:00:00 +0000
 		is(scalar @r, 1, 'only one response line');
 	}
 
+	# -compact requires Xapian
+	SKIP: {
+		require_mods('Search::Xapian', 2);
+		which('xapian-compact') or skip 'xapian-compact missing', 2;
+		is(xsys(qw(git config), "--file=$home/.public-inbox/config",
+				"publicinbox.$group.indexlevel", 'medium'),
+			0, 'upgraded indexlevel');
+		my $ex = eml_load('t/data/0001.patch');
+		is($n->article($ex->header('Message-ID')), undef,
+			'article did not exist');
+		$im->add($ex);
+		$im->done;
+		ok(run_script([qw(-index --reindex -c), $ibx->{inboxdir}],
+				undef, $noerr), '-compacted');
+		select(undef, undef, undef, $fast_idle ? 0.1 : 2.1);
+		$art = $n->article($ex->header('Message-ID'));
+		ok($art, 'new article retrieved after compact');
+		$lsof or skip 'lsof missing', 1;
+		($^O =~ /\A(?:linux)\z/) or
+			skip "lsof /(deleted)/ check untested on $^O", 1;
+		my @of = xqx([$lsof, '-p', $td->{pid}], undef, $noerr);
+		is(scalar(grep(/\(deleted\)/, @of)), 0, 'no deleted files');
+	};
+
 	$n = $s = undef;
 	$td->join;
 	is($?, 0, 'no error in exited process');

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] nntpd+imapd: detect replaced over.sqlite3
  2020-06-11  0:57   ` [PATCH] nntpd+imapd: detect replaced over.sqlite3 Eric Wong
@ 2020-06-11  1:54     ` Kyle Meyer
  0 siblings, 0 replies; 4+ messages in thread
From: Kyle Meyer @ 2020-06-11  1:54 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Anyways, the following patch (which goes on top of the imapd
> stuff I posted yesterday) should fix it.

Nice, thank you.  (Looking forward to trying out the imapd stuff :))

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-09 22:13 news.public-inbox.org misbehaving? Kyle Meyer
2020-06-09 23:24 ` Eric Wong
2020-06-11  0:57   ` [PATCH] nntpd+imapd: detect replaced over.sqlite3 Eric Wong
2020-06-11  1:54     ` Kyle Meyer

user/dev discussion of public-inbox itself

Archives are clonable:
	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git