user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
From: Eric Wong <e@yhbt.net>
To: meta@public-inbox.org
Subject: [PATCH 3/3] over+msgmap: respect WAL journal_mode if set
Date: Tue, 25 Aug 2020 03:02:47 +0000	[thread overview]
Message-ID: <20200825030247.12307-4-e@yhbt.net> (raw)
In-Reply-To: <20200825030247.12307-1-e@yhbt.net>

WAL actually seems to have ideal locking characteristics given
concurrency problems I'm experiencing with --reindex running
in parallel with expensive read-only SQLite queries:
<https://public-inbox.org/meta/20200825001204.GA840@dcvr/>

Unfortunately, we cannot blindly use WAL while preserving
compatibility with existing setups nor our guarantees that
read-only daemons are indeed "read-only".

However, respect an user's the choice to set WAL on their
own if they're comfortable with giving -nntpd/-httpd/-imapd
processes write permission to the directory storing SQLite DBs.
---
 lib/PublicInbox/Msgmap.pm  |  3 ---
 lib/PublicInbox/Over.pm    | 23 ++++++++++++++++++++++-
 lib/PublicInbox/OverIdx.pm |  5 -----
 t/over.t                   | 10 ++++++++++
 4 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 5b4cebc1..d28e96c8 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -31,9 +31,6 @@ sub new_file {
 	my $self = bless { filename => $f }, $class;
 	my $dbh = $self->{dbh} = PublicInbox::Over::dbh_new($self, $rw);
 	if ($rw) {
-		# TRUNCATE reduces I/O compared to the default (DELETE)
-		$dbh->do('PRAGMA journal_mode = TRUNCATE');
-
 		$dbh->begin_work;
 		create_tables($dbh);
 		$self->created_at(time) unless $self->created_at;
diff --git a/lib/PublicInbox/Over.pm b/lib/PublicInbox/Over.pm
index a2f04117..3e74b7a6 100644
--- a/lib/PublicInbox/Over.pm
+++ b/lib/PublicInbox/Over.pm
@@ -42,7 +42,28 @@ sub dbh_new {
 		$st = pack('dd', $st[0], $st[1]);
 	} while ($st ne $self->{st} && $tries++ < 3);
 	warn "W: $f: .st_dev, .st_ino unstable\n" if $st ne $self->{st};
-	$dbh->do('PRAGMA synchronous = OFF') if ($rw // 0) > 1;
+
+	if ($rw) {
+		# TRUNCATE reduces I/O compared to the default (DELETE).
+		#
+		# Do not use WAL by default since we expect the case
+		# where any users may read via read-only daemons
+		# (-httpd/-imapd/-nntpd); but only a single user has
+		# write permissions for -watch/-mda.
+		#
+		# Read-only WAL support in SQLite 3.22.0 (2018-01-22)
+		# doesn't do what we need: it is only intended for
+		# immutable read-only media (e.g. CD-ROM) and not
+		# usable for our use case described above.
+		#
+		# If an admin is willing to give read-only daemons R/W
+		# permissions; they can enable WAL manually and we will
+		# respect that by not clobbering it.
+		my $jm = $dbh->selectrow_array('PRAGMA journal_mode');
+		$dbh->do('PRAGMA journal_mode = TRUNCATE') if $jm ne 'wal';
+
+		$dbh->do('PRAGMA synchronous = OFF') if $rw > 1;
+	}
 	$dbh;
 }
 
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index d42d6fe7..9f4a56fb 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -23,11 +23,6 @@ sub dbh_new {
 	my ($self) = @_;
 	my $dbh = $self->SUPER::dbh_new($self->{-no_fsync} ? 2 : 1);
 
-	# TRUNCATE reduces I/O compared to the default (DELETE)
-	# We do not use WAL since we're optimized for read-only ops,
-	# (and read-only requires SQLite 3.22.0 (2018-01-22)).
-	$dbh->do('PRAGMA journal_mode = TRUNCATE');
-
 	# 80000 pages (80MiB on SQLite <3.12.0, 320MiB on 3.12.0+)
 	# was found to be good in 2018 during the large LKML import
 	# at the time.  This ought to be configurable based on HW
diff --git a/t/over.t b/t/over.t
index 07672aa7..8bf64ecb 100644
--- a/t/over.t
+++ b/t/over.t
@@ -65,4 +65,14 @@ isnt($over->max, 0, 'max is non-zero');
 
 $over->rollback_lazy;
 
+# L<perldata/"Version Strings">
+my $v = eval 'v'.$over->{dbh}->{sqlite_version};
+SKIP: {
+	skip("no WAL in SQLite version $v < 3.7.0", 1) if $v lt v3.7.0;
+	$over->{dbh}->do('PRAGMA journal_mode = WAL');
+	$over = PublicInbox::OverIdx->new("$tmpdir/over.sqlite3");
+	is($over->connect->selectrow_array('PRAGMA journal_mode'), 'wal',
+		'WAL journal_mode not clobbered if manually set');
+}
+
 done_testing();

      parent reply	other threads:[~2020-08-25  3:02 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-25  3:02 [PATCH 0/3] SQLite-related things Eric Wong
2020-08-25  3:02 ` [PATCH 1/3] over: skip nodatacow on the journal Eric Wong
2020-08-25  3:02 ` [PATCH 2/3] msgmap: use "CREATE TABLE IF NOT EXISTS" Eric Wong
2020-08-25  3:02 ` Eric Wong [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200825030247.12307-4-e@yhbt.net \
    --to=e@yhbt.net \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).