user/dev discussion of public-inbox itself
* [PATCH 3/5] overidx: document the SQLite PRAGMA we use
  2020-05-10 22:37  4% [PATCH 0/5] scattered dev/CLI-oriented changes Eric Wong
@ 2020-05-10 22:37  7% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-05-10 22:37 UTC (permalink / raw)
  To: meta

This ought to prevent cargo-culting the cache_size PRAGMA
into smaller SQLite DBs we might use.
---
 lib/PublicInbox/OverIdx.pm | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index acbf2c8de60..cb15baadf2b 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -21,8 +21,16 @@ use PublicInbox::Search;
 sub dbh_new {
 	my ($self) = @_;
 	my $dbh = $self->SUPER::dbh_new(1);
+
+	# TRUNCATE reduces I/O compared to the default (DELETE)
 	$dbh->do('PRAGMA journal_mode = TRUNCATE');
+
+	# 80000 pages (80MiB on SQLite <3.12.0, 320MiB on 3.12.0+)
+	# was found to be good in 2018 during the large LKML import
+	# at the time.  This ought to be configurable based on HW
+	# and inbox size; I suspect it's overkill for many inboxes.
 	$dbh->do('PRAGMA cache_size = 80000');
+
 	create_tables($dbh);
 	$dbh;
 }
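As a rough check of the figures in the comment above: the cache budget is the page count times the page size, so 80000 pages is exactly 78.125 MiB with 1024-byte pages and 312.5 MiB with 4096-byte pages (the comment rounds these to 80MiB and 320MiB). The sketch below is in Python rather than Perl purely to stay dependency-free; it is not part of the patch. It applies the same two PRAGMAs to a throwaway database and reports the resulting budget. The names `cache_mib` and `probe_pragmas` are hypothetical.

```python
import os
import sqlite3
import tempfile


def cache_mib(pages, page_size):
    """Cache budget in MiB for a positive `PRAGMA cache_size` value."""
    return pages * page_size / 2**20


def probe_pragmas(pages=80000):
    """Apply the two PRAGMAs to a scratch DB file.

    Returns (journal_mode, page_size, cache_pages) as reported by SQLite.
    A file-backed DB is used because in-memory DBs ignore journal_mode.
    """
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "scratch.sqlite3"))
        try:
            # Setting journal_mode returns the mode now in effect.
            mode = conn.execute("PRAGMA journal_mode = TRUNCATE").fetchone()[0]
            conn.execute(f"PRAGMA cache_size = {int(pages)}")
            page_size = conn.execute("PRAGMA page_size").fetchone()[0]
            cache_pages = conn.execute("PRAGMA cache_size").fetchone()[0]
        finally:
            conn.close()
    return mode, page_size, cache_pages


if __name__ == "__main__":
    mode, page_size, cache_pages = probe_pragmas()
    print(mode, page_size, f"{cache_mib(cache_pages, page_size):.3f} MiB")
```

The page size reported depends on how the local SQLite was built, so the printed budget will vary; only the multiplication is fixed.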


* [PATCH 0/5] scattered dev/CLI-oriented changes
@ 2020-05-10 22:37  4% Eric Wong
  2020-05-10 22:37  7% ` [PATCH 3/5] overidx: document the SQLite PRAGMA we use Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2020-05-10 22:37 UTC (permalink / raw)
  To: meta

I've been using the test in 1/5 while developing Eml for the
1.5.0 release, and it's probably a good starting point for
anybody who wants to run more stats or do more optimizations
there.

A couple of comments and naming changes to make life easier
for developers.

For non-server-oriented stuff, I guess we can start
using XDG directories to avoid cluttering the top level
of users' HOME directories.  This will make development
easier on platforms where `make' has limited `-include'
support and PERL_INLINE_DIRECTORY can't be set by a
developer's config.mak.
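One way the directory choice described above could work is sketched below. This is an illustration under stated assumptions, not the logic of patch 5/5: the precedence (an explicit PERL_INLINE_DIRECTORY first, then XDG_CACHE_HOME, then ~/.cache) and the function name `pick_inline_dir` are guesses; only the `public-inbox/inline-c` subdirectory and the "if writable" condition come from the patch subject. It is written in Python to keep it dependency-free.

```python
import os


def pick_inline_dir(env=os.environ):
    """Return a directory for compiled Inline::C output, or None.

    Assumed precedence (hypothetical, not taken from the patch):
      1. PERL_INLINE_DIRECTORY, if set
      2. $XDG_CACHE_HOME/public-inbox/inline-c
      3. ~/.cache/public-inbox/inline-c
    The cache directory is used only if it already exists and is writable.
    """
    explicit = env.get("PERL_INLINE_DIRECTORY")
    if explicit:
        return explicit

    home = env.get("HOME") or os.path.expanduser("~")
    cache_home = env.get("XDG_CACHE_HOME") or os.path.join(home, ".cache")
    candidate = os.path.join(cache_home, "public-inbox", "inline-c")

    if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
        return candidate
    return None


if __name__ == "__main__":
    print(pick_inline_dir())
```

Returning None when the directory is absent keeps the behaviour opt-in: nothing is created under HOME unless the user made the directory themselves.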

I'll probably integrate Eric Biederman's IMAPTracker work soon:
https://public-inbox.org/meta/874l0i9vhc.fsf_-_@x220.int.ebiederm.org/

Eric Wong (5):
  xt/eml_check_limits: check limits against an inbox
  rename "ContentId" to "ContentHash"
  overidx: document the SQLite PRAGMA we use
  msgmap: use TRUNCATE for journal_mode, for now
  spawn: use ~/.cache/public-inbox/inline-c if writable

 Documentation/public-inbox-v2-format.pod      | 12 +--
 MANIFEST                                      |  5 +-
 .../{ContentId.pm => ContentHash.pm}          |  8 +-
 lib/PublicInbox/Import.pm                     |  2 +-
 lib/PublicInbox/Msgmap.pm                     |  4 +
 lib/PublicInbox/OverIdx.pm                    |  8 ++
 lib/PublicInbox/Spawn.pm                      | 13 +++-
 lib/PublicInbox/V2Writable.pm                 | 48 ++++++------
 script/public-inbox-edit                      | 16 ++--
 t/{content_id.t => content_hash.t}            | 14 ++--
 t/v1reindex.t                                 |  2 +-
 t/v2reindex.t                                 |  2 +-
 t/v2writable.t                                |  4 +-
 xt/eml_check_limits.t                         | 76 +++++++++++++++++++
 14 files changed, 154 insertions(+), 60 deletions(-)
 rename lib/PublicInbox/{ContentId.pm => ContentHash.pm} (93%)
 rename t/{content_id.t => content_hash.t} (64%)
 create mode 100644 xt/eml_check_limits.t



Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
