* [PATCH 05/10] v2writable: less expensive checkpoint for extindex
2020-11-07 10:56 6% [PATCH 00/10] extindex: another round of updates Eric Wong
@ 2020-11-07 10:56 7% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-11-07 10:56 UTC (permalink / raw)
To: meta
Since extindex holds no locks on parallel inbox writers,
we can simply use "barrier" IPC shard commands to checkpoint
and avoid respawning shard or git processes.
---
lib/PublicInbox/V2Writable.pm | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 0364857f..224675ab 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -620,13 +620,13 @@ sub checkpoint ($;$) {
 	# Now deal with Xapian
 	if ($wait) {
-		my $barrier = $self->barrier_init(scalar @$shards);
+		my $barrier = barrier_init($self, scalar @$shards);
 		# each shard needs to issue a barrier command
 		$_->shard_barrier for @$shards;
 		# wait for each Xapian shard
-		$self->barrier_wait($barrier);
+		barrier_wait($self, $barrier);
 	} else {
 		$_->shard_commit for @$shards;
 	}
@@ -860,11 +860,16 @@ sub atfork_child {
 sub reindex_checkpoint ($$) {
 	my ($self, $sync) = @_;
-	$self->git->cleanup; # *async_wait
+	$self->git->async_wait_all;
 	${$sync->{need_checkpoint}} = 0;
 	my $mm_tmp = $sync->{mm_tmp};
 	$mm_tmp->atfork_prepare if $mm_tmp;
-	$self->done; # release lock
+	die 'BUG: {im} during reindex' if $self->{im};
+	if ($self->{ibx_map}) {
+		checkpoint($self, 1); # no need to release lock on pure index
+	} else {
+		$self->done; # release lock
+	}
 	if (my $pr = $sync->{-opt}->{-progress}) {
 		$pr->(sprintf($sync->{-regen_fmt}, ${$sync->{nr}}));
^ permalink raw reply related [relevance 7%]
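The idea in the commit message above (checkpoint via a "barrier" command instead of respawning processes) can be sketched in a few lines of standalone Perl. This is only an illustration under stated assumptions, not the V2Writable or SearchIdxShard code: spawn_workers, checkpoint, the "doc N" commands and the ack line format are all hypothetical names made up here. The point is that long-lived workers can be asked to flush and acknowledge over pipes, and the same processes keep serving later checkpoints.

```perl
#!/usr/bin/perl
# Sketch only: a generic barrier over pipes, NOT the public-inbox code.
use strict;
use warnings;
use IO::Handle;
use POSIX ();

# Hypothetical helper: fork $n long-lived workers.  Each worker gets its
# own command pipe; all workers share one pipe for acknowledgements.
sub spawn_workers {
	my ($n) = @_;
	pipe(my $ack_r, my $ack_w) or die "pipe: $!";
	$ack_w->autoflush(1);
	my @workers;
	for my $id (0 .. $n - 1) {
		pipe(my $cmd_r, my $cmd_w) or die "pipe: $!";
		my $pid = fork;
		defined $pid or die "fork: $!";
		if ($pid == 0) { # child: loop on commands until EOF
			close $cmd_w;
			close $ack_r;
			# drop write ends inherited from earlier workers
			close $_->{cmd_w} for @workers;
			my $pending = 0;
			while (my $line = <$cmd_r>) {
				chomp $line;
				if ($line eq 'barrier') {
					# a real shard would commit here
					print $ack_w "barrier $id $pending\n";
					$pending = 0;
				} else {
					$pending++; # pretend to index something
				}
			}
			POSIX::_exit(0);
		}
		close $cmd_r;
		$cmd_w->autoflush(1);
		push @workers, { pid => $pid, cmd_w => $cmd_w };
	}
	close $ack_w; # the parent only reads acknowledgements
	return (\@workers, $ack_r);
}

# Send "barrier" to every worker, then block until each one has answered.
# Returns a hashref of worker id => number of items flushed.
sub checkpoint {
	my ($workers, $ack_r) = @_;
	print { $_->{cmd_w} } "barrier\n" for @$workers;
	my %seen;
	while (keys(%seen) < @$workers) {
		my $line = <$ack_r>;
		defined $line or die 'worker died before reaching barrier';
		my (undef, $id, $n) = split ' ', $line;
		$seen{$id} = $n;
	}
	return \%seen;
}

my ($workers, $ack_r) = spawn_workers(3);
my @pids = map { $_->{pid} } @$workers;

# distribute some fake work round-robin, then checkpoint
print { $workers->[$_ % 3]{cmd_w} } "doc $_\n" for 1 .. 7;
my $first = checkpoint($workers, $ack_r);

# the same worker processes are still alive for the next round
print { $workers->[0]{cmd_w} } "doc 8\n";
my $second = checkpoint($workers, $ack_r);

printf "worker %d flushed %d then %d\n", $_, $first->{$_}, $second->{$_}
	for sort { $a <=> $b } keys %$first;

close $_->{cmd_w} for @$workers; # EOF tells workers to exit
waitpid($_, 0) for @pids;
```

Each ack line is far shorter than PIPE_BUF, so writes from different workers to the shared pipe do not interleave. A real implementation would also need to handle worker death and partial reads.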
* [PATCH 00/10] extindex: another round of updates
@ 2020-11-07 10:56 6% Eric Wong
2020-11-07 10:56 7% ` [PATCH 05/10] v2writable: less expensive checkpoint for extindex Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2020-11-07 10:56 UTC (permalink / raw)
To: meta
A major user-visible change is renaming -eindex to -extindex,
because rhyming with "reindex" is probably confusing (and I'm
easily confused :x).
PATCH 10/10 finally starts making sense performance-wise, still
testing... I've long thought the default 1m batch-size is too
small for 64-bit machines, so maybe that'll change, too.
But it took me way too long to figure out why indexBatchSize
seemed to have no effect in my PI_CONFIG :<
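For reference, indexBatchSize is set in the publicinbox section of the config file that PI_CONFIG points to. A sketch, where the 8m value is an arbitrary example and not a recommendation:

```ini
[publicinbox]
	; bytes to index before flushing; the default is 1m
	indexBatchSize = 8m
```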
My Internet access has been terribly unreliable, lately, too; so
relying on mosh/ssh access to work on more powerful machines
ain't too pleasant.
Overall, extindex seems to be working somewhat OK for
incremental updates over the past few weeks, but could still
benefit from speedups to work better on HW I have locally.
Will have to retest SQLite cache_size and mmap_size pragmas, too.
Eric Wong (10):
extsearch: rename -eindex to -extindex
extsearchidx: avoid needless alternates rewrite in ALL.git
searchidxshard: reduce syscalls when writing ->eidx_key
searchidxshard: further improve {current_info} readability
v2writable: less expensive checkpoint for extindex
extsearchidx: quiet warning for unindexed `d' messages
extsearch: canonicalize topdir
v2writable: more accurate {current_info} warnings/progress
extindex: SIGUSR1 supports checkpoint
extindex: fix --batch-size support
MANIFEST | 2 +-
lib/PublicInbox/Config.pm | 2 +-
lib/PublicInbox/ExtSearch.pm | 2 +
lib/PublicInbox/ExtSearchIdx.pm | 37 +++++++++++++------
lib/PublicInbox/SearchIdxShard.pm | 9 ++---
lib/PublicInbox/V2Writable.pm | 37 ++++++++++++++-----
...lic-inbox-eindex => public-inbox-extindex} | 8 +++-
t/extsearch.t | 6 +--
8 files changed, 68 insertions(+), 35 deletions(-)
rename script/{public-inbox-eindex => public-inbox-extindex} (84%)
^ permalink raw reply [relevance 6%]
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git