* [PATCH 0/2] index: start speeding up some noop calls
@ 2020-12-24 10:09 7% Eric Wong
2020-12-24 10:09 6% ` [PATCH 2/2] index: support --fast-noop / -F switch Eric Wong
From: Eric Wong @ 2020-12-24 10:09 UTC (permalink / raw)
To: meta
Users who script "public-inbox-index --all" to run after each
grok-pull have to wait a long time when there are thousands of
inboxes, most of which don't get updated.
PATCH 1/2 is a no-brainer and improves the opt-in speedup
for PATCH 2/2.
2/2 I'm not 100% sure about. Maybe -F/--fast-noop can become a
default, maybe not. -L medium/full users will notice it the
most, but there are further opportunities for speedups there.
Eric Wong (2):
inboxwritable: delay umask_prepare calls
index: support --fast-noop / -F switch
lib/PublicInbox/ExtSearchIdx.pm | 2 --
lib/PublicInbox/InboxWritable.pm | 6 ++----
lib/PublicInbox/SearchIdx.pm | 1 -
lib/PublicInbox/V2Writable.pm | 17 +++++++++++------
lib/PublicInbox/Xapcmd.pm | 1 -
script/public-inbox-convert | 1 -
script/public-inbox-index | 2 +-
7 files changed, 14 insertions(+), 16 deletions(-)
* [PATCH 2/2] index: support --fast-noop / -F switch
2020-12-24 10:09 7% [PATCH 0/2] index: start speeding up some noop calls Eric Wong
@ 2020-12-24 10:09 6% ` Eric Wong
From: Eric Wong @ 2020-12-24 10:09 UTC (permalink / raw)
To: meta
Note: I'm not sure if it's worth documenting and supporting this
long-term.
We can avoid taking locks for no-op invocations of "index --all"
by relying on high-resolution ctime (struct timespec st_ctim)
comparisons of msgmap.sqlite3 against the packed-refs file and
refs/heads directory of the newest epoch. If msgmap.sqlite3 has
a newer ctime than both, we return early.
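
As a rough illustration of the comparison described above, here is a
minimal sketch in Python. It is not the Perl code from this patch.
The helper names ctime_ns and looks_like_noop are hypothetical, and
the sketch assumes the filesystem records sub-second ctime values.

```python
import os


def ctime_ns(path):
    """Return the inode change time in nanoseconds, or 0 if stat fails."""
    try:
        return os.stat(path).st_ctime_ns
    except OSError:
        return 0


def looks_like_noop(inboxdir, latest_git_dir):
    """Guess whether an index run can be skipped without taking a lock.

    Hypothetical helper, not part of public-inbox: it only mirrors the
    ctime comparison described in the commit message.
    """
    msgmap = ctime_ns(os.path.join(inboxdir, "msgmap.sqlite3"))
    if msgmap == 0:
        return False  # no msgmap.sqlite3 yet, so a real index run is needed
    heads = ctime_ns(os.path.join(latest_git_dir, "refs", "heads"))
    packed = ctime_ns(os.path.join(latest_git_dir, "packed-refs"))
    # Skip only when msgmap.sqlite3 changed strictly after both ref locations.
    return msgmap > heads and msgmap > packed
```

A missing refs/heads or packed-refs is treated as ctime 0, so it never
blocks the early return on its own. A missing msgmap.sqlite3 always
falls through to the normal, locked code path.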
This cuts public-inbox-index invocations with
"--all --no-update-extindex -L basic" down from 0.92s to 0.31s.
The change with "-L medium" or "-L full" and (default) non-zero
jobs is even more drastic, reducing a 12-13s no-op invocation
down to the same 0.31s.
---
lib/PublicInbox/V2Writable.pm | 14 +++++++++++---
script/public-inbox-index | 2 +-
2 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 531a72b2..2b849ddf 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -1351,11 +1351,19 @@ sub index_sync {
$opt //= {};
return xapian_only($self, $opt) if $opt->{xapian_only};
- my $pr = $opt->{-progress};
my $epoch_max;
- my $latest = $self->{ibx}->git_dir_latest(\$epoch_max);
- return unless defined $latest;
+ my $latest = $self->{ibx}->git_dir_latest(\$epoch_max) // return;
+ if ($opt->{'fast-noop'}) { # nanosecond (st_ctim) comparison
+ use Time::HiRes qw(stat);
+ if (my @mm = stat("$self->{ibx}->{inboxdir}/msgmap.sqlite3")) {
+ my $c = $mm[10]; # 10 = ctime (nsec NV)
+ my @hd = stat("$latest/refs/heads");
+ my @pr = stat("$latest/packed-refs");
+ return if $c > ($hd[10] // 0) && $c > ($pr[10] // 0);
+ }
+ }
+ my $pr = $opt->{-progress};
my $seq = $opt->{sequential_shard};
my $art_beg; # the NNTP article number we start xapian_only at
my $idxlevel = $self->{ibx}->{indexlevel};
diff --git a/script/public-inbox-index b/script/public-inbox-index
index f10bb5ad..91afac88 100755
--- a/script/public-inbox-index
+++ b/script/public-inbox-index
@@ -42,7 +42,7 @@ GetOptions($opt, qw(verbose|v+ reindex rethread compact|c+ jobs|j=i prune
batch_size|batch-size=s
sequential_shard|seq-shard|sequential-shard
no-update-extindex update-extindex|E=s@
- skip-docdata all help|h))
+ fast-noop|F skip-docdata all help|h))
or die $help;
if ($opt->{help}) { print $help; exit 0 };
die "--jobs must be >= 0\n" if defined $opt->{jobs} && $opt->{jobs} < 0;
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git