user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 2/2] index: support --fast-noop / -F switch
Date: Thu, 24 Dec 2020 10:09:19 +0000
Message-ID: <20201224100919.30927-3-e@80x24.org>
In-Reply-To: <20201224100919.30927-1-e@80x24.org>

Note: I'm not sure if it's worth documenting and supporting this
long-term.

We can avoid taking locks for invocations of "index --all"
and rely on high-resolution ctime (struct timespec st_ctim)
comparisons of msgmap.sqlite3 and the packed-refs + refs/heads
directory of the newest epoch.

This cuts public-inbox-index invocations with
"--all --no-update-extindex -L basic" down from 0.92s to 0.31s.
The change with "-L medium" or "-L full" and (default) non-zero
jobs is even more drastic, reducing a 12-13s no-op invocation
down to the same 0.31s.
---
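The ctime comparison described above can be sketched as a standalone
script. This is only an illustration under stated assumptions:
index_is_newer() is a hypothetical helper, not part of public-inbox,
and the paths are whatever the caller passes in. Sub-second precision
depends on the OS and filesystem supporting it; otherwise the
comparison falls back to whole seconds.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(stat); # stat() returns fractional times where supported

# Hypothetical helper (not part of public-inbox): true when $index_file's
# ctime is strictly newer than the ctime of every path in @ref_paths,
# meaning nothing it depends on changed after the index was last written.
sub index_is_newer {
	my ($index_file, @ref_paths) = @_;
	my @idx = stat($index_file) or return 0; # no index yet: work is needed
	my $c = $idx[10]; # 10 = ctime
	for my $path (@ref_paths) {
		my @st = stat($path);
		# a missing path counts as ctime 0, mirroring the "// 0" in the patch
		return 0 unless $c > ($st[10] // 0);
	}
	1;
}

# Usage: perl sketch.pl INDEX_FILE REF_PATH...
if (@ARGV) {
	my ($index_file, @refs) = @ARGV;
	print index_is_newer($index_file, @refs) ? "no-op\n" : "needs indexing\n";
}
```

The strict ">" means a tie is treated as "needs indexing", so a
coarse timestamp can only cause extra work, never a skipped update.
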
 lib/PublicInbox/V2Writable.pm | 14 +++++++++++---
 script/public-inbox-index     |  2 +-
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 531a72b2..2b849ddf 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -1351,11 +1351,19 @@ sub index_sync {
 	$opt //= {};
 	return xapian_only($self, $opt) if $opt->{xapian_only};
 
-	my $pr = $opt->{-progress};
 	my $epoch_max;
-	my $latest = $self->{ibx}->git_dir_latest(\$epoch_max);
-	return unless defined $latest;
+	my $latest = $self->{ibx}->git_dir_latest(\$epoch_max) // return;
+	if ($opt->{'fast-noop'}) { # nanosecond (st_ctim) comparison
+		use Time::HiRes qw(stat);
+		if (my @mm = stat("$self->{ibx}->{inboxdir}/msgmap.sqlite3")) {
+			my $c = $mm[10]; # 10 = ctime (nsec NV)
+			my @hd = stat("$latest/refs/heads");
+			my @pr = stat("$latest/packed-refs");
+			return if $c > ($hd[10] // 0) && $c > ($pr[10] // 0);
+		}
+	}
 
+	my $pr = $opt->{-progress};
 	my $seq = $opt->{sequential_shard};
 	my $art_beg; # the NNTP article number we start xapian_only at
 	my $idxlevel = $self->{ibx}->{indexlevel};
diff --git a/script/public-inbox-index b/script/public-inbox-index
index f10bb5ad..91afac88 100755
--- a/script/public-inbox-index
+++ b/script/public-inbox-index
@@ -42,7 +42,7 @@ GetOptions($opt, qw(verbose|v+ reindex rethread compact|c+ jobs|j=i prune
 		batch_size|batch-size=s
 		sequential_shard|seq-shard|sequential-shard
 		no-update-extindex update-extindex|E=s@
-		skip-docdata all help|h))
+		fast-noop|F skip-docdata all help|h))
 	or die $help;
 if ($opt->{help}) { print $help; exit 0 };
 die "--jobs must be >= 0\n" if defined $opt->{jobs} && $opt->{jobs} < 0;
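
For readers unfamiliar with the Getopt::Long spec syntax used in the
hunk above, the following minimal sketch shows how "fast-noop|F" ends
up as the $opt->{'fast-noop'} key tested in index_sync.
parse_index_opts() is a hypothetical wrapper with a shortened spec and
default Getopt::Long configuration; the real script calls GetOptions()
on @ARGV with the full option list.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical wrapper for illustration only (not in public-inbox).
sub parse_index_opts {
	my (@argv) = @_;
	my $opt = {};
	# "fast-noop|F" declares a boolean flag reachable as --fast-noop or -F;
	# with a hash destination, the value is stored under the first name
	GetOptionsFromArray(\@argv, $opt, qw(fast-noop|F all help|h))
		or die "bad options\n";
	$opt;
}

my $opt = parse_index_opts(@ARGV);
print "fast-noop requested\n" if $opt->{'fast-noop'};
```

Because the key contains a hyphen, it has to be quoted when used as a
hash subscript, which is why the patch writes $opt->{'fast-noop'}.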

Thread overview: 3+ messages
2020-12-24 10:09 [PATCH 0/2] index: start speeding up some noop calls Eric Wong
2020-12-24 10:09 ` [PATCH 1/2] inboxwritable: delay umask_prepare calls Eric Wong
2020-12-24 10:09 ` Eric Wong [this message]
