From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Cc: Eric Wong <e@80x24.org>
Subject: [RFC 01/11] search: implement index_sync to fixup indexer
Date: Sun, 16 Aug 2015 08:37:49 +0000
Message-ID: <1439714279-21923-2-git-send-email-e@80x24.org>
In-Reply-To: <1439714279-21923-1-git-send-email-e@80x24.org>
We need to make the indexer executable and installable
while we're at it.
---
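Illustration only, not part of the patch: index_sync in the diff below
decides what to index by matching `git log --raw` output against two
patterns. The standalone Perl sketch here reuses those patterns on
hand-written sample lines. The `classify` helper, the `@sample` data and
the object IDs in it are made up for the example and do not exist in
PublicInbox::Search.

```perl
use strict;
use warnings;

# Patterns copied from index_sync: a message lives at a path made of
# 2 + 38 hex digits, and --no-abbrev gives full 40-digit object IDs.
my $hex = '[a-f0-9]';
my $h40 = $hex . '{40}';
my $addmsg = qr!^:000000 100644 \S+ ($h40) A\t${hex}{2}/${hex}{38}$!;
my $delmsg = qr!^:100644 000000 ($h40) \S+ D\t${hex}{2}/${hex}{38}$!;

# Hypothetical helper: returns (kind, object ID) for one line of log output.
sub classify {
    my ($line) = @_;
    return ('add', $1) if $line =~ $addmsg;
    return ('del', $1) if $line =~ $delmsg;
    return ('commit', $1) if $line =~ /^commit ($h40)/;
    return ('other', undef);
}

# Made-up sample input in the shape of "git log --raw --no-abbrev" output.
my $zero = '0' x 40;
my $blob = 'b' x 40;
my $path = 'ab/' . ('c' x 38);
my @sample = (
    'commit ' . ('a' x 40) . "\n",
    "Author: A U Thor <author\@example.com>\n",
    ":000000 100644 $zero $blob A\t$path\n",
    ":100644 000000 $blob $zero D\t$path\n",
);

# Classify every line and print the kind plus the captured object ID.
my @events = map { [ classify($_) ] } @sample;
for my $ev (@events) {
    my ($kind, $oid) = @$ev;
    print defined $oid ? "$kind $oid\n" : "$kind\n";
}
```

In index_sync itself, an added blob is passed to index_blob and a deleted
blob to unindex_blob. Each commit line records the previously seen commit
as the last_commit metadata, and the final commit is recorded after the
loop ends.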
Makefile.PL | 3 ++-
lib/PublicInbox/Search.pm | 39 ++++++++++++++++++++++++++++++++++++++-
public-inbox-index | 0
3 files changed, 40 insertions(+), 2 deletions(-)
mode change 100644 => 100755 public-inbox-index
diff --git a/Makefile.PL b/Makefile.PL
index 1ee1089..f302b7c 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -9,7 +9,8 @@ WriteMakefile(
AUTHOR => 'Eric Wong <normalperson@yhbt.net>',
ABSTRACT => 'public-inbox server infrastructure',
EXE_FILES => [qw/public-inbox-mda public-inbox.cgi
- public-inbox-learn public-inbox-init/],
+ public-inbox-learn public-inbox-init
+ public-inbox-index/],
PREREQ_PM => {
# note: we use ssoma(1) and spamc(1), NOT the Perl modules
# We also depend on git through ssoma.
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index fe4984e..15bb9f6 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -226,7 +226,6 @@ sub remove_message {
} else {
$db->commit_transaction;
}
- $db->commit;
$doc_id;
}
@@ -536,4 +535,42 @@ sub enquire {
$self->{enquire} ||= Search::Xapian::Enquire->new($self->{xdb});
}
+# indexes all unindexed messages
+sub index_sync {
+ my ($self, $git) = @_;
+ my $db = $self->{xdb};
+ my $latest = $db->get_metadata('last_commit');
+ my $range = length $latest ? "$latest..HEAD" : 'HEAD';
+ $latest = undef;
+
+ my $hex = '[a-f0-9]';
+ my $h40 = $hex .'{40}';
+ my $addmsg = qr!^:000000 100644 \S+ ($h40) A\t${hex}{2}/${hex}{38}$!;
+ my $delmsg = qr!^:100644 000000 ($h40) \S+ D\t${hex}{2}/${hex}{38}$!;
+
+ # list blob changes from commits that have not been indexed yet
+ my @cmd = ('git', "--git-dir=$git->{git_dir}", "log",
+ qw/--reverse --no-notes --no-color --raw -r --no-abbrev/,
+ $range);
+
+ my $pid = open(my $log, '-|', @cmd) or
+ die('open '.join(' ', @cmd) . " pipe failed: $!\n");
+ my $last;
+ while (my $line = <$log>) {
+ if ($line =~ /$addmsg/o) {
+ $self->index_blob($git, $1);
+ } elsif ($line =~ /$delmsg/o) {
+ $self->unindex_blob($git, $1);
+ } elsif ($line =~ /^commit ($h40)/o) {
+ my $commit = $1;
+ if (defined $latest) {
+ $db->set_metadata('last_commit', $latest)
+ }
+ $latest = $commit;
+ }
+ }
+ close $log;
+ $db->set_metadata('last_commit', $latest) if defined $latest;
+}
+
1;
diff --git a/public-inbox-index b/public-inbox-index
old mode 100644
new mode 100755
--
EW
Thread overview: 14+ messages
2015-08-16 8:37 [RFC 0/11] work-in-progress search branch updated Eric Wong
2015-08-16 8:37 ` Eric Wong [this message]
2015-08-16 8:37 ` [RFC 02/11] extract redundant Message-ID handling code Eric Wong
2015-08-16 8:37 ` [RFC 03/11] search: make search results more OO Eric Wong
2015-08-16 8:37 ` [RFC 04/11] view: display replies in per-message view Eric Wong
2015-08-16 8:37 ` [RFC 05/11] thread: common sorting code Eric Wong
2015-08-16 8:37 ` [RFC 06/11] view: reply threading adjustment Eric Wong
2015-08-16 8:37 ` [RFC 07/11] view: hoist out index_walk function Eric Wong
2015-08-16 9:23 ` Eric Wong
2015-08-16 8:37 ` [RFC 08/11] www: /t/$MESSAGE_ID.html for threads Eric Wong
2015-08-16 8:37 ` [RFC 09/11] search: remove unnecessary xpfx export Eric Wong
2015-08-16 8:37 ` [RFC 10/11] implement /s/$SUBJECT_PATH.html lookups Eric Wong
2015-08-16 8:37 ` [RFC 11/11] SearchMsg: ensure metadata for ghost messages mid Eric Wong
2015-08-16 8:55 ` [RFC 12/11] view: deduplicate common code for loading search results Eric Wong
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git