user/dev discussion of public-inbox itself
* [PATCH 0/3] mda: v2: ensure message bodies are indexed
From: Eric Wong @ 2018-07-29  9:34 UTC
  To: meta

I found a bug affecting v2 inboxes that get mail through -mda:
message bodies were not indexed, so they did not show up in
search results.  It was a stupid one-line bug made in an effort
to save memory :x

Anyways, to properly index message bodies on affected mda-using
v2 inboxes, a reindex is required:

	public-inbox-index --reindex

This can take a long while and requires roughly double the
current Xapian storage.  However, it's designed to run online,
so users will gradually find search more useful as indexing
completes (it runs in reverse-chronological order).

Fwiw, I always run indexing with "eatmydata" to disable fsync
and speed up the process, since Xapian data isn't critical.
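
As a rough sketch of how the reindex and the "eatmydata" note above
might be combined: the helper name reindex_cmd and the inbox path are
hypothetical, and passing the inbox directory as an argument is an
assumption about the invocation.  The helper only prints the command
so it can be reviewed before running; it adds the eatmydata prefix
only when that wrapper is installed.

```shell
# Hypothetical inbox location; substitute the real inbox directory.
INBOX_DIR=${INBOX_DIR:-/path/to/inbox}

# Print (do not run) the reindex command.  eatmydata is prefixed only
# when it is found in PATH, since it is an optional speed-up.
reindex_cmd() {
	if command -v eatmydata >/dev/null 2>&1; then
		printf 'eatmydata '
	fi
	printf 'public-inbox-index --reindex %s\n' "$1"
}

reindex_cmd "$INBOX_DIR"
```

Skipping fsync this way is only reasonable because the Xapian data
can be rebuilt; it is not something to apply to the git data itself.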

I suppose another idea is to allow passing a limit to reindex,
as this bug didn't affect initial imports... (But I'm tired
and I fixed this bug while getting sidetracked from another
bugfix on another project)

Eric Wong (3):
  mda: use InboxWritable
  t/v2mda: make it easy to test v1 repos here, too
  mda: v2: ensure message bodies are indexed

 MANIFEST                         |  1 +
 lib/PublicInbox/InboxWritable.pm |  1 +
 script/public-inbox-mda          | 38 +++++++-------------------
 t/data/0001.patch                | 46 ++++++++++++++++++++++++++++++++
 t/v2mda.t                        | 19 ++++++++++++-
 t/watch_maildir_v2.t             | 15 +++++++++++
 6 files changed, 91 insertions(+), 29 deletions(-)
 create mode 100644 t/data/0001.patch

-- 
EW


* [PATCH 1/3] mda: use InboxWritable
From: Eric Wong @ 2018-07-29  9:34 UTC
  To: meta

It's a convenient wrapper nowadays, so get rid of some legacy
code and minimize differences from the -watch code.
---
 lib/PublicInbox/InboxWritable.pm |  1 +
 script/public-inbox-mda          | 37 +++++++++-----------------------
 2 files changed, 11 insertions(+), 27 deletions(-)
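
The diff below keeps the sysexits-style exit statuses (67 for an
unknown recipient, 78 for an unsupported inbox version).  A minimal
sketch of how a calling script might interpret them follows; the
function names describe_status and fake_delivery are hypothetical,
and fake_delivery merely stands in for a real delivery attempt.

```shell
# Map an exit status used in this patch to its sysexits.h name.
describe_status() {
	case "$1" in
	0)  echo 'EX_OK' ;;
	67) echo 'EX_NOUSER' ;;
	78) echo 'EX_CONFIG' ;;
	*)  echo "unrecognized ($1)" ;;
	esac
}

# Stand-in for a delivery attempt; a real caller would run
# public-inbox-mda with ORIGINAL_RECIPIENT set in the environment.
fake_delivery() { return 78; }

status=0
fake_delivery || status=$?
describe_status "$status"
```

With the stand-in returning 78, this prints EX_CONFIG.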

diff --git a/lib/PublicInbox/InboxWritable.pm b/lib/PublicInbox/InboxWritable.pm
index 9b0cdfd..aa62132 100644
--- a/lib/PublicInbox/InboxWritable.pm
+++ b/lib/PublicInbox/InboxWritable.pm
@@ -39,6 +39,7 @@ sub importer {
 			my $addr = $self->{-primary_address};
 			PublicInbox::Import->new($git, $name, $addr, $self);
 		} else {
+			$! = 78; # EX_CONFIG 5.3.5 local configuration error
 			die "unsupported inbox version: $v\n";
 		}
 	}
diff --git a/script/public-inbox-mda b/script/public-inbox-mda
index 1f1252a..2a31537 100755
--- a/script/public-inbox-mda
+++ b/script/public-inbox-mda
@@ -18,11 +18,10 @@ use Email::Simple;
 use PublicInbox::MIME;
 use PublicInbox::MDA;
 use PublicInbox::Config;
-use PublicInbox::Import;
-use PublicInbox::Git;
 use PublicInbox::Emergency;
 use PublicInbox::Filter::Base;
 use PublicInbox::Spamcheck::Spamc;
+use PublicInbox::InboxWritable;
 
 # n.b: hopefully we can setup the emergency path without bailing due to
 # user error, we really want to setup the emergency destination ASAP
@@ -39,7 +38,8 @@ my $recipient = $ENV{ORIGINAL_RECIPIENT};
 defined $recipient or die "ORIGINAL_RECIPIENT not defined in ENV\n";
 my $dst = $config->lookup($recipient); # first check
 defined $dst or do_exit(67); # EX_NOUSER 5.1.1 user unknown
-my $main_repo = $dst->{mainrepo} or do_exit(67);
+$dst->{mainrepo} or do_exit(67);
+$dst = PublicInbox::InboxWritable->new($dst);
 
 # pre-check, MDA has stricter rules than an importer might;
 do_exit(0) unless PublicInbox::MDA->precheck($simple, $dst->{address});
@@ -55,18 +55,13 @@ $str = '';
 do_exit(0) unless $spam_ok;
 
 my $fcfg = $dst->{filter} || '';
-my $filter;
-if ($fcfg =~ /::/) {
-	eval "require $fcfg";
-	die $@ if $@;
-	$filter = $fcfg->new;
-} elsif ($fcfg eq 'scrub') { # TODO:
-	require PublicInbox::Filter::Mirror;
-	$filter = PublicInbox::Filter::Mirror->new;
-} else {
-	$filter = PublicInbox::Filter::Base->new;
+# -mda defaults to the strict base filter
+if ($fcfg eq '') {
+	$dst->{filter} = 'PublicInbox::Filter::Base';
+} elsif ($fcfg eq 'scrub') { # legacy alias, undocumented, remove?
+	$dst->{filter} = 'PublicInbox::Filter::Mirror';
 }
-
+my $filter = $dst->filter;
 my $ret = $filter->delivery($mime);
 if (ref($ret) && $ret->isa('Email::MIME')) { # filter altered message
 	$mime = $ret;
@@ -78,19 +73,7 @@ if (ref($ret) && $ret->isa('Email::MIME')) { # filter altered message
 } # else { accept
 
 PublicInbox::MDA->set_list_headers($mime, $dst);
-my $v = $dst->{version} || 1;
-my $im;
-if ($v == 2) {
-	require PublicInbox::V2Writable;
-	$im = PublicInbox::V2Writable->new($dst);
-	$im->{parallel} = 0; # pointless to be parallel for a single message
-} elsif ($v == 1) {
-	my $git = $dst->git;
-	$im = PublicInbox::Import->new($git, $dst->{name}, $recipient, $dst);
-} else {
-	$! = 78; # EX_CONFIG 5.3.5 local configuration error
-	die "Unsupported inbox version: $v\n";
-}
+my $im = $dst->importer(0);
 if (defined $im->add($mime)) {
 	$emm = $emm->abort;
 } else {
-- 
EW



Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git