user/dev discussion of public-inbox itself
* [PATCH 0/7] v2 odds and ends
@ 2018-04-05  9:34 Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 1/7] v2writable: recount partitions after acquiring lock Eric Wong (Contractor, The Linux Foundation)
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

Eric Wong (Contractor, The Linux Foundation) (7):
  v2writable: recount partitions after acquiring lock
  searchmsg: remove unused `tid' and `path' methods
  search: remove unnecessary OP_AND of query
  mbox: do not sort search results
  searchview: minor cleanup
  support altid mechanism for v2
  compact: better handling of over.sqlite3* files

 MANIFEST                           |   1 +
 lib/PublicInbox/AltId.pm           |  20 +++++-
 lib/PublicInbox/Filter/RubyLang.pm |  22 ++++--
 lib/PublicInbox/Mbox.pm            | 139 +++++++++++++++++++++----------------
 lib/PublicInbox/Search.pm          |   7 +-
 lib/PublicInbox/SearchMsg.pm       |   5 --
 lib/PublicInbox/SearchView.pm      |   3 +-
 lib/PublicInbox/V2Writable.pm      |  44 ++++++++----
 script/public-inbox-compact        |   9 +++
 t/altid_v2.t                       |  55 +++++++++++++++
 10 files changed, 210 insertions(+), 95 deletions(-)
 create mode 100644 t/altid_v2.t

-- 
EW



* [PATCH 1/7] v2writable: recount partitions after acquiring lock
  2018-04-05  9:34 [PATCH 0/7] v2 odds and ends Eric Wong (Contractor, The Linux Foundation)
@ 2018-04-05  9:34 ` Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 2/7] searchmsg: remove unused `tid' and `path' methods Eric Wong (Contractor, The Linux Foundation)
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

The partition count can change if public-inbox-compact runs
while public-inbox-watch or public-inbox-index is running.
---
 lib/PublicInbox/V2Writable.pm | 44 ++++++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 15 deletions(-)
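
Note for readers: the recount-after-lock idea can be tried in isolation
with the rough sketch below. It is not the code from this patch. The
helper name count_numeric_dirs and the temporary directory are
hypothetical stand-ins, and only the directory-name test is reproduced.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Count subdirectories of $xpfx whose names are purely numeric.
# (hypothetical helper; it only reproduces the directory-name test)
sub count_numeric_dirs {
	my ($xpfx) = @_;
	my $n = 0;
	return $n unless -d $xpfx;
	opendir(my $dh, $xpfx) or die "opendir $xpfx: $!\n";
	while (defined(my $dn = readdir($dh))) {
		$n++ if $dn =~ /\A\d+\z/ && -d "$xpfx/$dn";
	}
	closedir $dh;
	$n;
}

# hypothetical stand-in for the Xapian partition directory
my $xpfx = tempdir(CLEANUP => 1);
mkdir("$xpfx/$_") or die "mkdir: $!\n" for (0..2);

# value cached at construction time, with a fallback when nothing exists
my $self = { partitions => count_numeric_dirs($xpfx) || 2 };

# simulate another process reducing the partition count
rmdir("$xpfx/$_") or die "rmdir: $!\n" for (1..2);

# ... the lock would be acquired here; recount before trusting the cache
my $nparts = count_numeric_dirs($xpfx);
if ($nparts && $nparts != $self->{partitions}) {
	$self->{partitions} = $nparts;
}
print "partitions: $self->{partitions}\n";
```

The cached value is only replaced when the recount is non-zero, so an
empty directory keeps whatever default was chosen at construction.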

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 74953d3..e0c8ac3 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -26,22 +26,14 @@ sub nproc () {
 	int($ENV{NPROC} || `nproc 2>/dev/null` || 2);
 }
 
-sub new {
-	my ($class, $v2ibx, $creat) = @_;
-	my $dir = $v2ibx->{mainrepo} or die "no mainrepo in inbox\n";
-	unless (-d $dir) {
-		if ($creat) {
-			require File::Path;
-			File::Path::mkpath($dir);
-		} else {
-			die "$dir does not exist\n";
-		}
-	}
-
+sub count_partitions ($) {
+	my ($self) = @_;
 	my $nparts = 0;
-	my $xpfx = "$dir/xap" . PublicInbox::Search::SCHEMA_VERSION;
+	my $xpfx = $self->{xpfx};
 
 	# always load existing partitions in case core count changes:
+	# Also, partition count may change while -watch is running
+	# due to -compact
 	if (-d $xpfx) {
 		foreach my $part (<$xpfx/*>) {
 			-d $part && $part =~ m!/\d+\z! or next;
@@ -51,21 +43,37 @@ sub new {
 			};
 		}
 	}
-	$nparts = nproc() if ($nparts == 0);
+	$nparts;
+}
+
+sub new {
+	my ($class, $v2ibx, $creat) = @_;
+	my $dir = $v2ibx->{mainrepo} or die "no mainrepo in inbox\n";
+	unless (-d $dir) {
+		if ($creat) {
+			require File::Path;
+			File::Path::mkpath($dir);
+		} else {
+			die "$dir does not exist\n";
+		}
+	}
 
 	$v2ibx = PublicInbox::InboxWritable->new($v2ibx);
+
+	my $xpfx = "$dir/xap" . PublicInbox::Search::SCHEMA_VERSION;
 	my $self = {
 		-inbox => $v2ibx,
 		im => undef, #  PublicInbox::Import
-		partitions => $nparts,
 		parallel => 1,
 		transact_bytes => 0,
+		xpfx => $xpfx,
 		over => PublicInbox::OverIdxFork->new("$xpfx/over.sqlite3"),
 		lock_path => "$dir/inbox.lock",
 		# limit each repo to 1GB or so
 		rotate_bytes => int((1024 * 1024 * 1024) / $PACKING_FACTOR),
 		last_commit => [],
 	};
+	$self->{partitions} = count_partitions($self) || nproc();
 	bless $self, $class;
 }
 
@@ -206,6 +214,12 @@ sub idx_init {
 		$self->lock_acquire;
 		$over->create($self);
 
+		# -compact can change partition count while -watch is idle
+		my $nparts = count_partitions($self);
+		if ($nparts && $nparts != $self->{partitions}) {
+			$self->{partitions} = $nparts;
+		}
+
 		# need to create all parts before initializing msgmap FD
 		my $max = $self->{partitions} - 1;
 
-- 
EW



* [PATCH 2/7] searchmsg: remove unused `tid' and `path' methods
  2018-04-05  9:34 [PATCH 0/7] v2 odds and ends Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 1/7] v2writable: recount partitions after acquiring lock Eric Wong (Contractor, The Linux Foundation)
@ 2018-04-05  9:34 ` Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 3/7] search: remove unnecessary OP_AND of query Eric Wong (Contractor, The Linux Foundation)
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

These internal attributes are not exposed and no longer
used in our APIs.
---
 lib/PublicInbox/SearchMsg.pm | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/lib/PublicInbox/SearchMsg.pm b/lib/PublicInbox/SearchMsg.pm
index 6c0780e..d43853a 100644
--- a/lib/PublicInbox/SearchMsg.pm
+++ b/lib/PublicInbox/SearchMsg.pm
@@ -186,9 +186,4 @@ sub mid ($;$) {
 
 sub _extract_mid { mid_clean(mid_mime($_[0]->{mime})) }
 
-sub tid { $_[0]->{tid} }
-
-# XXX: consider removing this, we can phrase match subject
-sub path { $_[0]->{path} }
-
 1;
-- 
EW



* [PATCH 3/7] search: remove unnecessary OP_AND of query
  2018-04-05  9:34 [PATCH 0/7] v2 odds and ends Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 1/7] v2writable: recount partitions after acquiring lock Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 2/7] searchmsg: remove unused `tid' and `path' methods Eric Wong (Contractor, The Linux Foundation)
@ 2018-04-05  9:34 ` Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 4/7] mbox: do not sort search results Eric Wong (Contractor, The Linux Foundation)
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

This was vestigial code from the switch to the overview DB.
---
 lib/PublicInbox/Search.pm | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index eca2b0f..4e014f4 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -216,7 +216,6 @@ sub _do_enquire {
 sub _enquire_once {
 	my ($self, $query, $opts) = @_;
 	my $enquire = enquire($self);
-	$query = Search::Xapian::Query->new(OP_AND,$query);
 	$enquire->set_query($query);
 	$opts ||= {};
         my $desc = !$opts->{asc};
-- 
EW



* [PATCH 4/7] mbox: do not sort search results
  2018-04-05  9:34 [PATCH 0/7] v2 odds and ends Eric Wong (Contractor, The Linux Foundation)
                   ` (2 preceding siblings ...)
  2018-04-05  9:34 ` [PATCH 3/7] search: remove unnecessary OP_AND of query Eric Wong (Contractor, The Linux Foundation)
@ 2018-04-05  9:34 ` Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 5/7] searchview: minor cleanup Eric Wong (Contractor, The Linux Foundation)
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

Sorting large msets is a waste when it comes to mboxes
since MUAs should thread and sort them as the user desires.

This forces us to rework each of the mbox download mechanisms
to be more independent of each other, but might make things
easier to reason about.
---
 lib/PublicInbox/Mbox.pm   | 139 ++++++++++++++++++++++++++--------------------
 lib/PublicInbox/Search.pm |   6 +-
 2 files changed, 83 insertions(+), 62 deletions(-)
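
Note for readers: the callbacks in this patch hand back one message per
call and refill their result set when it runs dry. A minimal sketch of
that shape follows, assuming a made-up data source. fetch_after and
make_iter are hypothetical names, and plain integers stand in for
messages.

```perl
use strict;
use warnings;

# Hypothetical data source: returns up to $limit numbers greater than
# $prev, standing in for a paginated lookup keyed on the last-seen number.
my @all = (1..7);
sub fetch_after {
	my ($prev, $limit) = @_;
	my @batch = grep { $_ > $prev } @all;
	splice(@batch, $limit) if @batch > $limit;
	\@batch;
}

# Build a callback which yields one item per call, refilling as needed,
# and returns undef once the source is exhausted.
sub make_iter {
	my ($limit) = @_;
	my $batch = fetch_after(0, $limit);
	my $prev = @$batch ? $batch->[-1] : 0;
	my $i = 0;
	sub {
		while (1) {
			return $batch->[$i++] if $i < @$batch;
			# refill result set
			$batch = fetch_after($prev, $limit);
			return unless @$batch;
			$prev = $batch->[-1];
			$i = 0;
		}
	};
}

my $next = make_iter(3);
my @seen;
while (defined(my $item = $next->())) {
	push @seen, $item;
}
print "@seen\n";
```

Because the consumer only ever sees single items, it does not need to
know how large each batch is or where the batches come from.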

diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index c66ccaa..c5e1cb9 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -138,13 +138,24 @@ sub thread_mbox {
 	my ($ctx, $srch, $sfx) = @_;
 	eval { require IO::Compress::Gzip };
 	return sub { need_gzip(@_) } if $@;
-	my $prev = 0;
+	my $mid = $ctx->{mid};
+	my $msgs = $srch->get_thread($mid, 0);
+	return [404, [qw(Content-Type text/plain)], []] if !@$msgs;
+	my $prev = $msgs->[-1]->{num};
+	my $i = 0;
 	my $cb = sub {
-		my $msgs = $srch->get_thread($ctx->{mid}, $prev);
-		$prev = $msgs->[-1]->{num} if scalar(@$msgs);
-		$msgs;
+		while (1) {
+			if (my $smsg = $msgs->[$i++]) {
+				return $smsg;
+			}
+			# refill result set
+			$msgs = $srch->get_thread($mid, $prev);
+			return unless @$msgs;
+			$prev = $msgs->[-1]->{num};
+			$i = 0;
+		}
 	};
-	PublicInbox::MboxGz->response($ctx, $cb);
+	PublicInbox::MboxGz->response($ctx, $cb, $msgs->[0]->subject);
 }
 
 sub emit_range {
@@ -159,22 +170,55 @@ sub emit_range {
 	mbox_all($ctx, $query);
 }
 
+sub mbox_all_ids {
+	my ($ctx) = @_;
+	my $prev = 0;
+	my $ids = $ctx->{-inbox}->mm->ids_after(\$prev) or return
+		[404, [qw(Content-Type text/plain)], ["No results found\n"]];
+	my $i = 0;
+	my $over = $ctx->{srch}->{over_ro};
+	my $cb = sub {
+		do {
+			while ((my $num = $ids->[$i++])) {
+				my $smsg = $over->get_art($num) or next;
+				return $smsg;
+			}
+			$ids = $ctx->{-inbox}->mm->ids_after(\$prev);
+			$i = 0;
+		} while (@$ids);
+		undef;
+	};
+	return PublicInbox::MboxGz->response($ctx, $cb, 'all');
+}
+
 sub mbox_all {
 	my ($ctx, $query) = @_;
 
 	eval { require IO::Compress::Gzip };
 	return sub { need_gzip(@_) } if $@;
-	if ($query eq '') {
-		my $prev = 0;
-		my $cb = sub { $ctx->{-inbox}->mm->ids_after(\$prev) };
-		return PublicInbox::MboxGz->response($ctx, $cb, 'all');
-	}
-	my $opts = { offset => 0 };
+	return mbox_all_ids($ctx) if $query eq '';
+	my $opts = { mset => 2 };
 	my $srch = $ctx->{srch};
+	my $mset = $srch->query($query, $opts);
+	$opts->{offset} = $mset->size or
+			return [404, [qw(Content-Type text/plain)],
+				["No results found\n"]];
+	my $i = 0;
 	my $cb = sub { # called by MboxGz->getline
-		my $msgs = $srch->query($query, $opts);
-		$opts->{offset} += scalar @$msgs;
-		$msgs;
+		while (1) {
+			while (my $mi = (($mset->items)[$i++])) {
+				my $doc = $mi->get_document;
+				my $smsg = $srch->retry_reopen(sub {
+					PublicInbox::SearchMsg->load_doc($doc);
+				}) or next;
+				return $smsg;
+			}
+			# refill result set
+			$mset = $srch->query($query, $opts);
+			my $size = $mset->size or return;
+			$opts->{offset} += $size;
+			$i = 0;
+		}
 	};
 	PublicInbox::MboxGz->response($ctx, $cb, 'results-'.$query);
 }
@@ -206,7 +250,6 @@ sub new {
 		gz => IO::Compress::Gzip->new(\$buf, Time => 0),
 		cb => $cb,
 		ctx => $ctx,
-		msgs => [],
 	}, $class;
 }
 
@@ -214,60 +257,34 @@ sub response {
 	my ($class, $ctx, $cb, $fn) = @_;
 	my $body = $class->new($ctx, $cb);
 	# http://www.iana.org/assignments/media-types/application/gzip
-	$body->{hdr} = [ 'Content-Type', 'application/gzip' ];
-	$body->{fn} = $fn;
-	my $hdr = $body->getline; # fill in Content-Disposition filename
-	[ 200, $hdr, $body ];
-}
-
-sub set_filename ($$) {
-	my ($fn, $msg) = @_;
-	return to_filename($fn) if defined($fn);
-
-	PublicInbox::Mbox::subject_fn($msg);
+	my @h = qw(Content-Type application/gzip);
+	if ($fn) {
+		$fn = to_filename($fn);
+		push @h, 'Content-Disposition', "inline; filename=$fn.mbox.gz";
+	}
+	[ 200, \@h, $body ];
 }
 
 # called by Plack::Util::foreach or similar
 sub getline {
 	my ($self) = @_;
 	my $ctx = $self->{ctx} or return;
-	my $ibx = $ctx->{-inbox};
-	my $gz = $self->{gz};
-	my $msgs = $self->{msgs};
-	do {
-		# work on existing result set
-		while (defined(my $smsg = shift @$msgs)) {
-			# ids_after may return integers
-			ref($smsg) or
-				$smsg = $ctx->{srch}->{over_ro}->get_art($smsg);
-
-			my $msg = eval { $ibx->msg_by_smsg($smsg) } or next;
-			$msg = Email::Simple->new($msg);
-			$gz->write(PublicInbox::Mbox::msg_str($ctx, $msg,
-								$smsg->mid));
-
-			# use subject of first message as subject
-			if (my $hdr = delete $self->{hdr}) {
-				my $fn = set_filename($self->{fn}, $msg);
-				push @$hdr, 'Content-Disposition',
-						"inline; filename=$fn.mbox.gz";
-				return $hdr;
-			}
-			my $bref = $self->{buf};
-			if (length($$bref) >= 8192) {
-				my $ret = $$bref; # copy :<
-				${$self->{buf}} = '';
-				return $ret;
-			}
-
-			# be fair to other clients on public-inbox-httpd:
-			return '';
+	while (my $smsg = $self->{cb}->()) {
+		my $msg = $ctx->{-inbox}->msg_by_smsg($smsg) or next;
+		$msg = Email::Simple->new($msg);
+		$self->{gz}->write(PublicInbox::Mbox::msg_str($ctx, $msg,
+				$smsg->{mid}));
+		my $bref = $self->{buf};
+		if (length($$bref) >= 8192) {
+			my $ret = $$bref; # copy :<
+			${$self->{buf}} = '';
+			return $ret;
 		}
 
-		# refill result set
-		$msgs = $self->{msgs} = $self->{cb}->();
-	} while (@$msgs);
-	$gz->close;
+		# be fair to other clients on public-inbox-httpd:
+		return '';
+	}
+	delete($self->{gz})->close;
 	# signal that we're done and can return undef next call:
 	delete $self->{ctx};
 	${delete $self->{buf}};
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 4e014f4..9eb0728 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -219,7 +219,11 @@ sub _enquire_once {
 	$enquire->set_query($query);
 	$opts ||= {};
         my $desc = !$opts->{asc};
-	if ($opts->{relevance}) {
+	if (($opts->{mset} || 0) == 2) {
+		$enquire->set_docid_order(Search::Xapian::ENQ_ASCENDING());
+		$enquire->set_weighting_scheme(Search::Xapian::BoolWeight->new);
+		delete $self->{enquire};
+	} elsif ($opts->{relevance}) {
 		$enquire->set_sort_by_relevance_then_value(TS, $desc);
 	} else {
 		$enquire->set_sort_by_value_then_relevance(TS, $desc);
-- 
EW



* [PATCH 5/7] searchview: minor cleanup
  2018-04-05  9:34 [PATCH 0/7] v2 odds and ends Eric Wong (Contractor, The Linux Foundation)
                   ` (3 preceding siblings ...)
  2018-04-05  9:34 ` [PATCH 4/7] mbox: do not sort search results Eric Wong (Contractor, The Linux Foundation)
@ 2018-04-05  9:34 ` Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 6/7] support altid mechanism for v2 Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 7/7] compact: better handling of over.sqlite3* files Eric Wong (Contractor, The Linux Foundation)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

$mset->size is probably more obvious than relying on a tied
array and saves us a line.
---
 lib/PublicInbox/SearchView.pm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index c789795..d038dfc 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -180,9 +180,8 @@ sub search_nav_top {
 sub search_nav_bot {
 	my ($mset, $q) = @_;
 	my $total = $mset->get_matches_estimated;
-	my $nr = scalar $mset->items;
 	my $o = $q->{o};
-	my $end = $o + $nr;
+	my $end = $o + $mset->size;
 	my $beg = $o + 1;
 	my $rv = '</pre><hr><pre id=t>';
 	if ($beg <= $end) {
-- 
EW



* [PATCH 6/7] support altid mechanism for v2
  2018-04-05  9:34 [PATCH 0/7] v2 odds and ends Eric Wong (Contractor, The Linux Foundation)
                   ` (4 preceding siblings ...)
  2018-04-05  9:34 ` [PATCH 5/7] searchview: minor cleanup Eric Wong (Contractor, The Linux Foundation)
@ 2018-04-05  9:34 ` Eric Wong (Contractor, The Linux Foundation)
  2018-04-05  9:34 ` [PATCH 7/7] compact: better handling of over.sqlite3* files Eric Wong (Contractor, The Linux Foundation)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

There are enough gmane links out there in the wild that it makes sense
to maintain support for these mappings.
---
 MANIFEST                           |  1 +
 lib/PublicInbox/AltId.pm           | 20 +++++++++++---
 lib/PublicInbox/Filter/RubyLang.pm | 22 ++++++++++-----
 t/altid_v2.t                       | 55 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 88 insertions(+), 10 deletions(-)
 create mode 100644 t/altid_v2.t
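
Note for readers: part of this patch defers opening the alternate map
until it is first needed. The sketch below shows that lazy-accessor
pattern on its own. LazyMap and its handle method are hypothetical and
not part of public-inbox, and a plain hash stands in for the database
handle.

```perl
use strict;
use warnings;

my $opened = 0; # counts how many times the expensive open ran

package LazyMap; # hypothetical class, for illustration only

sub new {
	my ($class, $filename, $writable) = @_;
	# only remember the arguments; nothing is opened yet
	bless { filename => $filename, writable => $writable }, $class;
}

sub handle {
	my ($self) = @_;
	$self->{handle} ||= do {
		$opened++;
		# stand-in for an expensive open of $self->{filename}
		+{ file => $self->{filename}, rw => $self->{writable} };
	};
}

package main;

my $m = LazyMap->new('another-nntp.sqlite3', 1);
print "opened before use: $opened\n";
$m->handle for 1..3;
print "opened after 3 uses: $opened\n";
```

Storing the filename and the writable flag under separate keys keeps the
deferred open equivalent to an eager one.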

diff --git a/MANIFEST b/MANIFEST
index b17f1be..82cc67d 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -135,6 +135,7 @@ scripts/ssoma-replay
 scripts/xhdr-num2mid
 t/address.t
 t/altid.t
+t/altid_v2.t
 t/cgi.t
 t/check-www-inbox.perl
 t/common.perl
diff --git a/lib/PublicInbox/AltId.pm b/lib/PublicInbox/AltId.pm
index d1b2dc2..f8aa4cb 100644
--- a/lib/PublicInbox/AltId.pm
+++ b/lib/PublicInbox/AltId.pm
@@ -22,17 +22,31 @@ sub new {
 	} split(/[&;]/, $query);
 	my $f = $params{file} or die "file: required for $type spec $spec\n";
 	unless (index($f, '/') == 0) {
-		$f = "$inbox->{mainrepo}/public-inbox/$f";
+		if (($inbox->{version} || 1) == 1) {
+			$f = "$inbox->{mainrepo}/public-inbox/$f";
+		} else {
+			$f = "$inbox->{mainrepo}/$f";
+		}
 	}
 	bless {
-		mm_alt => PublicInbox::Msgmap->new_file($f, $writable),
+		filename => $f,
+		writable => $writable,
 		xprefix => 'X'.uc($prefix),
 	}, $class;
 }
 
+sub mm_alt {
+	my ($self) = @_;
+	$self->{mm_alt} ||= eval {
+		my $f = $self->{filename};
+		my $writable = $self->{writable};
+		PublicInbox::Msgmap->new_file($f, $writable);
+	};
+}
+
 sub mid2alt {
 	my ($self, $mid) = @_;
-	$self->{mm_alt}->num_for($mid);
+	$self->mm_alt->num_for($mid);
 }
 
 1;
diff --git a/lib/PublicInbox/Filter/RubyLang.pm b/lib/PublicInbox/Filter/RubyLang.pm
index 63e8d42..cb69e38 100644
--- a/lib/PublicInbox/Filter/RubyLang.pm
+++ b/lib/PublicInbox/Filter/RubyLang.pm
@@ -6,6 +6,7 @@ package PublicInbox::Filter::RubyLang;
 use base qw(PublicInbox::Filter::Base);
 use strict;
 use warnings;
+use PublicInbox::MID qw(mids);
 
 my $l1 = qr/Unsubscribe:\s
 	<mailto:ruby-\w+-request\@ruby-lang\.org\?subject=unsubscribe>/x;
@@ -44,16 +45,23 @@ sub scrub {
 	my $altid = $self->{-altid};
 	if ($altid) {
 		my $hdr = $mime->header_obj;
-		my $mid = $hdr->header_raw('Message-ID');
-		unless (defined $mid) {
-			return $self->REJECT('Message-Id missing');
+		my $mids = mids($hdr);
+		return $self->REJECT('Message-ID missing') unless (@$mids);
+		my @v = $hdr->header_raw('X-Mail-Count');
+		my $n;
+		foreach (@v) {
+			/\A\s*(\d+)\s*\z/ or next;
+			$n = $1;
+			last;
 		}
-		my $n = $hdr->header_raw('X-Mail-Count');
-		if (!defined($n) || $n !~ /\A\s*\d+\s*\z/) {
+		unless (defined $n) {
 			return $self->REJECT('X-Mail-Count not numeric');
 		}
-		$mid = PublicInbox::MID::mid_clean($mid);
-		$altid->{mm_alt}->mid_set($n, $mid);
+		foreach my $mid (@$mids) {
+			my $r = $altid->mm_alt->mid_set($n, $mid);
+			next if $r == 0;
+			last;
+		}
 	}
 	$self->ACCEPT($mime);
 }
diff --git a/t/altid_v2.t b/t/altid_v2.t
new file mode 100644
index 0000000..87f1452
--- /dev/null
+++ b/t/altid_v2.t
@@ -0,0 +1,55 @@
+# Copyright (C) 2016-2018 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use warnings;
+use Test::More;
+use File::Temp qw/tempdir/;
+foreach my $mod (qw(DBD::SQLite Search::Xapian)) {
+	eval "require $mod";
+	plan skip_all => "$mod missing for altid_v2.t" if $@;
+}
+
+use_ok 'PublicInbox::V2Writable';
+use_ok 'PublicInbox::Inbox';
+my $tmpdir = tempdir('pi-altidv2-XXXXXX', TMPDIR => 1, CLEANUP => 1);
+my $mainrepo = "$tmpdir/inbox";
+my $full = "$tmpdir/inbox/another-nntp.sqlite3";
+my $altid = [ 'serial:gmane:file=another-nntp.sqlite3' ];
+
+{
+	ok(mkdir($mainrepo), 'created repo for msgmap');
+	my $mm = PublicInbox::Msgmap->new_file($full, 1);
+	is($mm->mid_set(1234, 'a@example.com'), 1, 'mid_set once OK');
+	ok(0 == $mm->mid_set(1234, 'a@example.com'), 'mid_set not idempotent');
+	ok(0 == $mm->mid_set(1, 'a@example.com'), 'mid_set fails with dup MID');
+}
+
+my $ibx = {
+	mainrepo => $mainrepo,
+	name => 'test-v2writable',
+	version => 2,
+	-primary_address => 'test@example.com',
+	altid => $altid,
+};
+$ibx = PublicInbox::Inbox->new($ibx);
+my $v2w = PublicInbox::V2Writable->new($ibx, 1);
+$v2w->add(Email::MIME->create(
+		header => [
+			From => 'a@example.com',
+			To => 'b@example.com',
+			'Content-Type' => 'text/plain',
+			Subject => 'boo!',
+			'Message-ID' => '<a@example.com>',
+		],
+		body => "hello world gmane:666\n",
+	));
+$v2w->done;
+
+my $msgs = $ibx->search->reopen->query("gmane:1234");
+is_deeply([map { $_->mid } @$msgs], ['a@example.com'], 'got one match');
+$msgs = $ibx->search->query("gmane:666");
+is_deeply([], $msgs, 'body did NOT match');
+
+done_testing();
+
+1;
-- 
EW



* [PATCH 7/7] compact: better handling of over.sqlite3* files
  2018-04-05  9:34 [PATCH 0/7] v2 odds and ends Eric Wong (Contractor, The Linux Foundation)
                   ` (5 preceding siblings ...)
  2018-04-05  9:34 ` [PATCH 6/7] support altid mechanism for v2 Eric Wong (Contractor, The Linux Foundation)
@ 2018-04-05  9:34 ` Eric Wong (Contractor, The Linux Foundation)
  6 siblings, 0 replies; 8+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-04-05  9:34 UTC (permalink / raw)
  To: meta

Let's not scare users when they encounter files that are supposed
to be there.  Then, preserve the journal and pipe.lock, even if
they're supposedly unused due to us holding the inbox-wide lock.
---
 script/public-inbox-compact | 9 +++++++++
 1 file changed, 9 insertions(+)
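
Note for readers: the optional-file handling can be exercised outside
the script with the sketch below. It assumes two temporary directories
on the same filesystem as stand-ins for the old and new index
directories, and it folds the required file into the same loop, which
the patch itself does not do.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# hypothetical stand-ins for the old and new index directories
my $old = tempdir(CLEANUP => 1);
my $new = tempdir(CLEANUP => 1);

# create the main file and only one of the two optional companions
for my $f ('over.sqlite3', 'over.sqlite3-journal') {
	open(my $fh, '>', "$old/$f") or die "open $old/$f: $!\n";
	close($fh) or die "close $old/$f: $!\n";
}

my @linked;
for my $suf ('', qw(.pipe.lock -journal)) {
	my $orig = "$old/over.sqlite3$suf";
	if (link($orig, "$new/over.sqlite3$suf")) {
		push @linked, "over.sqlite3$suf";
		next;
	}
	# a missing optional file is fine; the main file ('') is required
	next if $suf ne '' && $!{ENOENT};
	die "failed to link $orig => $new/over.sqlite3$suf: $!\n";
}
print join(' ', @linked), "\n";
```

Any failure other than a missing optional file still aborts, so
permission or cross-device errors are not silently ignored.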

diff --git a/script/public-inbox-compact b/script/public-inbox-compact
index e697716..b8aaa4b 100755
--- a/script/public-inbox-compact
+++ b/script/public-inbox-compact
@@ -35,8 +35,16 @@ $ibx->umask_prepare;
 sub commit_changes ($$$) {
 	my ($im, $old, $new) = @_;
 	my @st = stat($old) or die "failed to stat($old): $!\n";
+
+	for my $suf (qw(.pipe.lock -journal)) {
+		my $orig = "$old/over.sqlite3$suf";
+		link($orig, "$new/over.sqlite3$suf") and next;
+		next if $!{ENOENT};
+		die "failed to link $orig => $new/over.sqlite3$suf: $!\n";
+	}
 	link("$old/over.sqlite3", "$new/over.sqlite3") or die
 		"failed to link {$old => $new}/over.sqlite3: $!\n";
+
 	rename($old, "$new/old") or die "rename $old => $new/old: $!\n";
 	chmod($st[2] & 07777, $new) or die "chmod $old: $!\n";
 	rename($new, $old) or die "rename $new => $old: $!\n";
@@ -58,6 +66,7 @@ if ($v == 2) {
 			if ($dn =~ /\A\d+\z/) {
 				push @parts, "$old/$dn";
 			} elsif ($dn eq '.' || $dn eq '..') {
+			} elsif ($dn =~ /\Aover\.sqlite3/) {
 			} else {
 				warn "W: skipping unknown Xapian DB: $old/$dn\n"
 			}
-- 
EW



Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
