From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id C6C821FF40
	for ; Tue, 21 Jun 2016 04:29:00 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 2/2] searchidx: merge old thread id from ghosts
Date: Tue, 21 Jun 2016 04:28:59 +0000
Message-Id: <20160621042859.16146-3-e@80x24.org>
In-Reply-To: <20160621042859.16146-1-e@80x24.org>
References: <20160621042859.16146-1-e@80x24.org>
List-Id:

We failed to discard old thread IDs when vivifying ghosts
due to out-of-order message arrival.  This rectifies the failure and
will trigger a re-index.
---
 lib/PublicInbox/Search.pm    | 3 ++-
 lib/PublicInbox/SearchIdx.pm | 5 +++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 8c0bab1..bf50365 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -36,7 +36,8 @@ use constant {
 	# 8 - remove redundant/unneeded document data
 	# 9 - disable Message-ID compression (SHA-1)
 	# 10 - optimize doc for NNTP overviews
-	SCHEMA_VERSION => 10,
+	# 11 - merge threads when vivifying ghosts
+	SCHEMA_VERSION => 11,
 
 	# n.b. FLAG_PURE_NOT is expensive not suitable for a public website
 	# as it could become a denial-of-service vector
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 3134687..58eccc1 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -150,7 +150,7 @@ sub add_message {
 
 		if ($was_ghost) {
 			$doc_id = $smsg->doc_id;
-			$self->link_message($smsg);
+			$self->link_message($smsg, $smsg->thread_id);
 			$doc->set_data($smsg->to_doc_data);
 			$db->replace_document($doc_id, $doc);
 		} else {
@@ -211,7 +211,7 @@ sub next_thread_id {
 }
 
 sub link_message {
-	my ($self, $smsg) = @_;
+	my ($self, $smsg, $old_tid) = @_;
 	my $doc = $smsg->{doc};
 	my $mid = $smsg->mid;
 	my $mime = $smsg->mime;
@@ -247,6 +247,7 @@ sub link_message {
 		# but we can never trust clients to do the right thing
 		my $ref = shift @refs;
 		$tid = $self->_resolve_mid_to_tid($ref);
+		$self->merge_threads($tid, $old_tid) if defined $old_tid;
 
 		# the rest of the refs should point to this tid:
 		foreach $ref (@refs) {
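
For readers following the hunks above, here is a minimal sketch of why the old thread ID matters when a ghost is vivified. It is a toy in-memory model and not the Xapian-backed code in PublicInbox::SearchIdx: the %tid_of and %ghost hashes, the example Message-IDs, and the bodies of resolve_mid_to_tid, merge_threads and add_message are all assumptions made for illustration. Only the order of operations (resolve the first reference, then merge the ghost's old thread into it) is taken from the patch.

```perl
use strict;
use warnings;

# Toy in-memory index (hypothetical); NOT PublicInbox::SearchIdx itself.
my %tid_of;        # Message-ID => thread id
my %ghost;         # Message-IDs seen only in References, not yet delivered
my $next_tid = 0;

# Look up the thread of $mid, creating a ghost in a new thread if unseen.
sub resolve_mid_to_tid {
	my ($mid) = @_;
	unless (exists $tid_of{$mid}) {
		$tid_of{$mid} = ++$next_tid;
		$ghost{$mid} = 1;
	}
	return $tid_of{$mid};
}

# Move every message of thread $loser into thread $winner.
sub merge_threads {
	my ($winner, $loser) = @_;
	return if $winner == $loser;
	for my $mid (keys %tid_of) {
		$tid_of{$mid} = $winner if $tid_of{$mid} == $loser;
	}
}

# Index a message; @refs is its References list, oldest first.
sub add_message {
	my ($mid, @refs) = @_;
	# a vivified ghost already owns a thread id that replies may share
	my $old_tid = delete($ghost{$mid}) ? $tid_of{$mid} : undef;
	my $tid;
	if (@refs) {
		$tid = resolve_mid_to_tid(shift @refs);
		# the step this patch adds: fold the ghost's old thread in
		merge_threads($tid, $old_tid) if defined $old_tid;
		# assumed handling of the remaining refs: same thread
		merge_threads($tid, resolve_mid_to_tid($_)) for @refs;
	} else {
		$tid = defined $old_tid ? $old_tid : ++$next_tid;
	}
	$tid_of{$mid} = $tid;
	return $tid;
}

# Out-of-order arrival: the reply shows up before its parent.
add_message('c@example', 'b@example');  # b becomes a ghost sharing c's thread
add_message('b@example', 'a@example');  # b arrives and points at unseen root a
print "c and b share a thread\n"
	if $tid_of{'c@example'} == $tid_of{'b@example'};
```

If the merge_threads($tid, $old_tid) call is dropped from this sketch, b@example moves to the thread of a@example while c@example stays behind in the ghost's old thread, which is the split the commit message describes.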