From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id DF46D1FA19 for ; Sun, 10 Oct 2021 14:25:19 +0000 (UTC) From: Eric Wong To: meta@public-inbox.org Subject: [PATCH 6/8] extindex: --gc doesn't touch ghost entries Date: Sun, 10 Oct 2021 14:25:16 +0000 Message-Id: <20211010142518.7012-7-e@80x24.org> In-Reply-To: <20211010142518.7012-1-e@80x24.org> References: <20211010142518.7012-1-e@80x24.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit List-Id: We were deleting ghost entries, this was usually harmless since other messages could fill-in-the-blanks, but could cause misthreading in odd cases where a big chunk of a thread is missing and the latest messages only referenced ghosts. We'll also save some cycles when scanning Xapian shards since docids won't be <= 0. --- lib/PublicInbox/ExtSearchIdx.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm index 42488e12..acf35e3d 100644 --- a/lib/PublicInbox/ExtSearchIdx.pm +++ b/lib/PublicInbox/ExtSearchIdx.pm @@ -425,13 +425,13 @@ DELETE FROM xref3 WHERE docid NOT IN (SELECT num FROM over) # fixup from old bugs: $nr = $self->{oidx}->dbh->do(<<''); -DELETE FROM over WHERE num NOT IN (SELECT docid FROM xref3) +DELETE FROM over WHERE num > 0 AND num NOT IN (SELECT docid FROM xref3) warn "I: eliminated $nr stale over entries\n" if $nr != 0; reindex_checkpoint($self, $sync) if checkpoint_due($sync); my ($cur) = $self->{oidx}->dbh->selectrow_array(< 0 EOM $cur // return; # empty my ($r, $n, %active);