user/dev discussion of public-inbox itself
From: Eric Wong <>
Subject: [PATCH 6/8] extindex: --gc doesn't touch ghost entries
Date: Sun, 10 Oct 2021 14:25:16 +0000
Message-ID: <>
In-Reply-To: <>

We were deleting ghost entries.  This was usually harmless since
other messages could fill in the blanks, but it could cause
misthreading in odd cases where a big chunk of a thread is
missing and the latest messages only referenced ghosts.

We'll also save some cycles when scanning Xapian shards since
docids won't be <= 0.
---
 lib/PublicInbox/ | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/ b/lib/PublicInbox/
index 42488e12..acf35e3d 100644
--- a/lib/PublicInbox/
+++ b/lib/PublicInbox/
@@ -425,13 +425,13 @@ DELETE FROM xref3 WHERE docid NOT IN (SELECT num FROM over)
 	# fixup from old bugs:
 	$nr = $self->{oidx}->dbh->do(<<'');
-DELETE FROM over WHERE num NOT IN (SELECT docid FROM xref3)
+DELETE FROM over WHERE num > 0 AND num NOT IN (SELECT docid FROM xref3)
 
 	warn "I: eliminated $nr stale over entries\n" if $nr != 0;
 	reindex_checkpoint($self, $sync) if checkpoint_due($sync);
 	my ($cur) = $self->{oidx}->dbh->selectrow_array(<<EOM);
-SELECT MIN(num) FROM over
+SELECT MIN(num) FROM over WHERE num > 0
 EOM
 	$cur // return; # empty
 	my ($r, $n, %active);
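
For illustration only, the effect of the `num > 0' guard in the hunk
above can be sketched outside of the real code.  This is a minimal
sketch in Python's sqlite3 rather than the Perl/DBI used by the
patch; the two-table schema is a hypothetical simplification (the
real tables have more columns), and ghosts are modeled as rows with
num <= 0, which is what the guard implies.  The function names are
made up for this example.

```python
import sqlite3

def gc_stale_over(dbh):
    """Delete stale over rows, leaving ghosts (num <= 0) alone."""
    cur = dbh.execute(
        "DELETE FROM over WHERE num > 0 "
        "AND num NOT IN (SELECT docid FROM xref3)")
    return cur.rowcount  # number of rows eliminated

def first_docid(dbh):
    """Smallest positive num; None if there are no such rows."""
    (num,) = dbh.execute(
        "SELECT MIN(num) FROM over WHERE num > 0").fetchone()
    return num

def demo():
    dbh = sqlite3.connect(":memory:")
    # hypothetical, simplified schema (assumption, not the real one)
    dbh.execute("CREATE TABLE over (num INTEGER PRIMARY KEY)")
    dbh.execute("CREATE TABLE xref3 (docid INTEGER)")
    # -2 and -1 stand in for ghosts; 2 is stale (no xref3 row)
    dbh.executemany("INSERT INTO over VALUES (?)",
                    [(-2,), (-1,), (1,), (2,), (3,)])
    dbh.executemany("INSERT INTO xref3 VALUES (?)", [(1,), (3,)])
    removed = gc_stale_over(dbh)
    remaining = [r[0] for r in
                 dbh.execute("SELECT num FROM over ORDER BY num")]
    return removed, remaining, first_docid(dbh)

if __name__ == "__main__":
    print(demo())  # (1, [-2, -1, 1, 3], 1)
```

Without the guard, the same DELETE would also remove -2 and -1,
since ghosts never have a matching xref3 row.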

Thread overview: 9+ messages
2021-10-10 14:25 [PATCH 0/8] extindex and then some Eric Wong
2021-10-10 14:25 ` [PATCH 1/8] lei_to_mail: show --output on augment progress failure Eric Wong
2021-10-10 14:25 ` [PATCH 2/8] admin: add '# ' prefix for progress messages Eric Wong
2021-10-10 14:25 ` [PATCH 3/8] set nodatacow on more SQLite files Eric Wong
2021-10-10 14:25 ` [PATCH 4/8] extindex: speed up Xapian cleanup in --gc Eric Wong
2021-10-10 14:25 ` [PATCH 5/8] extindex: minor cost reductions Eric Wong
2021-10-10 14:25 ` [PATCH 6/8] extindex: --gc doesn't touch ghost entries Eric Wong [this message]
2021-10-10 14:25 ` [PATCH 7/8] lei/store: keep ".err-XXXX" in stderr tmpfile Eric Wong
2021-10-10 14:25 ` [PATCH 8/8] extindex: sync each inbox before checking for missed messages Eric Wong