* [PATCH 6/8] extindex: --gc doesn't touch ghost entries
2021-10-10 14:25 6% [PATCH 0/8] extindex and then some Eric Wong
@ 2021-10-10 14:25 7% ` Eric Wong
From: Eric Wong @ 2021-10-10 14:25 UTC (permalink / raw)
To: meta
We were deleting ghost entries.  This was usually harmless since
other messages could fill in the blanks, but it could cause
misthreading in odd cases where a big chunk of a thread is
missing and the latest messages only reference ghosts.
We'll also save some cycles when scanning Xapian shards since
docids won't be <= 0.
---
lib/PublicInbox/ExtSearchIdx.pm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 42488e12..acf35e3d 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -425,13 +425,13 @@ DELETE FROM xref3 WHERE docid NOT IN (SELECT num FROM over)
# fixup from old bugs:
$nr = $self->{oidx}->dbh->do(<<'');
-DELETE FROM over WHERE num NOT IN (SELECT docid FROM xref3)
+DELETE FROM over WHERE num > 0 AND num NOT IN (SELECT docid FROM xref3)
warn "I: eliminated $nr stale over entries\n" if $nr != 0;
reindex_checkpoint($self, $sync) if checkpoint_due($sync);
my ($cur) = $self->{oidx}->dbh->selectrow_array(<<EOM);
-SELECT MIN(num) FROM over
+SELECT MIN(num) FROM over WHERE num > 0
EOM
$cur // return; # empty
my ($r, $n, %active);
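
The effect of the added `num > 0` guard can be checked in isolation.
What follows is a minimal sketch, not the project's Perl code: it uses
Python's stdlib sqlite3 module and a hypothetical one-column version of
the `over` and `xref3` tables (the real schema has more columns), and it
assumes, as the patch implies, that ghost entries are rows with num <= 0.
Only the two DELETE statements and the MIN(num) query are taken from the
diff.

```python
import sqlite3

# Statements copied from the diff above (old and new forms).
OLD_GC = "DELETE FROM over WHERE num NOT IN (SELECT docid FROM xref3)"
NEW_GC = ("DELETE FROM over WHERE num > 0 "
          "AND num NOT IN (SELECT docid FROM xref3)")
MIN_NUM = "SELECT MIN(num) FROM over WHERE num > 0"


def make_db():
    """Build a toy database; this reduced schema is an assumption."""
    dbh = sqlite3.connect(":memory:")
    dbh.execute("CREATE TABLE over (num INTEGER PRIMARY KEY)")
    dbh.execute("CREATE TABLE xref3 (docid INTEGER NOT NULL)")
    # -2 and -1 stand in for ghosts; 3 is a stale entry with no xref3 row.
    dbh.executemany("INSERT INTO over (num) VALUES (?)",
                    [(-2,), (-1,), (1,), (2,), (3,)])
    dbh.executemany("INSERT INTO xref3 (docid) VALUES (?)", [(1,), (2,)])
    return dbh


def run_gc(sql):
    """Run one DELETE variant on a fresh toy database.

    Returns (rows deleted, remaining nums, lowest positive num).
    """
    dbh = make_db()
    nr = dbh.execute(sql).rowcount
    remaining = [row[0] for row in
                 dbh.execute("SELECT num FROM over ORDER BY num")]
    (cur,) = dbh.execute(MIN_NUM).fetchone()
    dbh.close()
    return nr, remaining, cur


old = run_gc(OLD_GC)  # ghosts are deleted along with the stale entry
new = run_gc(NEW_GC)  # only the stale entry is deleted
print("old:", old)
print("new:", new)
```

In this toy data the old statement removes the two ghost rows together
with the stale one, while the new statement leaves the ghosts in place
and the scan still starts from the lowest positive num.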
* [PATCH 0/8] extindex and then some...
@ 2021-10-10 14:25 6% Eric Wong
2021-10-10 14:25 7% ` [PATCH 6/8] extindex: --gc doesn't touch ghost entries Eric Wong
From: Eric Wong @ 2021-10-10 14:25 UTC (permalink / raw)
To: meta
One notable fix for -extindex --gc, plus a couple of minor things
here and there.  Still need to speed up --reindex...
Eric Wong (8):
lei_to_mail: show --output on augment progress failure
admin: add '# ' prefix for progress messages
set nodatacow on more SQLite files
extindex: speed up Xapian cleanup in --gc
extindex: minor cost reductions
extindex: --gc doesn't touch ghost entries
lei/store: keep ".err-XXXX" in stderr tmpfile
extindex: sync each inbox before checking for missed messages
lib/PublicInbox/Admin.pm | 2 +-
lib/PublicInbox/ExtSearchIdx.pm | 51 +++++++++++++++++++++------------
lib/PublicInbox/LeiStore.pm | 2 +-
lib/PublicInbox/LeiToMail.pm | 2 +-
lib/PublicInbox/Over.pm | 4 ++-
lib/PublicInbox/SearchIdx.pm | 3 ++
lib/PublicInbox/SharedKV.pm | 3 +-
7 files changed, 43 insertions(+), 24 deletions(-)
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git