user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 11/16] xt: add fsck script over over.sqlite3
Date: Sun, 19 Sep 2021 12:50:30 +0000
Message-ID: <20210919125035.6331-12-e@80x24.org>
In-Reply-To: <20210919125035.6331-1-e@80x24.org>

I'm not sure what caused it, but I've noticed two messages
missing after they failed in "lei up" on an https:// external.
I've also seen some duplicates in the past (which I think I
fixed...).
---
 MANIFEST          |  1 +
 xt/over-fsck.perl | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 xt/over-fsck.perl

diff --git a/MANIFEST b/MANIFEST
index 218e20e9..2df743f8 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -568,6 +568,7 @@ xt/msgtime_cmp.t
 xt/net_nntp_socks.t
 xt/net_writer-imap.t
 xt/nntpd-validate.t
+xt/over-fsck.perl
 xt/perf-msgview.t
 xt/perf-nntpd.t
 xt/perf-obfuscate.t
diff --git a/xt/over-fsck.perl b/xt/over-fsck.perl
new file mode 100644
index 00000000..053204fe
--- /dev/null
+++ b/xt/over-fsck.perl
@@ -0,0 +1,44 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# unstable dev script, chasing a bug which may be in LeiSavedSearch->is_dup
+use v5.12;
+use Data::Dumper;
+use PublicInbox::OverIdx;
+@ARGV == 1 or die "Usage: $0 /path/to/over.sqlite3\n";
+my $over = PublicInbox::OverIdx->new($ARGV[0]);
+my $dbh = $over->dbh;
+$dbh->do('PRAGMA mmap_size = '.(2 ** 48));
+my $num = 0;
+my ($err, $none, $nr, $ids);
+$Data::Dumper::Useqq = $Data::Dumper::Sortkeys = 1;
+do {
+	$ids = $over->ids_after(\$num);
+	$nr += @$ids;
+	for my $n (@$ids) {
+		my $smsg = $over->get_art($n);
+		if (!$smsg) {
+			warn "#$n article missing\n";
+			++$err;
+			next;
+		}
+		my $exp = $smsg->{blob};
+		if ($exp eq '') {
+			++$none if $smsg->{bytes};
+			next;
+		}
+		my $xr3 = $over->get_xref3($n, 1);
+		my $found;
+		for my $r (@$xr3) {
+			$r->[2] = unpack('H*', $r->[2]);
+			$found = 1 if $r->[2] eq $exp;
+		}
+		if (!$found) {
+			warn Dumper([$smsg, $xr3 ]);
+			++$err;
+		}
+	}
+} while (@$ids);
+warn "$none/$nr had no blob (external?)\n" if $none;
+warn "$err errors\n" if $err;
+exit($err ? 1 : 0);
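
For readers without a lei store at hand, the check above can be sketched
against an in-memory stand-in.  MockOver below is hypothetical and is NOT
the real PublicInbox::OverIdx: its three methods only imitate what the call
sites in the patch imply (ids_after advancing the number it is handed,
get_art returning a hash with {blob} and {bytes}, get_xref3 returning rows
whose third column is a binary object ID).  The batch size of 2 and all
sample data are made up for illustration.

```perl
#!perl -w
# Illustration only: MockOver is a hypothetical in-memory stand-in,
# not PublicInbox::OverIdx; its behavior is inferred from the patch.
use v5.12;

package MockOver;

sub new {
	my ($class, %o) = @_;
	bless { nums => $o{nums}, art => $o{art}, xref3 => $o{xref3} }, $class;
}

# Assumed from the call site: return a batch of article numbers greater
# than $$prev and advance $$prev to the last number returned.
sub ids_after {
	my ($self, $prev) = @_;
	my @ids = grep { $_ > $$prev } sort { $a <=> $b } @{$self->{nums}};
	$#ids = 1 if @ids > 2; # tiny batch size to exercise the paging loop
	$$prev = $ids[-1] if @ids;
	\@ids;
}

sub get_art { $_[0]->{art}->{$_[1]} }

# the second argument of the real call is ignored by this mock
sub get_xref3 { $_[0]->{xref3}->{$_[1]} // [] }

package main;

# Same walk as the patch, but collecting results instead of warning.
sub fsck {
	my ($over) = @_;
	my ($num, $nr, $none) = (0, 0, 0);
	my (@err, $ids);
	do {
		$ids = $over->ids_after(\$num);
		$nr += @$ids;
		for my $n (@$ids) {
			my $smsg = $over->get_art($n);
			if (!$smsg) {
				push @err, "#$n article missing";
				next;
			}
			my $exp = $smsg->{blob} // '';
			if ($exp eq '') { # nothing to compare against
				++$none if $smsg->{bytes};
				next;
			}
			# the hex blob must match one binary OID among the rows
			my $xr3 = $over->get_xref3($n, 1);
			my $found = grep { unpack('H*', $_->[2]) eq $exp } @$xr3;
			push @err, "#$n blob $exp not in xref3" if !$found;
		}
	} while (@$ids);
	{ nr => $nr, none => $none, err => \@err };
}

my $oid = 'a' x 40; # placeholder hex object ID
my $over = MockOver->new(
	nums => [ 1, 2, 3, 4, 5 ],
	art => {
		1 => { blob => $oid, bytes => 10 },     # consistent
		2 => { blob => 'b' x 40, bytes => 10 }, # no xref3 rows at all
		# 3 is listed in nums but has no article
		4 => { blob => '', bytes => 10 },       # counted as "no blob"
		5 => { blob => $oid, bytes => 5 },      # xref3 OID differs
	},
	xref3 => {
		1 => [ [ 'ibx', 1, pack('H*', $oid) ] ],
		5 => [ [ 'ibx', 5, pack('H*', 'c' x 40) ] ],
	},
);
my $res = fsck($over);
say "checked $res->{nr}, $res->{none} had no blob";
say for @{$res->{err}};
```

The sketch returns its findings rather than exiting, so it can be called
from a test; the script in the patch instead warns as it goes and exits
non-zero if any error was counted.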

Thread overview: 19+ messages
2021-09-19 12:50 [PATCH 00/16] lei IPC overhaul, NNTP fixes Eric Wong
2021-09-19 12:50 ` [PATCH 01/16] ipc: wq_do: support synchronous waits and responses Eric Wong
2021-09-19 12:50 ` [PATCH 02/16] ipc: allow disabling broadcast for wq_workers Eric Wong
2021-09-19 12:50 ` [PATCH 03/16] lei/store: use SOCK_SEQPACKET rather than pipe Eric Wong
2021-09-19 12:50 ` [PATCH 04/16] lei: simplify sto_done_request Eric Wong
2021-09-19 12:50 ` [PATCH 05/16] lei_xsearch: drop Data::Dumper use Eric Wong
2021-09-19 12:50 ` [PATCH 06/16] ipc: drop dynamic WQ process counts Eric Wong
2021-09-19 12:50 ` [PATCH 07/16] lei: clamp internal worker processes to 4 Eric Wong
2021-09-19 12:50 ` [PATCH 08/16] lei ls-mail-source: use "high"/"low" for NNTP Eric Wong
2021-09-19 12:50 ` [PATCH 09/16] lei ls-mail-source: pretty JSON support Eric Wong
2021-09-19 12:50 ` [PATCH 10/16] net_reader: fix single NNTP article fetch, test ranges Eric Wong
2021-09-19 12:50 ` Eric Wong [this message]
2021-09-19 12:50 ` [PATCH 12/16] watch: use net_reader->mic_new wrapper for SOCKS+TLS Eric Wong
2021-09-19 12:50 ` [PATCH 13/16] net_reader: no STARTTLS for IMAP localhost or onions Eric Wong
2021-09-19 12:50 ` [PATCH 14/16] lei config --edit: use controlling terminal Eric Wong
2021-09-19 12:50 ` [PATCH 15/16] net_reader: disallow imap.fetchBatchSize=0 Eric Wong
2021-09-19 12:50 ` [PATCH 16/16] doc: lei-config: document various knobs Eric Wong
2021-09-19 16:14   ` Kyle Meyer
2021-09-19 20:00     ` Eric Wong
