From: ebiederm@xmission.com (Eric W. Biederman)
To: Eric Wong <e@80x24.org>
Cc: <meta@public-inbox.org>
Subject: [PATCH] SearchIdx: Decrement regen_down even for added messages that are later deleted.
Date: Tue, 17 Jul 2018 17:06:17 -0500
Message-ID: <87wottjz52.fsf@xmission.com>


Decrement regen_down when visiting added messages that appear in %D,
that is, messages we know will later be deleted.  This ensures that
consistent message numbers are generated no matter which commit is on
top, which allows deletes to propagate separately from the messages
they delete without causing problems.

The v2 trees already do this, and when their indexes are deleted and
rebuilt they maintain their commit numbers.

Add a v1 version of the v2reindex test to verify that reindexing is
working properly on v1 as well as v2.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
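
The numbering argument in the commit message can be illustrated with a
small standalone model.  This is only a sketch under stated assumptions,
not the SearchIdx code: assign_numbers(), show(), the [op, blob] log
format and the $decrement_deleted flag are hypothetical names invented
for the illustration, and the counter is assumed to start at the number
of adds in the log.

```perl
use strict;
use warnings;

# Hypothetical model: walk a log newest-first and hand out numbers from
# a counter that counts down.  $log is an array ref of [op, blob] pairs
# in oldest-first order; op is 'add' or 'del'.
sub assign_numbers {
	my ($log, $decrement_deleted) = @_;
	# assumption: the counter starts at the number of adds
	my $down = grep { $_->[0] eq 'add' } @$log;
	my (%D, %num);
	foreach my $ent (reverse @$log) {
		my ($op, $blob) = @$ent;
		if ($op eq 'del') {
			$D{$blob} = 1; # remember blobs deleted later
		} elsif (delete $D{$blob}) {
			# an add that is later deleted: skip the message,
			# but optionally still consume its number
			$down-- if $decrement_deleted;
		} else {
			$num{$blob} = $down--;
		}
	}
	return \%num;
}

sub show {
	my ($label, $num) = @_;
	print "$label: ",
		join(' ', map { "$_=$num->{$_}" } sort keys %$num), "\n";
}

my @adds = (['add', 'a'], ['add', 'b'], ['add', 'c']);
my $before = assign_numbers(\@adds, 1);
my $after = assign_numbers([@adds, ['del', 'b']], 1);
my $old = assign_numbers([@adds, ['del', 'b']], 0);

show('no delete yet', $before);              # a=1 b=2 c=3
show('delete on top, with decrement', $after); # a=1 c=3
show('delete on top, no decrement', $old);     # a=2 c=3
```

In this model, skipping the decrement shifts "a" from 1 to 2 once the
delete of "b" is on top, while decrementing keeps "a" and "c" at the
same numbers in both cases.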
 lib/PublicInbox/SearchIdx.pm |   7 ++-
 t/v1reindex.t                | 109 +++++++++++++++++++++++++++++++++++
 2 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 t/v1reindex.t

diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 107cd3457133..0e0796c12c12 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -561,7 +561,12 @@ sub read_log {
 	while (defined($line = <$log>)) {
 		if ($line =~ /$addmsg/o) {
 			my $blob = $1;
-			delete $D{$blob} and next;
+			if (delete $D{$blob}) {
+				if (defined $self->{regen_down}) {
+					$self->{regen_down}--;
+				}
+				next;
+			}
 			my $mime = do_cat_mail($git, $blob, \$bytes) or next;
 			batch_adjust(\$max, $bytes, $batch_cb, $latest);
 			$add_cb->($self, $mime, $bytes, $blob);
diff --git a/t/v1reindex.t b/t/v1reindex.t
new file mode 100644
index 000000000000..7b8d883753ee
--- /dev/null
+++ b/t/v1reindex.t
@@ -0,0 +1,109 @@
+# Copyright (C) 2018 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use warnings;
+use Test::More;
+use PublicInbox::MIME;
+use PublicInbox::ContentId qw(content_digest);
+use File::Temp qw/tempdir/;
+use File::Path qw(remove_tree);
+
+foreach my $mod (qw(DBD::SQLite Search::Xapian)) {
+	eval "require $mod";
+	plan skip_all => "$mod missing for v1reindex.t" if $@;
+}
+use_ok 'PublicInbox::SearchIdx';
+use_ok 'PublicInbox::Import';
+my $mainrepo = tempdir('pi-v1reindex-XXXXXX', TMPDIR => 1, CLEANUP => 1);
+is(system(qw(git init --bare), $mainrepo), 0);
+my $ibx_config = {
+	mainrepo => $mainrepo,
+	name => 'test-v1reindex',
+	-primary_address => 'test@example.com',
+};
+my $ibx = PublicInbox::Inbox->new($ibx_config);
+my $mime = PublicInbox::MIME->create(
+	header => [
+		From => 'a@example.com',
+		To => 'test@example.com',
+		Subject => 'this is a subject',
+		Date => 'Fri, 02 Oct 1993 00:00:00 +0000',
+	],
+	body => "hello world\n",
+);
+my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+foreach my $i (1..10) {
+	$mime->header_set('Message-Id', "<$i\@example.com>");
+	ok($im->add($mime), "message $i added");
+	if ($i == 4) {
+		$im->remove($mime);
+	}
+}
+
+if ('test remove later') {
+	$mime->header_set('Message-Id', "<5\@example.com>");
+	$im->remove($mime);
+}
+
+$im->done;
+my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+eval { $rw->index_sync() };
+is($@, '', 'no error from indexing');
+
+my $minmax = [ $ibx->mm->minmax ];
+ok(defined $minmax->[0] && defined $minmax->[1], 'minmax defined');
+is_deeply($minmax, [ 1, 10 ], 'minmax as expected');
+
+$rw = PublicInbox::SearchIdx->new($ibx, 1);
+eval { $rw->index_sync({reindex => 1}) };
+is($@, '', 'no error from reindexing');
+$im->done;
+
+my $xap = "$mainrepo/public-inbox/xapian".PublicInbox::Search::SCHEMA_VERSION();
+remove_tree($xap);
+ok(!-d $xap, 'Xapian directories removed');
+$rw = PublicInbox::SearchIdx->new($ibx, 1);
+
+eval { $rw->index_sync({reindex => 1}) };
+is($@, '', 'no error from reindexing');
+$im->done;
+ok(-d $xap, 'Xapian directories recreated');
+
+delete $ibx->{mm};
+is_deeply([ $ibx->mm->minmax ], $minmax, 'minmax unchanged');
+
+ok(unlink "$mainrepo/public-inbox/msgmap.sqlite3", 'remove msgmap');
+remove_tree($xap);
+$rw = PublicInbox::SearchIdx->new($ibx, 1);
+
+ok(!-d $xap, 'Xapian directories removed again');
+{
+	my @warn;
+	#local $SIG{__WARN__} = sub { push @warn, @_ };
+	eval { $rw->index_sync({reindex => 1}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is(scalar(@warn), 0, 'no warnings from reindexing');
+	$im->done;
+	ok(-d $xap, 'Xapian directories recreated');
+	delete $ibx->{mm};
+	is_deeply([ $ibx->mm->minmax ], $minmax, 'minmax unchanged');
+}
+
+ok(unlink "$mainrepo/public-inbox/msgmap.sqlite3", 'remove msgmap');
+remove_tree($xap);
+$rw = PublicInbox::SearchIdx->new($ibx, 1);
+
+ok(!-d $xap, 'Xapian directories removed again');
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	eval { $rw->index_sync({reindex => 1}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	ok(-d $xap, 'Xapian directories recreated');
+	delete $ibx->{mm};
+	is_deeply([ $ibx->mm->minmax ], $minmax, 'minmax unchanged');
+}
+
+done_testing();
-- 
2.17.1