user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* [PATCH 06/13] t/v[12]reindex.t: Test incremental indexing works
  2018-08-01 16:41  7% [PATCH 00/13]: Incremental index fixes Eric W. Biederman
@ 2018-08-01 16:43  3% ` Eric W. Biederman
  0 siblings, 0 replies; 2+ results
From: Eric W. Biederman @ 2018-08-01 16:43 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta, Eric W. Biederman

Capture interesting commits of the test repository in mark variables.

Use those marks to build interesting scenarios where index_sync proceeds
as if those marks are the heads of the repositor.  Use this capability to
test what happens when adds and deletes are mixed within a repository.

Be sad because things don't yet work as they should.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 t/v1reindex.t | 194 +++++++++++++++++++++++++++++++++++++++++++++++++
 t/v2reindex.t | 195 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 389 insertions(+)

diff --git a/t/v1reindex.t b/t/v1reindex.t
index 877e70390910..8e78aa761333 100644
--- a/t/v1reindex.t
+++ b/t/v1reindex.t
@@ -33,6 +33,7 @@ my $mime = PublicInbox::MIME->create(
 );
 my $minmax;
 my $msgmap;
+my ($mark1, $mark2, $mark3, $mark4);
 {
 	my %config = %$ibx_config;
 	my $ibx = PublicInbox::Inbox->new(\%config);
@@ -41,13 +42,17 @@ my $msgmap;
 		$mime->header_set('Message-Id', "<$i\@example.com>");
 		ok($im->add($mime), "message $i added");
 		if ($i == 4) {
+			$mark1 = $im->get_mark($im->{tip});
 			$im->remove($mime);
+			$mark2 = $im->get_mark($im->{tip});
 		}
 	}
 
 	if ('test remove later') {
+		$mark3 = $im->get_mark($im->{tip});
 		$mime->header_set('Message-Id', "<5\@example.com>");
 		$im->remove($mime);
+		$mark4 = $im->get_mark($im->{tip});
 	}
 
 	$im->done;
@@ -222,4 +227,193 @@ ok(!-d $xap, 'Xapian directories removed again');
 	is_deeply($ibx->mm->msg_range(\$min, $max), $msgmap, 'msgmap unchanged');
 }
 
+# An incremental indexing test
+ok(unlink "$mainrepo/public-inbox/msgmap.sqlite3", 'remove msgmap');
+remove_tree($xap);
+ok(!-d $xap, 'Xapian directories removed again');
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+	my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+	# mark1 4 simple additions in the same index_sync
+	eval { $rw->index_sync({ref => $mark1}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 4, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [4, '4@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+	my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+	# mark2 A delete separated form and add in the same index_sync
+	eval { $rw->index_sync({ref => $mark2}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 3, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+	my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+	# mark3 adds following the delete at mark2
+	eval { $rw->index_sync({ref => $mark3}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [5, '5@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+	my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+	# mark4 A delete of an older message
+	eval { $rw->index_sync({ref => $mark4}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+
+
+# Another incremental indexing test
+ok(unlink "$mainrepo/public-inbox/msgmap.sqlite3", 'remove msgmap');
+remove_tree($xap);
+ok(!-d $xap, 'Xapian directories removed again');
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+	my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+	# mark2 an add and it's delete in the same index_sync
+	eval { $rw->index_sync({ref => $mark2}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 3, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+	my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+	# mark3 adds following the delete at mark2
+	eval { $rw->index_sync({ref => $mark3}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [5, '5@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	my $im = PublicInbox::Import->new($ibx->git, undef, undef, $ibx);
+	my $rw = PublicInbox::SearchIdx->new($ibx, 1);
+	# mark4 A delete of an older message
+	eval { $rw->index_sync({ref => $mark4}) };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+
+
 done_testing();
diff --git a/t/v2reindex.t b/t/v2reindex.t
index 31e61d5b2578..ce13a07c39c0 100644
--- a/t/v2reindex.t
+++ b/t/v2reindex.t
@@ -33,21 +33,27 @@ my $mime = PublicInbox::MIME->create(
 local $ENV{NPROC} = 2;
 my $minmax;
 my $msgmap;
+my ($mark1, $mark2, $mark3, $mark4);
 {
 	my %config = %$ibx_config;
 	my $ibx = PublicInbox::Inbox->new(\%config);
 	my $im = PublicInbox::V2Writable->new($ibx, 1);
+	my $im0 = $im->importer();
 	foreach my $i (1..10) {
 		$mime->header_set('Message-Id', "<$i\@example.com>");
 		ok($im->add($mime), "message $i added");
 		if ($i == 4) {
+			$mark1 = $im0->get_mark($im0->{tip});
 			$im->remove($mime);
+			$mark2 = $im0->get_mark($im0->{tip});
 		}
 	}
 
 	if ('test remove later') {
+		$mark3 = $im0->get_mark($im0->{tip});
 		$mime->header_set('Message-Id', "<5\@example.com>");
 		$im->remove($mime);
+		$mark4 = $im0->get_mark($im0->{tip});
 	}
 
 	$im->done;
@@ -211,4 +217,193 @@ ok(!-d $xap, 'Xapian directories removed again');
 	is_deeply($ibx->mm->msg_range(\$min, $max), $msgmap, 'msgmap unchanged');
 }
 
+
+# An incremental indexing test
+ok(unlink "$mainrepo/msgmap.sqlite3", 'remove msgmap');
+remove_tree($xap);
+ok(!-d $xap, 'Xapian directories removed again');
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	# mark1 4 simple additions in the same index_sync
+	$ibx->{ref_head} = $mark1;
+	my $im = PublicInbox::V2Writable->new($ibx);
+	eval { $im->index_sync() };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 4, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [4, '4@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	# mark2 A delete separated from an add in the same index_sync
+	$ibx->{ref_head} = $mark2;
+	my $im = PublicInbox::V2Writable->new($ibx);
+	eval { $im->index_sync() };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 3, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	# mark3 adds following the delete at mark2
+	$ibx->{ref_head} = $mark3;
+	my $im = PublicInbox::V2Writable->new($ibx);
+	eval { $im->index_sync() };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [5, '5@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	# mark4 A delete of an older message
+	$ibx->{ref_head} = $mark4;
+	my $im = PublicInbox::V2Writable->new($ibx);
+	eval { $im->index_sync() };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+
+
+# Another incremental indexing test
+ok(unlink "$mainrepo/msgmap.sqlite3", 'remove msgmap');
+remove_tree($xap);
+ok(!-d $xap, 'Xapian directories removed again');
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	# mark2 an add and it's delete in the same index_sync
+	$ibx->{ref_head} = $mark2;
+	my $im = PublicInbox::V2Writable->new($ibx);
+	eval { $im->index_sync() };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 3, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	# mark3 adds following the delete at mark2
+	$ibx->{ref_head} = $mark3;
+	my $im = PublicInbox::V2Writable->new($ibx);
+	eval { $im->index_sync() };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [5, '5@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+{
+	my @warn;
+	local $SIG{__WARN__} = sub { push @warn, @_ };
+	my %config = %$ibx_config;
+	my $ibx = PublicInbox::Inbox->new(\%config);
+	# mark4 A delete of an older message
+	$ibx->{ref_head} = $mark4;
+	my $im = PublicInbox::V2Writable->new($ibx);
+	eval { $im->index_sync() };
+	is($@, '', 'no error from reindexing without msgmap');
+	is_deeply(\@warn, [], 'no warnings');
+	$im->done;
+	my ($min, $max) = $ibx->mm->minmax;
+	is($min, 1, 'min as expected');
+	is($max, 10, 'max as expected');
+	is_deeply($ibx->mm->msg_range(\$min, $max),
+		  [
+		   [1, '1@example.com' ],
+		   [2, '2@example.com' ],
+		   [3, '3@example.com' ],
+		   [6, '6@example.com' ],
+		   [7, '7@example.com' ],
+		   [8, '8@example.com' ],
+		   [9, '9@example.com' ],
+		   [10, '10@example.com' ],
+		  ], 'msgmap as expected' );
+}
+
 done_testing();
-- 
2.17.1


^ permalink raw reply related	[relevance 3%]

* [PATCH 00/13]: Incremental index fixes
@ 2018-08-01 16:41  7% Eric W. Biederman
  2018-08-01 16:43  3% ` [PATCH 06/13] t/v[12]reindex.t: Test incremental indexing works Eric W. Biederman
  0 siblings, 1 reply; 2+ results
From: Eric W. Biederman @ 2018-08-01 16:41 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta


Eric,

Recently I noticed some weird indexing on one of my mailboxes.  So I dug
in and made the index_sync tests more robust and found several corner
cases where the current code can be problematic.  What follows are the
test cases and fixes for those cases.

The big one is when a message with the highest message number is
deleted.  This can cause message numbers to be reassigned so I
explicitly track the highest message number assigned.

Very confusing when reading mail there is a case in SearchIdx
that has been assigning message numbers in the wrong order.  Depending
on your time range that can be very confusing as old messages show up
before newer ones.

Finally in v2 deleted messages have not been being deleted from the
msgmap.  Which while great for keeping message numbers from going
backwards it means things still show up that shouldn't.

Eric W. Biederman (13):
      Import.pm: Don't assume {in} and {out} always exist
      t/v1reindex.t: Isolate the test cases
      t/v2reindex.t: Isolate the test cases more
      t/v[12]reindex.t: Place expected second in Xapian tests
      t/v[12]reindex.t: Test that the resulting msgmap is as expected
      t/v[12]reindex.t: Test incremental indexing works
      SearchIdx.pm: Always assign numbers backwards during incremental indexing
      Msgmap.pm: Track the largest value of num ever assigned
      t/v[12]reindex.t Verify num_highwater
      t/v[12]reindex.t: Verify the num highwater is as expected
      SearchIdx,V2Writeable: Update num_highwater on optimized deletes
      V2Writeable.pm: Ensure that a found message number is in the msgmap
      V2Writeable.pm: In unindex_oid delete the message from msgmap

 lib/PublicInbox/Import.pm     |   1 +
 lib/PublicInbox/Msgmap.pm     |  23 ++-
 lib/PublicInbox/SearchIdx.pm  |  30 ++--
 lib/PublicInbox/V2Writable.pm |  20 ++-
 t/v1reindex.t                 | 359 ++++++++++++++++++++++++++++++++++++------
 t/v2reindex.t                 | 335 +++++++++++++++++++++++++++++++++++----
 6 files changed, 666 insertions(+), 102 deletions(-)


Eric



^ permalink raw reply	[relevance 7%]

Results 1-2 of 2 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2018-08-01 16:41  7% [PATCH 00/13]: Incremental index fixes Eric W. Biederman
2018-08-01 16:43  3% ` [PATCH 06/13] t/v[12]reindex.t: Test incremental indexing works Eric W. Biederman

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).