user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* stable 1.6.1 release? [was: [PATCH] eml: fix undefined vars on <Perl 5.28]
  @ 2020-12-26 20:35  8%     ` Eric Wong
  0 siblings, 0 replies; 4+ results
From: Eric Wong @ 2020-12-26 20:35 UTC (permalink / raw)
  To: Ali Alnubani, Konstantin Ryabitsev; +Cc: meta

Ali Alnubani <alialnu@nvidia.com> wrote:
> I no longer see the uninitialized value warnings or the test
> failure on Debian 9 with both patches applied on master. Do
> you plan on creating a new release tag soon with these fixes?

Thanks, pushed to master and I've started a stable-1.6 branch
to https://80x24.org/public-inbox.git with 29 commits
cherry-picked from master which I hope are suitable.

There's two more I'm tempted to cherry-pick, though they
introduce behavior changes:

  fe01d7b117c8b1e1 import: drop X-Status in addition to Status
  1bf653ad139bf7bb nntp+www: drop List-* and Archived-At headers

Konstantin: thoughts on 1bf653ad139bf7bb being suitable for 1.6.1?

Anyways, here's what I have so far:

fdbd73069af6eed9 eml: fix undefined vars on <Perl 5.28
e16e09b239b4d8bf t/config: test --get-urlmatch for git <2.26
933fce93167eba86 inboxidle: avoid needless syscalls on refresh
a0b470cbaf01c699 inboxidle: clue users into resolving ENOSPC from inotify
b782533a0413578d inbox: name variable for values loop iterator
4f1a683dc895a7bd public-inbox-v[12]-format.pod: make lexgrog happy
7a92c24157953dc6 manifest.js.gz: fix per-inbox /$INBOX/manifest.js.gz
78e81ae914ad24df Fix manpage section of perl module documentation
a4a1a74a2f60ec58 t/psgi_v2: ignore warnings on missing P::M::ReverseProxy
1cbb6243533fc2d4 daemon: support --daemonize without Net::Server::Daemonize
734daa9b165e248c doc: v2-format: drop repeated word
b63c27f36a44d8de over: ensure old, merged {tid} is really gone
c39ed01a3a4c6c46 wwwattach: prevent deep-linking via Referer match
0366c73f20b436d4 t/eml.t: workaround newer Email::MIME* behavior
bf14a3670da72358 nntp: attempt RFC 5536 3.1.5-conformant Path: headers
2fcf2b14a9ce3336 nntp: delimit Newsgroup: header with commas
31f9b61a318f4daf tls: epollbit: account for miscellaneous OpenSSL errors
5efbbd5e3e45ff3a scripts/dupe-finder: restore $dbh variable
59cc88bb5bc5ce3e searchidx: index lower-case List-Id value
4ccff6f9122da89c ds: add missing label for systems w/o EPOLLEXCLUSIVE
f9c3b3746445219b imap: avoid raising exception if client disconnects
e578a012532cd91f idxstack: fix comment about file_char
d94b6dd634381748 mda: match List-Id insensitively
c6ca576baf1700a8 mid: drop repeated ';' in mid_escape() regular expression
8e9d4f877730dbdf doc: post-1.6 updates, start 1.7
d6d442866106248e config: warn on multiple values for some fields
64f7ab3a571b9db0 doc: txt2pre: more manpage URLs
915e01b9cd771a84 doc: flow: include -imapd
dec02da946b6bb29 t/indexlevels-mirror: fix improperly skipped test

Thoughts?

Fwiw, I consider these two the most important and was
considering a 1.6.1 release even before the recent fixes:

  7a92c24157953dc6 manifest.js.gz: fix per-inbox /$INBOX/manifest.js.gz
  b63c27f36a44d8de over: ensure old, merged {tid} is really gone

^ permalink raw reply	[relevance 8%]

* [PATCH] over: ensure old, merged {tid} is really gone
  2020-12-04  3:35 10%     ` Kyle Meyer
@ 2020-12-04 12:09 14%       ` Eric Wong
  0 siblings, 0 replies; 4+ results
From: Eric Wong @ 2020-12-04 12:09 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > Yes, the fix is quite small (I think the below test case can be
> > made smaller).
> >
> > --rethread seems to be a separate bug, will fix when more awake.
> 
> Great, thank you for the explanation and fix.

No problem, thanks for the excellent bug report.  Below patch
fixes the rethread case, too, which has the same conceptual
problem in it was leaving a stale {tid} in the DB.

Also made the test data smaller and a shuffle test to ensure
it's truly order-independent.

> By the way, I've been enjoying playing with the extindex feature.  I
> haven't been doing anything fancy and it's just a few inboxes, but it's
> been _really_ nice to be able to group the inboxes for a project.
> Anyway, off topic for this thread, but thanks for that too!

Great to know.  Working on some other stuff in that area to
allow selecting inbox subsets and (hopefully) reduce SSD wear.

-------------------8<---------------

Subject: [PATCH] over: ensure old, merged {tid} is really gone

We must use the result of link_refs() since it can trigger
merge_threads() and invalidate $old_tid.  In case
merge_threads() isn't triggered, link_refs() will return
$old_tid anyways.

When rethreading and allocating new {tid}, we also must update
the row where the now-expired {tid} came from to ensure only the
new {tid} is seen when reindexing subsequent messages in
history.  Otherwise, every subsequently reindexed+rethreaded
message could end up getting a new {tid}.

Reported-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87360nlc44.fsf@kyleam.com/
---
 MANIFEST                   |  1 +
 lib/PublicInbox/OverIdx.pm | 12 ++++++--
 t/thread-index-gap.t       | 57 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+), 3 deletions(-)
 create mode 100644 t/thread-index-gap.t

diff --git a/MANIFEST b/MANIFEST
index 544ec5f9..946e4b8a 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -366,6 +366,7 @@ t/solver_git.t
 t/spamcheck_spamc.t
 t/spawn.t
 t/thread-cycle.t
+t/thread-index-gap.t
 t/time.t
 t/uri_imap.t
 t/utf8.eml
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 07cca4e5..88daa64f 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -170,8 +170,14 @@ sub _resolve_mid_to_tid {
 		$$tid = $cur_tid;
 	} else { # rethreading, queue up dead ghosts
 		$$tid = next_tid($self);
-		my $num = $smsg->{num};
-		push(@{$self->{-ghosts_to_delete}}, $num) if $num < 0;
+		my $n = $smsg->{num};
+		if ($n > 0) {
+			$self->{dbh}->prepare_cached(<<'')->execute($$tid, $n);
+UPDATE over SET tid = ? WHERE num = ?
+
+		} elsif ($n < 0) {
+			push(@{$self->{-ghosts_to_delete}}, $n);
+		}
 	}
 	1;
 }
@@ -298,7 +304,7 @@ sub _add_over {
 		}
 	} elsif ($n < 0) { # ghost
 		$$old_tid //= $cur_valid ? $cur_tid : next_tid($self);
-		link_refs($self, $refs, $$old_tid);
+		$$old_tid = link_refs($self, $refs, $$old_tid);
 		delete_by_num($self, $n);
 		$$v++;
 	}
diff --git a/t/thread-index-gap.t b/t/thread-index-gap.t
new file mode 100644
index 00000000..49f254e9
--- /dev/null
+++ b/t/thread-index-gap.t
@@ -0,0 +1,57 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Test::More;
+use PublicInbox::TestCommon;
+use PublicInbox::Eml;
+use PublicInbox::InboxWritable;
+use PublicInbox::Config;
+use List::Util qw(shuffle);
+require_mods(qw(DBD::SQLite));
+require_git(2.6);
+
+chomp(my @msgs = split(/\n\n/, <<'EOF')); # "git log" order
+Subject: [bug#45000] [PATCH 1/9]
+References: <20201202045335.31096-1-j@example.com>
+Message-Id: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 0/9]
+Message-Id: <20201202045335.31096-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 0/9]
+References: <20201202045335.31096-1-j@example.com>
+Message-ID: <86sg8o1mou.fsf@example.com>
+
+Subject: [bug#45000] [PATCH 8/9]
+Message-Id: <20201202045540.31248-8-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+EOF
+
+my ($home, $for_destroy) = tmpdir();
+local $ENV{HOME} = $home;
+for my $msgs (['orig', reverse @msgs], ['shuffle', shuffle(@msgs)]) {
+	my $desc = shift @$msgs;
+	my $n = "index-cap-$desc";
+	run_script([qw(-init -L basic -V2), $n, "$home/$n",
+		"http://example.com/$n", "$n\@example.com"]) or
+		BAIL_OUT 'init';
+	my $ibx = PublicInbox::Config->new->lookup_name($n);
+	my $im = PublicInbox::InboxWritable->new($ibx)->importer(0);
+	for my $m (@$msgs) {
+		$im->add(PublicInbox::Eml->new("$m\nFrom: x\@example.com\n\n"));
+	}
+	$im->done;
+	my $over = $ibx->over;
+	my @tid = $over->dbh->selectall_array('SELECT DISTINCT(tid) FROM over');
+	is(scalar(@tid), 1, "only one thread initially ($desc)");
+	$over->dbh_close;
+	run_script([qw(-index --reindex --rethread), $ibx->{inboxdir}]) or
+		BAIL_OUT 'rethread';
+	@tid = $over->dbh->selectall_array('SELECT DISTINCT(tid) FROM over');
+	is(scalar(@tid), 1, "only one thread after rethread ($desc)");
+}
+
+done_testing;

^ permalink raw reply related	[relevance 14%]

* Re: [WIP] over: ensure old, merged {tid} is really gone.
  2020-12-04  2:12 12%   ` [WIP] over: ensure old, merged {tid} is really gone Eric Wong
@ 2020-12-04  3:35 10%     ` Kyle Meyer
  2020-12-04 12:09 14%       ` [PATCH] " Eric Wong
  0 siblings, 1 reply; 4+ results
From: Kyle Meyer @ 2020-12-04  3:35 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta


Eric Wong writes:

> Yes, the fix is quite small (I think the below test case can be
> made smaller).
>
> --rethread seems to be a separate bug, will fix when more awake.

Great, thank you for the explanation and fix.

By the way, I've been enjoying playing with the extindex feature.  I
haven't been doing anything fancy and it's just a few inboxes, but it's
been _really_ nice to be able to group the inboxes for a project.
Anyway, off topic for this thread, but thanks for that too!

^ permalink raw reply	[relevance 10%]

* [WIP] over: ensure old, merged {tid} is really gone.
  @ 2020-12-04  2:12 12%   ` Eric Wong
  2020-12-04  3:35 10%     ` Kyle Meyer
  0 siblings, 1 reply; 4+ results
From: Eric Wong @ 2020-12-04  2:12 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> It seems like the presence of two ghosts in the histories fails
> to get merged together; so it's order-dependency bug in
> OverIdx.pm rethread seems to be exacerbating the condition (less
> sure about rethread, though).

Yes, the fix is quite small (I think the below test case can be
made smaller).

--rethread seems to be a separate bug, will fix when more awake.

-----------------8<--------------
Subject: [PATCH] over: ensure old, merged {tid} is really gone.

We must use the result of link_refs() since it can trigger
merge_threads() and invalidate $old_tid.  In case
merge_threads() isn't triggered,  link_refs() will return
$old_tid anyways.
---
 MANIFEST                   |  1 +
 lib/PublicInbox/OverIdx.pm |  2 +-
 t/thread-index-gap.t       | 94 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 96 insertions(+), 1 deletion(-)
 create mode 100644 t/thread-index-gap.t

diff --git a/MANIFEST b/MANIFEST
index 544ec5f9..946e4b8a 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -366,6 +366,7 @@ t/solver_git.t
 t/spamcheck_spamc.t
 t/spawn.t
 t/thread-cycle.t
+t/thread-index-gap.t
 t/time.t
 t/uri_imap.t
 t/utf8.eml
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 07cca4e5..ac53518c 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -298,7 +298,7 @@ sub _add_over {
 		}
 	} elsif ($n < 0) { # ghost
 		$$old_tid //= $cur_valid ? $cur_tid : next_tid($self);
-		link_refs($self, $refs, $$old_tid);
+		$$old_tid = link_refs($self, $refs, $$old_tid);
 		delete_by_num($self, $n);
 		$$v++;
 	}
diff --git a/t/thread-index-gap.t b/t/thread-index-gap.t
new file mode 100644
index 00000000..1772ce22
--- /dev/null
+++ b/t/thread-index-gap.t
@@ -0,0 +1,94 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Test::More;
+use PublicInbox::TestCommon;
+use PublicInbox::Eml;
+use PublicInbox::InboxWritable;
+use PublicInbox::Config;
+require_mods(qw(DBD::SQLite));
+require_git(2.6);
+my ($home, $for_destroy) = tmpdir();
+local $ENV{HOME} = $home;
+ok(run_script([qw(-init -V2 index-gap), "$home/index-gap",
+	qw(http://example.com/v2test index-gap@example.com)]), 'init');
+my $pi_cfg = PublicInbox::Config->new;
+my $ibx = $pi_cfg->lookup_name('index-gap');
+PublicInbox::InboxWritable->new($ibx);
+
+chomp(my @msgs = reverse(split(/\n\n/, <<'EOF')));
+Subject: [bug#45000] [PATCH 2/9]
+Message-Id: <20201202045540.31248-2-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 9/9]
+Message-Id: <20201202045540.31248-9-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 7/9]
+Message-Id: <20201202045540.31248-7-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 1/9]
+References: <20201202045335.31096-1-j@example.com>
+In-Reply-To: <20201202045335.31096-1-j@example.com>
+Message-Id: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 0/9]
+Message-Id: <20201202045335.31096-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 0/9]
+In-Reply-To: <20201202045335.31096-1-j@example.com>
+References: <20201202045335.31096-1-j@example.com>
+Message-ID: <86sg8o1mou.fsf@example.com>
+
+Subject: [bug#45000] [PATCH 8/9]
+Message-Id: <20201202045540.31248-8-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 6/9]
+Message-Id: <20201202045540.31248-6-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 5/9]
+Message-Id: <20201202045540.31248-5-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 4/9]
+Message-Id: <20201202045540.31248-4-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+
+Subject: [bug#45000] [PATCH 3/9]
+Message-Id: <20201202045540.31248-3-j@example.com>
+In-Reply-To: <20201202045540.31248-1-j@example.com>
+References: <20201202045540.31248-1-j@example.com>
+EOF
+
+my $im = $ibx->importer(0);
+for my $msg (@msgs) {
+	$im->add(PublicInbox::Eml->new("$msg\nFrom: x\@example.com\n\n"));
+}
+$im->done;
+
+my @tid = $ibx->over->dbh->selectall_array('SELECT DISTINCT(tid) FROM over');
+is(scalar(@tid), 1, 'only one thread');
+
+if (0 && 'FIXME') {
+	$ibx->over->dbh_close;
+	ok(run_script([qw(-index --reindex --rethread -v), $ibx->{inboxdir}]),
+		'rethread');
+	@tid = $ibx->over->dbh->selectall_array(
+				'SELECT DISTINCT(tid) FROM over');
+	is(scalar(@tid), 1, 'only one thread after rethread');
+}
+
+done_testing;

^ permalink raw reply related	[relevance 12%]

Results 1-4 of 4 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2020-12-03  4:59     missing messages in thread overview Kyle Meyer
2020-12-03 20:19     ` Eric Wong
2020-12-04  2:12 12%   ` [WIP] over: ensure old, merged {tid} is really gone Eric Wong
2020-12-04  3:35 10%     ` Kyle Meyer
2020-12-04 12:09 14%       ` [PATCH] " Eric Wong
2020-12-26 11:27     [Debian 9][Perl 5.24] uninitialized value errors in Encode/MIME/Header.pm Ali Alnubani
2020-12-26 12:25     ` [PATCH] eml: fix undefined vars on <Perl 5.28 Eric Wong
2020-12-26 14:10       ` Ali Alnubani
2020-12-26 20:35  8%     ` stable 1.6.1 release? [was: [PATCH] eml: fix undefined vars on <Perl 5.28] Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).