user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
* [PATCH 0/2] threading fixes
@ 2018-04-25  8:52 Eric Wong
  2018-04-25  8:52 ` [PATCH 1/2] thread: prevent hidden threads in /$INBOX/ landing page Eric Wong
  2018-04-25  8:52 ` [PATCH 2/2] thread: sort incoming messages by Date Eric Wong
  0 siblings, 2 replies; 3+ messages in thread
From: Eric Wong @ 2018-04-25  8:52 UTC (permalink / raw)
  To: meta

Apparently people use multiple Message-IDs in In-Reply-To
headers to reply to multiple messages at once.  This should
might render like a git merge commit, someday.

For now, our handling of References (assuming ordering) isn't
equipped to handle that.  Furthermore, having multiple
Message-IDs in an In-Reply-To header causes messages later in
the thread to screw up References ordering, too.  So, it's a
mess all around, lets at least hope folks can set the Date
correctly and use that to order things somewhat reasonably...

Eric Wong (2):
  thread: prevent hidden threads in /$INBOX/ landing page
  thread: sort incoming messages by Date

 lib/PublicInbox/SearchThread.pm | 15 +++++--
 t/thread-cycle.t                | 88 ++++++++++++++++++++++++++++++-----------
 2 files changed, 77 insertions(+), 26 deletions(-)

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/2] thread: prevent hidden threads in /$INBOX/ landing page
  2018-04-25  8:52 [PATCH 0/2] threading fixes Eric Wong
@ 2018-04-25  8:52 ` Eric Wong
  2018-04-25  8:52 ` [PATCH 2/2] thread: sort incoming messages by Date Eric Wong
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Wong @ 2018-04-25  8:52 UTC (permalink / raw)
  To: meta

In retrospect, the loop prevention done by our indexer is not
always sufficient since it can have an improperly sorted
or incomplete References headers.

This bug was triggered multiple bracketed Message-IDs in an
In-Reply-To: header (not References) where the Message-IDs were
in non-chronological order when somebody tried to reply to
different leafs of a thread with a single message.

So we must check for descendents before blindly trying to
use the last one.

Fixes: c6a8fdf71e2c336f ("thread: last Reference always wins")
---
 lib/PublicInbox/SearchThread.pm |  4 +-
 t/thread-cycle.t                | 88 ++++++++++++++++++++++++++++++-----------
 2 files changed, 68 insertions(+), 24 deletions(-)

diff --git a/lib/PublicInbox/SearchThread.pm b/lib/PublicInbox/SearchThread.pm
index 1d250b4..450a06f 100644
--- a/lib/PublicInbox/SearchThread.pm
+++ b/lib/PublicInbox/SearchThread.pm
@@ -76,7 +76,9 @@ sub _add_message ($$) {
 
 	# C. Set the parent of this message to be the last element in
 	# References.
-	$prev->add_child($this) if defined $prev;
+	if (defined $prev && !$this->has_descendent($prev)) { # would loop
+		$prev->add_child($this);
+	}
 }
 
 package PublicInbox::SearchThread::Msg;
diff --git a/t/thread-cycle.t b/t/thread-cycle.t
index 7d85909..9e915a1 100644
--- a/t/thread-cycle.t
+++ b/t/thread-cycle.t
@@ -11,18 +11,25 @@ my $mt = eval {
 	$Mail::Thread::nosubject = 1;
 	$Mail::Thread::noprune = 1;
 };
-my @check;
-my @msgs = map {
-	my $msg = $_;
-	$msg->{references} =~ s/\s+/ /sg if $msg->{references};
-	my $simple = Email::Simple->create(header => [
-		'Message-Id' => "<$msg->{mid}>",
-		'References' => $msg->{references},
-	]);
-	push @check, $simple;
-	bless $msg, 'PublicInbox::SearchMsg'
-} (
 
+sub make_objs {
+	my @simples;
+	my $n = 0;
+	my @msgs = map {
+		my $msg = $_;
+		$msg->{ds} ||= ++$n;
+		$msg->{references} =~ s/\s+/ /sg if $msg->{references};
+		my $simple = Email::Simple->create(header => [
+			'Message-ID' => "<$msg->{mid}>",
+			'References' => $msg->{references},
+		]);
+		push @simples, $simple;
+		bless $msg, 'PublicInbox::SearchMsg'
+	} @_;
+	(\@simples, \@msgs);
+}
+
+my ($simples, $smsgs) = make_objs(
 # data from t/testbox-6 in Mail::Thread 2.55:
 	{ mid => '20021124145312.GA1759@nlin.net' },
 	{ mid => 'slrnau448m.7l4.markj+0111@cloaked.freeserve.co.uk',
@@ -50,23 +57,42 @@ my @msgs = map {
 	}
 );
 
-my $st = thread_to_s(\@msgs);
+my $st = thread_to_s($smsgs);
 
 SKIP: {
 	skip 'Mail::Thread missing', 1 unless $mt;
-	$mt = Mail::Thread->new(@check);
-	$mt->thread;
-	$mt->order(sub { sort { $a->messageid cmp $b->messageid } @_ });
-	my $check = '';
+	check_mt($st, $simples, 'Mail::Thread output matches');
+}
 
-	my @q = map { (0, $_) } $mt->rootset;
-	while (@q) {
-		my $level = shift @q;
-		my $node = shift @q or next;
-		$check .= (" "x$level) . $node->messageid . "\n";
-		unshift @q, $level + 1, $node->child, $level, $node->next;
+my @backwards = (
+	{ mid => 1, references => '<2> <3> <4>' },
+	{ mid => 4, references => '<2> <3>' },
+	{ mid => 5, references => '<6> <7> <8> <3> <2>' },
+	{ mid => 9, references => '<6> <3>' },
+	{ mid => 10, references => '<8> <7> <6>' },
+	{ mid => 2, references => '<6> <7> <8> <3>' },
+	{ mid => 3, references => '<6> <7> <8>' },
+	{ mid => 6, references => '<8> <7>' },
+	{ mid => 7, references => '<8>' },
+	{ mid => 8, references => '' }
+);
+
+($simples, $smsgs) = make_objs(@backwards);
+my $backward = thread_to_s($smsgs);
+SKIP: {
+	skip 'Mail::Thread missing', 1 unless $mt;
+	check_mt($backward, $simples, 'matches Mail::Thread backwards');
+}
+($simples, $smsgs) = make_objs(reverse @backwards);
+my $forward = thread_to_s($smsgs);
+if ('Mail::Thread sorts by Date') {
+	SKIP: {
+		skip 'Mail::Thread missing', 1 unless $mt;
+		check_mt($forward, $simples, 'matches Mail::Thread forwards');
 	}
-	is($check, $st, 'Mail::Thread output matches');
+}
+unless ('sorting by Date') {
+	is("\n".$backward, "\n".$forward, 'forward and backward matches');
 }
 
 done_testing();
@@ -86,3 +112,19 @@ sub thread_to_s {
 	}
 	$st;
 }
+
+sub check_mt {
+	my ($st, $simples, $msg) = @_;
+	my $mt = Mail::Thread->new(@$simples);
+	$mt->thread;
+	$mt->order(sub { sort { $a->messageid cmp $b->messageid } @_ });
+	my $check = '';
+	my @q = map { (0, $_) } $mt->rootset;
+	while (@q) {
+		my $level = shift @q;
+		my $node = shift @q or next;
+		$check .= (" "x$level) . $node->messageid . "\n";
+		unshift @q, $level + 1, $node->child, $level, $node->next;
+	}
+	is("\n".$check, "\n".$st, $msg);
+}
-- 
EW


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] thread: sort incoming messages by Date
  2018-04-25  8:52 [PATCH 0/2] threading fixes Eric Wong
  2018-04-25  8:52 ` [PATCH 1/2] thread: prevent hidden threads in /$INBOX/ landing page Eric Wong
@ 2018-04-25  8:52 ` Eric Wong
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Wong @ 2018-04-25  8:52 UTC (permalink / raw)
  To: meta

Improve the display by finding any parent when we see out-of-order
References.  This prevents us from having two roots in the test
case like Mail::Thread does.
---
 lib/PublicInbox/SearchThread.pm | 11 +++++++++--
 t/thread-cycle.t                |  4 ++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/SearchThread.pm b/lib/PublicInbox/SearchThread.pm
index 450a06f..be29098 100644
--- a/lib/PublicInbox/SearchThread.pm
+++ b/lib/PublicInbox/SearchThread.pm
@@ -22,9 +22,16 @@ use strict;
 use warnings;
 
 sub thread {
-	my ($messages, $ordersub, $ibx) = @_;
+	my ($msgs, $ordersub, $ibx) = @_;
 	my $id_table = {};
-	_add_message($id_table, $_) foreach @$messages;
+
+	# Sadly, we sort here anyways since the fill-in-the-blanks References:
+	# can be shakier if somebody used In-Reply-To with multiple, disparate
+	# messages.  So, take the client Date: into account since we can't
+	# alway determine ordering when somebody uses multiple In-Reply-To.
+	# We'll trust the client Date: header here instead of the Received:
+	# time since this is for display (and not retrieval)
+	_add_message($id_table, $_) for sort { $a->{ds} <=> $b->{ds} } @$msgs;
 	my $rootset = [ grep {
 			!delete($_->{parent}) && $_->visible($ibx)
 		} values %$id_table ];
diff --git a/t/thread-cycle.t b/t/thread-cycle.t
index 9e915a1..78dcb6f 100644
--- a/t/thread-cycle.t
+++ b/t/thread-cycle.t
@@ -85,13 +85,13 @@ SKIP: {
 }
 ($simples, $smsgs) = make_objs(reverse @backwards);
 my $forward = thread_to_s($smsgs);
-if ('Mail::Thread sorts by Date') {
+unless ('Mail::Thread sorts by Date') {
 	SKIP: {
 		skip 'Mail::Thread missing', 1 unless $mt;
 		check_mt($forward, $simples, 'matches Mail::Thread forwards');
 	}
 }
-unless ('sorting by Date') {
+if ('sorting by Date') {
 	is("\n".$backward, "\n".$forward, 'forward and backward matches');
 }
 
-- 
EW


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-04-25  8:52 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-25  8:52 [PATCH 0/2] threading fixes Eric Wong
2018-04-25  8:52 ` [PATCH 1/2] thread: prevent hidden threads in /$INBOX/ landing page Eric Wong
2018-04-25  8:52 ` [PATCH 2/2] thread: sort incoming messages by Date Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).