Date | Commit message |
|
Instead of only passing an Inbox object, we'll pass the $ctx
reference as PublicInbox::SearchView::mset_thread did.
So although mset_thread was wrong, we now make its usage
of SearchThread::thread correct and update other callers to
favor the new style of passing the entire $ctx (with ->{-inbox})
instead of just the Inbox object.
This makes the thread skeleton at the bottom of the search
page show subjects of messages, but unfortunately it links to
non-existent #anchors. The next commit will fix that.
While we're at it, favor "\&foo" over "*foo" since the former
makes the code reference (aka "function pointer") obvious so it
won't be confused for other things named "foo" in that
scope (e.g. $foo/@foo/%foo).
|
|
{mapping} overhead is now down to ~1.3M at the end of
a giant thread from hell.
|
|
Improve the display by finding any parent when we see out-of-order
References. This prevents us from having two roots in the test
case like Mail::Thread does.
|
|
In retrospect, the loop prevention done by our indexer is not
always sufficient since it can encounter improperly sorted
or incomplete References headers.
This bug was triggered by multiple bracketed Message-IDs in an
In-Reply-To: header (not References) where the Message-IDs were
in non-chronological order when somebody tried to reply to
different leaves of a thread with a single message.
So we must check for descendants before blindly trying to
use the last one.
Fixes: c6a8fdf71e2c336f ("thread: last Reference always wins")
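
The check described above can be sketched roughly as follows. This is an
illustrative Python sketch, not the project's Perl code: the names
`parent_of`, `is_descendant` and `pick_parent` are hypothetical, and the
"try candidates from last to first" policy is an assumption based on the
"last Reference always wins" commit being fixed.

```python
def is_descendant(parent_of, node, ancestor):
    """Return True if `node` is `ancestor` itself or sits below it.

    `parent_of` maps a Message-ID to its parent's Message-ID (hypothetical
    layout). The `seen` set guards against pre-existing loops.
    """
    seen = set()
    while node is not None and node not in seen:
        if node == ancestor:
            return True
        seen.add(node)
        node = parent_of.get(node)
    return False


def pick_parent(parent_of, mid, candidates):
    """Pick a parent for `mid`, preferring the last candidate.

    A candidate is skipped when it is `mid` itself or already a descendant
    of `mid`, since attaching to it would create a loop.
    """
    for cand in reversed(candidates):
        if not is_descendant(parent_of, cand, mid):
            return cand
    return None
```

With `parent_of = {"b": "a", "c": "b"}`, a message "a" listing candidates
`["x", "c"]` would get "x" as its parent, because "c" is already below "a".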
|
|
Too many similar functions doing the same basic thing was
redundant and misleading, especially since Message-ID is
no longer treated as a truly unique identifier.
For displaying threads in the HTML, this makes it clear
that we favor the primary Message-ID mapped to an NNTP
article number if a message cannot be found.
|
|
Since we attempt to fill in threads by Subject, our thread
skeletons can cross actual thread IDs, leading to the
possibility of false ghosts showing up in the skeleton.
Try to fill in the ghosts as well as possible by performing
a message lookup.
|
|
It seems possible for git-send-email(1) to generate repeated
instances of References and In-Reply-To headers,
as evidenced in:
https://public-inbox.org/git/20161111124541.8216-17-vascomalmeida@sapo.pt/raw
This causes a mismatch between how our search indexer threads
and how our HTML view handles threading. In the future, View.pm
will use the smsg-parsed {references} field and avoid redoing
Email::MIME header parsing.
We will still need to figure out a way to deal with messages
with repeated Message-IDs, at some point, too.
|
|
This simplifies callers to prevent errors and avoids
needless object-orientation in favor of a single procedure
call to handle threading and ordering.
|
|
It definitely is necessary to prevent looping with the
%seen hash.
|
|
Since we use SearchMsg from Xapian data, we can be
assured we do not get a self-referential {references}
field.
However, we may need to be more careful when checking
has_descendent for loops, as blindly calling add_child
could open us up to that possibility...
|
|
Otherwise, a malicious or broken client could populate the
thread skeleton with invalid References. We only care about
ghosts which messages correctly refer to, not totally bogus ones
which may be the result of long line or token truncation +
wrapping in MUA headers.
|
|
Mail::Thread is unavailable on many distros, meaning ordinary
users will have to rely on CPAN, a Perl-specific packaging tool.
|
|
This reverts commit 3c9dd6619f825f0515e7e4afa1bd55c99c1a68d3
("thread: fix sorting without topmost")
and reinstates the "topmost" routine for sorting purposes.
|
|
The ordering change in add_child is critical if $self == $parent
as the {children} hash was lost before this change.
has_descendent can be simplified by walking upwards from the child
instead of downwards from the parent.
This fixes threading regressions introduced in
commit 30100c46326e2eac275e0af13116636701d2537e
("thread: use hash + array instead of hand-rolled linked list")
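
A minimal Python sketch of the two ideas above, assuming a node layout with
a `parent` link and a `children` mapping keyed by Message-ID. The `Node`
class and its fields are hypothetical stand-ins for the Perl container, and
the detach-before-attach order is one reading of the "ordering change", not
a copy of the actual patch.

```python
class Node:
    def __init__(self, mid):
        self.mid = mid
        self.parent = None
        self.children = {}  # Message-ID -> Node

    def has_descendent(self, other):
        """Walk upwards from `other`; True if we reach `self`."""
        seen = set()
        node = other
        while node is not None and node.mid not in seen:
            if node is self:
                return True
            seen.add(node.mid)
            node = node.parent
        return False

    def add_child(self, child):
        # Refuse links that would make a node its own ancestor.
        if child.has_descendent(self):
            raise ValueError("adding %s under %s would loop"
                             % (child.mid, self.mid))
        # Detach from the old parent first. If the old parent is `self`,
        # doing this after the insert below would drop the child again.
        if child.parent is not None:
            child.parent.children.pop(child.mid, None)
        self.children[child.mid] = child
        child.parent = self
```

Walking upwards visits at most one node per level, whereas a downward
search from the parent may have to visit every node in the subtree.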
|
|
This should reduce differences from the original Mail::Thread
code and hopefully make things easier-to-follow.
|
|
We have to walk through all the messages after threading
anyways to build the rootset, so we can just delete all
the parent references at that point.
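
As a rough illustration in Python (not the project's Perl), the rootset pass
can drop each parent link in the same loop that collects the roots. The
dict-based node layout and the `build_rootset` name are assumptions made for
this sketch.

```python
def build_rootset(nodes):
    """Collect parentless nodes and delete every parent link in one pass.

    `nodes` maps Message-ID -> node dict with "parent" and "children" keys
    (hypothetical layout). After this call only the downward "children"
    links remain.
    """
    roots = []
    for node in nodes.values():
        # pop() both reads and deletes the parent link.
        if node.pop("parent", None) is None:
            roots.append(node)
    return roots
```

Each node only inspects its own parent link, so deleting links while
iterating does not affect which nodes end up in the rootset.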
|
|
This starts to show noticeable performance improvements when
attempting to thread over 400 messages, but the improvement
may not be measurable with fewer.
However, the resulting code is much shorter and (IMHO)
much easier to understand.
|
|
We no longer recurse, and it's too hard to come up with
a new name for a sub we will only use once.
|
|
We never use the depth anywhere in this sub.
|
|
It is pointless to increment when setting a true value is
simpler as there is no need to read before writing.
|
|
Unnecessary subs and complexity. This was hiding the fact
that $before is never used.
|
|
Single use subroutines actually make the code more complex in
this case, and there's never a {seen} field in $self.
|
|
It doesn't buy us much and copying to a new array is slower;
but probably not measurable in real-world use.
|
|
This roughly doubles performance due to the reduction in
object creation and abstraction layers.
|
|
This improves top-level index generation performance by 3-4%.
|
|
Copying large arrays is expensive, so avoid it.
This reduces /$INBOX/ time by around 1%.
|
|
Introduce our own SearchThread class for threading messages.
This should allow us to specialize and optimize away objects
in future commits.
|