path: root/lib/PublicInbox/SearchView.pm
Date Commit message
2017-02-06 searchview: increase limit for displaying search results
We are in no danger of excessive buffering or OOM-ing: the main page for every inbox already loads 200 results, and thread page views even load 1000! Increase this to 200 for now.
2017-02-06 searchview: clarify numeric summary at bottom
Xapian can only give estimated results when a result limit is given to it, so make clear it is an estimate to avoid showing nonsensical ranges when no results are returned.
2017-01-10 introduce PublicInbox::MIME wrapper class
This should fix problems with multipart messages where text/plain parts lack a header. cf. git clone --mirror https://github.com/rjbs/Email-MIME.git refs/pull/28/head In the future, we may still introduce a streaming interface to reduce memory usage on large emails.
2016-12-21 searchthread: simplify API and remove needless OO
This simplifies callers to prevent errors and avoids needless object-orientation in favor of a single procedure call to handle threading and ordering.
2016-12-10 search: retry document loading from Xapian
In addition to needing to retry enquire queries, we also need to protect document loading from the Xapian DB and retry on modification, as it seems to throw the same errors. Checking the $@ ref for Search::Xapian::DatabaseModifiedError is actually in the test suite for both the XS and SWIG Xapian bindings, so we should be good as far as forward/backwards compatibility.
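The retry-on-modification idea described above can be sketched as follows. This is a Python analogy, not the project's Perl code: `DatabaseModifiedError`, `retry_reopen`, `FakeDB`, and the `max_tries` limit are all hypothetical names chosen for illustration, and the sketch assumes that reopening the database before retrying is the recovery step.

```python
class DatabaseModifiedError(Exception):
    """Hypothetical stand-in for Search::Xapian::DatabaseModifiedError."""


def retry_reopen(db, action, max_tries=10):
    """Run action(db); on DatabaseModifiedError, reopen the DB and retry."""
    for attempt in range(max_tries):
        try:
            return action(db)
        except DatabaseModifiedError:
            if attempt == max_tries - 1:
                raise  # give up after the last allowed attempt
            db.reopen()


class FakeDB:
    """Toy database that reports modification on the first two reads."""

    def __init__(self):
        self.stale_reads = 2
        self.reopened = 0

    def reopen(self):
        self.reopened += 1

    def get_document(self, docid):
        if self.stale_reads > 0:
            self.stale_reads -= 1
            raise DatabaseModifiedError()
        return f"doc-{docid}"
```

The same wrapper can guard both query execution and document loading, since the commit message says both raise the same error.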
2016-12-03 atom: switch to getline/close for response bodies
This will let us stream larger Atom document bodies without wasting too much memory and reduce the amount of round-trip requests needed to get necessary information. Hopefully clients are using streaming (SAX) parsers, too. This is the final transition in the core public-inbox code to allow migrating to a "pull"-based body streaming scheme which allows an HTTP server to respond appropriately to backpressure from slow clients.
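A "pull"-based body can be pictured as an object the server asks for one chunk at a time, so nothing is generated until the client is ready for it. The sketch below is a Python analogy of the getline/close shape only; `FeedBody`, `drain`, and the placeholder markup are hypothetical and do not reflect the project's actual Atom output.

```python
class FeedBody:
    """Response body that the server pulls from, one chunk per getline()."""

    def __init__(self, entries):
        self._entries = iter(entries)
        self._state = "head"
        self.closed = False

    def getline(self):
        """Return the next chunk, or None once the body is exhausted."""
        if self._state == "head":
            self._state = "entries"
            return "<feed>"
        if self._state == "entries":
            entry = next(self._entries, None)
            if entry is not None:
                return f"<entry>{entry}</entry>"
            self._state = "done"
            return "</feed>"
        return None

    def close(self):
        self.closed = True


def drain(body, write):
    """Server-side loop: pull a chunk only when ready to write it."""
    try:
        while (chunk := body.getline()) is not None:
            write(chunk)
    finally:
        body.close()
```

Because the server drives the loop, it can simply stop calling `getline()` while a slow client's socket is not writable.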
2016-12-03 searchview: fix <title> tag in Atom feed
This only affects the Atom feed for search results. "xmlstarlet val" failed to detect or warn about this, and I only noticed this bug while working on another patch.
2016-10-14 thread: reinstates stable ordering when ghosts are present
This reverts commit 3c9dd6619f825f0515e7e4afa1bd55c99c1a68d3 ("thread: fix sorting without topmost") and reinstates the "topmost" routine for sorting purposes.
2016-10-05 thread: fix sorting without topmost
This bug was hidden, and we may not be able to efficiently implement a topmost subroutine with the hash-based (vs linked-list-based) container for threading in the next commit.
2016-10-05 thread: remove Email::Abstract wrapping
This roughly doubles performance due to the reduction in object creation and abstraction layers.
2016-10-05 thread: pass array refs instead of entire arrays
Copying large arrays is expensive, so avoid it. This reduces /$INBOX/ time by around 1%.
2016-10-05 thread: remove Mail::Thread dependency
Introduce our own SearchThread class for threading messages. This should allow us to specialize and optimize away objects in future commits.
2016-08-18 searchview: link to internal help text
The internal help text links to the Xapian query parser documentation anyways, but also provides information on which prefixes exist.
2016-08-14 www: do not unecessarily escape some chars in paths
Based on reading RFC 3986, it seems '@', ':', '!', '$', '&', "'", '(', ')', '*', '+', ',', ';', '=' are all allowed in path-absolute where we have the Message-ID. In any case, it seems '@' is fairly common in path components nowadays and too common in Message-IDs.
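As an illustration of the RFC 3986 reading above, the following Python sketch percent-encodes a Message-ID for a single path segment while leaving ':' and '@' and the sub-delims untouched. `PATH_SAFE` and `mid_escape` are hypothetical names, and the commit title says only "some chars", so the exact set the project leaves unescaped is an assumption here.

```python
from urllib.parse import quote

# ':' and '@' plus the RFC 3986 sub-delims, all permitted in a path segment.
PATH_SAFE = ":@!$&'()*+,;="


def mid_escape(mid):
    """Percent-encode a Message-ID for use as one URL path segment."""
    # '/' is deliberately absent from PATH_SAFE so it cannot split the segment.
    return quote(mid, safe=PATH_SAFE)
```

A typical Message-ID such as `20160814.GA123@example.com` passes through unchanged, while a slash, a space, or a literal '%' is still encoded.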
2016-07-06 www: use HTML <hr> instead of XHTML <hr />
We only need XHTML-compatibility inside Atom feeds, as anecdotally, feed readers are stricter than normal browsers and some do not support HTML, only XHTML. So we will continue to accommodate them. However, we favor HTML elsewhere since it tends to be smaller than the equivalent well-formed XHTML.
2016-07-02 www: use PSGI env directly
More work on the Plack::Request/CGI.pm removal front. No need to access the PSGI env through an extra hash lookup.
2016-07-01 searchview: add missing newline in search results
Hrm... is there a more obvious way to do an internal API for this while still being streamable?
2016-06-30 searchview: show result count in thread index, for now
I'm not sure what to show here, actually; but it's better than triggering an uninitialized variable warning.
2016-06-30 www_stream: add response wrapper sub
This encapsulates an entire PSGI response array, hopefully making it easier to generate responses and avoid typos when setting the Content-Type.
2016-06-30 view: merge $state hash with existing $ctx
This reduces the level of indirection to reach certain objects within the hash and there are no namespace or lifetime conflicts anyways.
2016-06-30 view: show thread context in the thread-aware flat view
This lets the user have a small window of the context of the current message relative to other threads.
2016-06-30 www: use WwwStream for dumping thread and search views
This allows the HTTP server to react to backpressure from slow clients when writing. As a side effect, this also makes it easier for us to maintain a consistent header/footer across our HTML.
2016-06-30 www: implement hybrid flat+thread conversation view
This should be more accessible to readers on narrow terminals (or giant fonts) while providing a chronological view which is also aware of message threading relationships.
2016-06-21 view: common thread walking interface
Since we have a common pattern for walking threads, extract it into a function and reduce the amount of code we have. This will make it easier to switch to an event-driven interface for getline, too.
2016-06-21 searchview: remove recursion from thread view
As before, recursion can cause problems sooner than unshifting objects into the head of a queue.
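One way to walk a thread tree without recursion is to keep pending nodes in a queue and push each node's children onto its head, which preserves depth-first order. The Python sketch below illustrates only that technique; `walk_thread` and the dict-based node layout (`id`, `children`) are hypothetical and not the project's data structures.

```python
from collections import deque


def walk_thread(roots):
    """Yield (depth, id) pairs in depth-first order without recursion."""
    queue = deque((0, node) for node in roots)
    while queue:
        depth, node = queue.popleft()
        yield depth, node["id"]
        children = node.get("children", [])
        # extendleft() reverses its input, so reverse first to keep the
        # children in their original order at the head of the queue.
        queue.extendleft((depth + 1, child) for child in reversed(children))
```

Pending work lives in an ordinary heap-allocated container, so a very deep thread grows the queue rather than the call stack.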
2016-06-20 searchview: use inbox->msg_by_mid
This abstracts out the path lookup logic and potentially allows different heads in the same repository. We may also bypass slow tree name lookups in the future by storing the raw blob ID in the Xapian document. Followup-to: 4b313dc74bc9 ("feed: various object-orientation cleanups")
2016-06-20 searchview: fix Atom dump
Ugh, and I will still need to write better tests for this (and a billion other things :x) Fixes: 4b313dc74bc9 ("feed: various object-orientation cleanups")
2016-06-20 feed: remove dependence on fh->write for streaming
We'll be switching to a getline/close response body to give the HTTP server more control when dealing with slow clients.
2016-05-30 www: remove a few more Plack::Request dependencies
Still a work in progress, but SearchView no longer depends on Plack::Request at all and Feed is getting there. We now parse all query parameters up front, but we may do that lazily again in the future.
2016-05-25 remove Email::Address dependency
git has stricter requirements for ident names (no '<>') which Email::Address allows. Even in 1.908, Email::Address also has an incomplete fix for CVE-2015-7686 with a DoS-able regexp for comments. Since we don't care for or need all the RFC compliance of Email::Address, avoiding it entirely may be preferable. Email::Address will still be installed as a requirement for Email::MIME, but it is only used by the Email::MIME::header_str_set which we do not use.
2016-05-22 www: avoid warnings on bad offsets for Xapian
The offset argument must be an integer for Xapian; however, users (or bots) type the darndest things. AFAIK this has no security implications besides triggering a warning (which could lead to out-of-space errors).
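Validating the offset before it reaches the search backend can be sketched as below. This is a Python analogy under stated assumptions: `parse_offset` is a hypothetical helper, and falling back to 0 for anything that is not a plain run of ASCII digits is an assumed policy, not necessarily what the commit does.

```python
import re


def parse_offset(raw, default=0):
    """Return raw as a non-negative int, or default if it is not all digits."""
    if raw is not None and re.fullmatch(r"[0-9]+", raw):
        return int(raw)
    return default
```

Rejecting bad input up front means the backend never sees a non-integer, so no warning is logged for it.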
2016-05-02 view: disable subject threading
Broken threads should be exposed to hopefully encourage people to use proper mail clients which set In-Reply-To headers.
2016-04-25 searchview: add "rel=next" and "rel=prev" here, too
ref: https://www.w3.org/TR/html/links.html#sequential-link-types Followup-to: c4183f56aab6 ("www: add rel=next and rel=prev navigation hints")
2016-04-13 searchview: deal with the removal of rsort
Oops. While we're at it, simplify the calls to do threading slightly by reducing the places where we touch Mail::Thread globals. Fixes: 56164afc2034 (view: allow topics to be "bumped" by new replies)
2016-04-02 www: various style changes and comment updates
Reduce stack depth of arguments and rely more on state hashref to store response state. We may end up shoving everything in ctx eventually.
2016-03-12 reduce "PublicInbox::Hval->new_oneline" use
It's probably a bad idea to strip extraneous whitespace from some headers as an extra space may convey useful information. Newlines don't seem to be preserved by Email::MIME or Email::Simple anyways, so there's no danger in breaking formatting.
2016-03-03 use raw header for Message-ID
Message-IDs should not be MIME encoded, but in case they are, use the raw form for compatibility with ssoma and possibly other tools. This prevents a potential problem where a malicious client could confuse our storage layer into indexing incorrect contents.
2016-02-25 hval: implement common UI for protocol-relative URLs
This allows users to avoid HTTPS -> HTTP downgrade warnings, but we will also avoid encouraging them towards HTTPS, for now. IMHO: the CA system gives a false sense of security, TLS libraries (e.g. OpenSSL) can introduce new bugs and problems (even to attack clients), and TLS libraries also eat memory on cheap servers.
2016-01-09 hval: use more appropriate hvals for documentation
Not needed, but this is good documentation. Some of these values should never have newlines.
2016-01-04 view: label "relevance" in threaded search view
The threaded search view is somewhat alien to new users, so ensure we label the word "relevance" for them.
2015-12-31 view: fixup indentation nesting in search
Oops, the rarely-accessed threaded search view was completely broken. Additionally, the normal threading depths were broken when we attempted to go up-thread and replies got nested improperly. Followup to commit be984ce279776d4513b4ca1bff05ebecafdd1bad ("view: thread using <ul> instead of <table>")
2015-12-26 use "Atom feed" consistently in headers/footers
While having the extra " feed" is noisy in the main topic landing page, it is useful in headers/footers which have plenty of space to be more descriptive.
2015-12-26 searchview: fix unclosed tags in threaded search results
Oops, we've had this forever and we also lacked a space between the this was noticed while adding an extra line between the "Search results ordered by" header and actual messages.
2015-12-26 searchview: fix up Atom feed in search results
Oops :x We need better testing... Fixes: commit 4c2c2325d2948ec5340e2fcafbee798cf568f5fd ("rename 'GitCatFile' package to 'Git'")
2015-12-26 searchview: fixup stupid syntax error
Fixes: commit 398e29344ecc43548a7d3998bb5d2fcee62d66cd ("view: favor whitespace wrap in <head>") Oops.
2015-12-25 view: favor whitespace wrap in <head>
If we bite the bullet and rely on inline CSS, we might as well only specify it once per page instead of inline in every <pre> tag which may handle UGC. So this actually saves us a small amount of bandwidth on most pages which have multiple <pre> start tags.
2015-12-22 rename 'GitCatFile' package to 'Git'
We'll be using it for more than just cat-file. Adding a `popen' API for internal use allows us to save a bunch of code in other places.
2015-12-22 hval: move PRE constant for wrapping UGC here
User-generated content (UGC) may have excessively long lines which screw up rendering. This is the only bit of CSS we use.
2015-12-05 *view: avoid leading zero in time display of the hour
Avoid the visual noise entirely by using a space instead. I sometimes have difficulty distinguishing '0' from '8' while other users may mistake it for an 'O' character. Most digital clocks I've seen will omit displaying a leading zero for the hour, too. This may also save transfer time by allowing better compression (since there is a space between the date and time anyways) and perhaps reduce client rendering time on some displays. We'll leave the leading zero for minutes since that seems pretty standard for digital clocks.
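The formatting rule described above (space-padded hour, zero-padded minute) can be illustrated in a few lines. This is a Python sketch of the rule only; `fmt_time` is a hypothetical helper and says nothing about how the Perl code builds its timestamps.

```python
def fmt_time(hour, minute):
    """Pad the hour with a space and the minute with a zero."""
    return f"{hour:2d}:{minute:02d}"
```

The field stays two characters wide either way, so columns of timestamps remain aligned.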
2015-11-20 various internal documentation updates
Hopefully this gives new hackers a better overview of how the components relate to each other.