path: root/lib/PublicInbox/SearchView.pm
Date Commit message
2020-02-06 treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-01-27 searchview: keep $noop sub private to the package
It'll always be used as a callback, so there's no point in giving it a name to be called non-anonymously. Making assignments to it is slightly faster since there's no need to repeatedly do a lookup by name.
2020-01-27 view: improve readability around walk_thread
Pass \&coderefs explicitly to walk_thread, and add some prototypes + comments to describe what goes on.
2020-01-27 www: use "skel" terminology consistently
This saves us a few comments and confusion. Yes, it's a destination so "dst" can be appropriate, but we may be using that term elsewhere.
2020-01-06 treewide: "require" + "use" cleanup and docs
There's a bunch of leftover "require" and "use" statements we no longer need and can get rid of, along with some excessive imports via "use". IO::Handle usage isn't always obvious, so add comments describing why a package loads it. Along the same lines, document the tmpdir support as the reason we depend on File::Temp 0.19, even though every Perl 5.10.1+ user has it. While we're at it, favor "use" over "require", since it gives us extra compile-time checking.
2019-12-28 search: retry_reopen passes user arg to callback
This allows callers to pass named (not anonymous) subs. Update all retry_reopen callers to use this feature, and fix some places where we failed to use retry_reopen :x
2019-12-27 searchview: remove anonymous sub when sorting threads by relevance
We don't need to return a closure or have a separate hash for sorting threads by relevance. Instead, we can stuff the relevance {pct} into the SearchMsg object itself and use that. Note: upon reviewing this code, the sort-by-relevance seems bogus as it only considers the relevance of the topmost message. Instead, it would make more sense to the user to sort by the highest relevance of all messages in that particular thread.
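The alternative suggested in the note above, ranking a thread by its best-matching message rather than by its topmost one, could look roughly like this. It is a Python sketch, not the Perl in SearchView.pm; the `pct` key mirrors the {pct} field mentioned above, while the function name and the list-of-dicts thread layout are assumptions made for illustration.

```python
def sort_threads_by_best_pct(threads):
    """Order threads by the highest relevance of any message they contain.

    Each thread is assumed to be a list of message dicts, topmost first,
    each carrying a "pct" relevance value (hypothetical layout).
    """
    return sorted(
        threads,
        key=lambda msgs: max(m["pct"] for m in msgs),
        reverse=True,
    )


threads = [
    # Topmost message is a weak match, but a reply matches strongly.
    [{"mid": "a", "pct": 40}, {"mid": "b", "pct": 95}],
    [{"mid": "c", "pct": 80}],
]
ranked = sort_threads_by_best_pct(threads)
```

Sorting by the topmost message alone would put the thread starting with "c" first; sorting by the best message in each thread puts the thread containing "b" first.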
2019-12-27 searchview: pass named subs to Www*Stream
Both WwwStream and WwwAtomStream ->response pass the WWW $ctx to the callback nowadays, so we can pass named subs to them.
2019-12-21 searchview: save a column in &x=t thread skeleton
Displaying "100%" wastes a precious column. Show "99%" instead since there's little practical difference and <xapian/mset.h> states: "Note that these generally aren't percentages of anything meaningful (unless you use a custom weighting formula where they are!)" And we're not using a custom weighting formula.
2019-12-20 searchthread: fix usage of user-supplied parameter
Instead of only passing an Inbox object, we'll pass the $ctx reference as PublicInbox::SearchView::mset_thread did. So although mset_thread was wrong, we now make its usage of SearchThread::thread correct and update other callers to favor the new style of passing the entire $ctx (with ->{-inbox}) instead of just the Inbox object. This makes the thread skeleton at the bottom of the search page show subjects of messages, but unfortunately links to non-existent #anchors. The next commit will fix that. While we're at it, favor "\&foo" over "*foo" since the former makes the code reference (aka "function pointer") obvious so it won't be confused for other things named "foo" in that scope (e.g. $foo/@foo/%foo).
2019-09-09 run update-copyrights from gnulib for 2019
2019-06-25 searchview: avoid displaying full paths on errors
Displaying full path names of installed modules could expose unnecessary information about user home directory names or other potentially sensitive information. However, displaying a module name could still be useful for diagnosing problems, so map full paths to the part of the path name which is relevant to the package name.
Reported-by: Ali Alnubani <alialnu@mellanox.com>
https://public-inbox.org/meta/20190611193815.c4uovtlp574bid6x@dcvr/
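One way to do this kind of path mapping is to strip known library directories from the error text, leaving only the package-relative part. The Python sketch below only illustrates the idea; the actual Perl change may work differently, and the function name, the `inc_dirs` parameter, and the sample path are all hypothetical.

```python
def hide_install_prefix(message, inc_dirs):
    """Replace absolute module paths in `message` with their package-relative part.

    `inc_dirs` is an assumed list of library directories (similar in spirit
    to Perl's @INC); anything under them is shortened to the relative path.
    """
    # Try the longest prefixes first so nested library directories win.
    for prefix in sorted(inc_dirs, key=len, reverse=True):
        message = message.replace(prefix.rstrip("/") + "/", "")
    return message


err = "error in /home/alice/perl5/lib/perl5/Search/Xapian.pm line 12"
cleaned = hide_install_prefix(
    err, ["/home/alice/perl5/lib/perl5", "/usr/share/perl5"]
)
# cleaned == "error in Search/Xapian.pm line 12"
```

The home directory name no longer appears, but the module name is still there for diagnosing the problem.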
2019-06-15 searchview: add link at bottom to reverse results
I could not find a place to put the link at the top without making navigation too cluttered. Putting it at the bottom of the page seems reasonable...
2019-06-15 searchview: support negative offsets to reverse ordering
Taking a hint from Perl array access, we'll allow negative offsets for the 'o' parameter to reverse the sort order.
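A rough Python sketch of how a negative offset can select from the reversed ordering follows. The mapping used here (-1 is the first item of the reversed listing, -2 the second, and so on) is an assumption for illustration and is not taken from the commit; `page` is a hypothetical helper.

```python
def page(results, offset, limit):
    """Return one page of results; a negative offset pages through the
    reversed ordering (assumed semantics: -1 -> first reversed item)."""
    if offset < 0:
        results = list(reversed(results))
        offset = -offset - 1  # -1 -> 0, -2 -> 1, ...
    return results[offset:offset + limit]


forward = page([1, 2, 3, 4, 5], 0, 2)    # [1, 2]
backward = page([1, 2, 3, 4, 5], -1, 2)  # [5, 4]
```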
2019-06-04 searchview: do not allow non-ASCII offsets and limits
Non-ASCII digits would be interpreted as zero when used as integers.
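The general fix for this class of problem is to accept only ASCII digits explicitly instead of relying on a Unicode-aware "digit" match. A Python sketch of that check is below; `parse_uint` is a hypothetical helper. Note that the failure mode differs between languages: Python's `int()` accepts non-ASCII digits rather than treating them as zero, but the ASCII-only validation is the same idea.

```python
import re

# [0-9] matches ASCII digits only; \d would also match other Unicode digits.
ASCII_UINT = re.compile(r"[0-9]+")


def parse_uint(value, default):
    """Return `value` as an int if it consists solely of ASCII digits,
    otherwise fall back to `default`."""
    if value is not None and ASCII_UINT.fullmatch(value):
        return int(value)
    return default


limit = parse_uint("25", 200)                 # 25
fallback = parse_uint("\u0662\u0665", 200)    # Arabic-Indic digits -> 200
```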
2019-05-15 www: use Inbox->over where appropriate
We don't need to rely on Xapian search functionality for the majority of the WWW code, even. subject_normalized is moved to SearchMsg, where it (probably) makes more sense, anyways.
2019-04-18 view: show "(no subject)" consistently in HTML
Empty subjects ("") and undefined Subject: headers are both displayed as "(no subject)" for now.
2019-04-16 cleanup: use '$ibx' consistently when referring to Inbox refs
'$inbox' is more human-readable, so that is for the more human-readable name in most cases. Making our variable naming more consistent should make the code easier-to-review and harder to screw up.
2019-01-08 searchview: drop unused {seen} hashref
Unused since commit 5f09452bb7e6cf49fb6eb7e6cf166a7c3cdc5433 ("view: cull redundant phrases in subjects")
2018-04-23 searchview: do not blindly append "l" parameter to URL
It's ugly and all of our other parameters are omitted when values are not the default.
2018-04-18 Merge remote-tracking branch 'origin/master' into v2
* origin/master:
  nntp: allow and ignore empty commands
  mbox: do not barf on queries which return no results
  nntp: fix NEWNEWS command
  searchview: fix non-numeric comparison
  Allow specification of the number of search results to return
  githttpbackend: avoid infinite loop on generic PSGI servers
  http: fix modification of read-only value
  extmsg: use news.gmane.org for Message-ID lookups
  extmsg: rework partial MID matching to favor current inbox
  Update the installation instructions with Fedora package names
  nntp: do not drain rbuf if there is a command pending
  nntp: improve fairness during XOVER and similar commands
  searchidx: do not modify Xapian DB while iterating
  Don't use LIMIT in UPDATE statements
2018-04-05 searchview: minor cleanup
$mset->size is probably more obvious than relying on a tied array and saves us a line.
2018-04-03 mbox: do not barf on queries which return no results
Having zero search results means we never get a chance to populate the Content-Disposition header for mbox downloads.
2018-04-01 searchview: fix non-numeric comparison
We don't want non-fully-numeric limits being compared and tripping warnings. While we're at it, avoid hard-coding '200' and reuse $LIM as the default.
2018-04-01 Allow specification of the number of search results to return
Add an "l=" parameter to the search query syntax to specify how many results should be returned.
2018-03-29 search: get rid of most lookup_* subroutines
Too many similar functions doing the same basic thing was redundant and misleading, especially since Message-ID is no longer treated as a truly unique identifier. For displaying threads in the HTML, this makes it clear that we favor the primary Message-ID mapped to an NNTP article number if a message cannot be found.
2018-03-27 view: depend on SearchMsg for Message-ID
Since we need to handle messages with multiple and duplicate Message-ID headers, our thread skeleton display must account for that. Since we have a "preferred" Message-ID in case of conflicts, use it as the UUID in an Atom feed so readers do not get confused by conflicts.
2018-03-27 searchview: remove unnecessary imports from MID module
We do not need many of these, anymore.
2018-03-22 use both Date: and Received: times
We want to rely on Date: to sort messages within individual threads since it keeps messages from git-send-email(1) sorted. However, since developers occasionally have the clock set wrong on their machines, sort overall messages by the newest date in a Received: header so the landing page isn't forever polluted by messages from the future. This also gives us determinism for commit times in most cases, as we'll use the Received: timestamp there, as well.
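The two-key ordering described above can be sketched as follows, in Python rather than the project's Perl. The `date` and `received` keys and the function name are hypothetical stand-ins: `date` is the sender-supplied Date: time and `received` is the newest Received: time, both as plain integers.

```python
def order_for_landing_page(threads):
    """Sort messages inside each thread by Date:, and the threads
    themselves by their newest Received: time (newest first)."""
    by_date = [sorted(msgs, key=lambda m: m["date"]) for msgs in threads]
    return sorted(
        by_date,
        key=lambda msgs: max(m["received"] for m in msgs),
        reverse=True,
    )


threads = [
    [{"mid": "a2", "date": 20, "received": 21},
     {"mid": "a1", "date": 10, "received": 11}],
    # Sender's clock is far in the future, but the server received it at 15.
    [{"mid": "b1", "date": 9999, "received": 15}],
]
ordered = order_for_landing_page(threads)
```

Here the message with the bogus future Date: does not pin its thread to the top, because the overall order only looks at Received: times.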
2018-02-07 update copyrights for 2018
Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-12-08 search: force large mbox result downloads to POST
This should prevent crawlers (including most robots.txt ignoring ones) from burning our CPU time without severely compromising usability for humans.
2017-12-07 searchview: nofollow on mbox downloads
Some search results are gigantic, and search engines are unlikely to be able to handle gzipped mboxes anyways.
2017-12-01 search: allow downloading search results as mbox
Allowing downloading of all search results as a gzipped mboxrd file can be convenient for some users.
2017-11-29 searchview: s/threaded/nested/
We want to be consistent with the view change in commit b223e6f49debb99b9132bc85d97a065ebcee00b9
2017-10-03 search: try to fill in ghosts when generating thread skeleton
Since we attempt to fill in threads by Subject, our thread skeletons can cross actual thread IDs, leading to the possibility of false ghosts showing up in the skeleton. Try to fill in the ghosts as well as possible by performing a message lookup.
2017-06-23 allow admins to configure non-obfuscated addresses/domains
We will also treat all known list addresses as non-obfuscated. By setting publicinbox.noObfuscate in ~/.public-inbox/config, this will allow users to disable address obfuscation on a per-domain or per-address basis.
2017-06-16 view: implement optional address obfuscation
This is lightly-tested and seems to work. I'm still hesitant to support this, but the alternative of receiving death threats for displaying unobfuscated addresses seems to be not worth it.
2017-05-23 searchview: retry queries if uri_unescape-able
It is possible to have double-escaped queries when copy and pasting into browsers, so try to help users work around this common error by automatically retrying after unescaping once. Of course, we must inform the user when doing this results in success, in case they really meant to search for a double-escaped term which resulted in nothing.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
https://public-inbox.org/meta/CACBZZX5Gnow08r=0A1J_kt3a=zpGyMfvsqu8nAN7kacNnDm+dg@mail.gmail.com/
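The retry-after-unescaping behaviour can be sketched like this in Python, using the standard `urllib.parse.unquote` where the Perl code would use uri_unescape. The function name, the `search` callable, and the returned tuple shape are assumptions for illustration only.

```python
from urllib.parse import unquote


def search_with_unescape_retry(query, search):
    """Run `search(query)`; if it finds nothing and the query still looks
    percent-escaped, retry once with the unescaped form.

    Returns (results, retried_query). `retried_query` is None unless the
    retry is what produced results, so the caller can tell the user.
    """
    results = search(query)
    if results:
        return results, None
    unescaped = unquote(query)
    if unescaped != query:
        results = search(unescaped)
        if results:
            return results, unescaped
    return [], None


# Toy index standing in for the real search backend.
index = {'s:"hello world"': ["msg1"]}
results, retried = search_with_unescape_retry(
    "s:%22hello%20world%22", lambda q: index.get(q, [])
)
```

Returning the rewritten query alongside the results is what lets the page say which query was actually used.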
2017-05-23 www: do not mangle characters from search queries
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
https://public-inbox.org/meta/CACBZZX5Gnow08r=0A1J_kt3a=zpGyMfvsqu8nAN7kacNnDm+dg@mail.gmail.com/
2017-03-24 searchview: show full (&x=t) messages in ascending chronlogical order
When displaying search results with full messages, it makes more sense to show them in ascending chronological order when going by date. Reverse chronological order makes more sense for search results which only show the subject.
2017-03-24 searchview: add "t" id to link to thread overview
At least for the thread view (&x=t); this will make it easy to link to the overview.
2017-02-06 searchview: increase limit for displaying search results
We are in no danger of excessive buffering or OOM-ing; the main page for every inbox already loads 200 results, and thread page views even load 1000! Increase this to 200 for now.
2017-02-06 searchview: clarify numeric summary at bottom
Xapian can only give estimated results when a result limit is given to it, so make clear it is an estimate to avoid showing nonsensical ranges when no results are returned.
2017-01-10 introduce PublicInbox::MIME wrapper class
This should fix problems with multipart messages where text/plain parts lack a header. cf. git clone --mirror https://github.com/rjbs/Email-MIME.git refs/pull/28/head In the future, we may still introduce a streaming interface to reduce memory usage on large emails.
2016-12-21 searchthread: simplify API and remove needless OO
This simplifies callers to prevent errors and avoids needless object-orientation in favor of a single procedure call to handle threading and ordering.
2016-12-10 search: retry document loading from Xapian
In addition to needing to retry enquire queries, we also need to protect document loading from the Xapian DB and retry on modification, as it seems to throw the same errors. Checking the $@ ref for Search::Xapian::DatabaseModifiedError is actually in the test suite for both the XS and SWIG Xapian bindings, so we should be good as far as forward/backwards compatibility.
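The retry-on-modification pattern described above can be sketched in Python as below. This is not the Perl retry_reopen from the codebase: the exception class is a stand-in for Search::Xapian::DatabaseModifiedError, and the `reopen` callable, the argument order, and the `max_tries` cap are assumptions.

```python
class DatabaseModifiedError(Exception):
    """Stand-in for the error raised when the DB changes under a reader."""


def retry_reopen(reopen, callback, arg, max_tries=10):
    """Call `callback(arg)`, reopening the database and retrying whenever
    it reports that the database was modified mid-read."""
    for _ in range(max_tries):
        try:
            return callback(arg)
        except DatabaseModifiedError:
            reopen()  # pick up the latest revision, then try again
    raise RuntimeError("database kept changing after %d tries" % max_tries)
```

Wrapping document loading in the same helper as the query itself means both paths recover from a concurrent writer in the same way.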
2016-12-03 atom: switch to getline/close for response bodies
This will let us stream larger Atom document bodies without wasting too much memory and reduce the amount of round-trip requests needed to get necessary information. Hopefully clients are using streaming (SAX) parsers, too. This is the final transition in the core public-inbox code to allow migrating to a "pull"-based body streaming scheme which allows an HTTP server to respond appropriately to backpressure from slow clients.
2016-12-03 searchview: fix <title> tag in Atom feed
This only affects the Atom feed for search results. "xmlstarlet val" failed to detect or warn about this, and I only noticed this bug while working on another patch.
2016-10-14 thread: reinstates stable ordering when ghosts are present
This reverts commit 3c9dd6619f825f0515e7e4afa1bd55c99c1a68d3 ("thread: fix sorting without topmost") and reinstates the "topmost" routine for sorting purposes.
2016-10-05 thread: fix sorting without topmost
This bug was hidden, and we may not be able to efficiently implement a topmost subroutine with the hash-based (vs linked-list-based) container for threading in the next commit.