user/dev discussion of public-inbox itself
* [PATCH] search: implement subject summarization
@ 2015-08-25  2:03 Eric Wong
  2015-08-25  3:56 ` Eric Wong
  0 siblings, 1 reply; 2+ messages in thread
From: Eric Wong @ 2015-08-25  2:03 UTC
  To: meta

Summarize long subjects (over 68 characters) at word boundaries
to avoid exploding line lengths in the web interface.
---
 lib/PublicInbox/Search.pm    | 25 +++++++++++++++++++++++++
 lib/PublicInbox/SearchMsg.pm |  3 +--
 t/search.t                   | 17 +++++++++++++++++
 3 files changed, 43 insertions(+), 2 deletions(-)
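
For readers skimming the thread, here is a minimal standalone sketch of
word-boundary truncation. It is not the code in this patch:
summarize_words, $tail and $room are hypothetical names, and the sketch
assumes the ' ...' marker should count against the 68-character limit,
which the patch itself does not guarantee.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch: keep whole words while the
# result (including the ' ...' marker) fits in $max characters.
sub summarize_words {
	my ($subj, $max) = @_;
	$max = 68 unless defined $max; # same limit the patch uses
	return $subj if length($subj) <= $max;

	my $tail = ' ...';
	my $room = $max - length($tail); # space left for real words
	my @words = split(' ', $subj); # awk-style split drops leading blanks
	my $joined = join(' ', @words);
	return $joined if length($joined) <= $max; # only whitespace was long

	my $out = '';
	while (@words) {
		my $new = length($out) ? "$out $words[0]" : $words[0];
		last if length($new) > $room;
		$out = $new;
		shift @words;
	}
	# not even the first word fit: hard-cut it instead
	return substr($words[0], 0, $room) . $tail unless length($out);
	$out . (@words ? $tail : '');
}

print length(summarize_words('FOO ' x 30)), "\n"; # prints 67
print length(summarize_words('FOO' x 30)), "\n";  # prints 68
```

The two calls mirror the inputs used in t/search.t: one subject made of
many short words and one made of a single overlong word.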

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index bcc5312..5ef380e 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -237,6 +237,31 @@ sub subject_normalized {
 	$subj;
 }
 
+# for doc data
+sub subject_summary {
+	my $subj = pop;
+	my $max = 68;
+	if (length($subj) > $max) {
+		my @subj = split(/\s+/, $subj);
+		$subj = '';
+		my $l;
+
+		while (defined($l = shift @subj)) { # "0" is a valid word
+			my $new = $subj . $l . ' ';
+			last if length($new) >= $max;
+			$subj = $new;
+		}
+		if (length $subj) {
+			my $r = defined($l) ? ' ...' : ''; # $l set: word dropped
+			$subj =~ s/ \z/$r/s;
+		} else {
+			@subj = ($l =~ /\A(.{1,72})/);
+			$subj = $subj[0] . ' ...';
+		}
+	}
+	$subj;
+}
+
 sub enquire {
 	my ($self) = @_;
 	$self->{enquire} ||= Search::Xapian::Enquire->new($self->{xdb});
diff --git a/lib/PublicInbox/SearchMsg.pm b/lib/PublicInbox/SearchMsg.pm
index a8f99bd..a9f3180 100644
--- a/lib/PublicInbox/SearchMsg.pm
+++ b/lib/PublicInbox/SearchMsg.pm
@@ -94,9 +94,8 @@ sub date {
 
 sub to_doc_data {
 	my ($self) = @_;
-
 	$self->mid . "\n" .
-	$self->subject . "\n" .
+	PublicInbox::Search::subject_summary($self->subject) . "\n" .
 	$self->from_name . "\n".
 	$self->date . "\n" .
 	$self->references_sorted;
diff --git a/t/search.t b/t/search.t
index 17e9eaf..65539f1 100644
--- a/t/search.t
+++ b/t/search.t
@@ -16,6 +16,23 @@ is(0, system(qw(git init -q --bare), $git_dir), "git init (main)");
 eval { PublicInbox::Search->new($git_dir) };
 ok($@, "exception raised on non-existent DB");
 
+{
+	my $orig = "FOO " x 30;
+	my $summ = PublicInbox::Search::subject_summary($orig);
+
+	$summ = length($summ);
+	$orig = length($orig);
+	ok($summ < $orig && $summ > 0, "summary shortened ($orig => $summ)");
+
+	$orig = "FOO" x 30;
+	$summ = PublicInbox::Search::subject_summary($orig);
+
+	$summ = length($summ);
+	$orig = length($orig);
+	ok($summ < $orig && $summ > 0,
+	   "summary shortened but not empty: $summ");
+}
+
 my $rw = PublicInbox::SearchIdx->new($git_dir, 1);
 my $ro = PublicInbox::Search->new($git_dir);
 my $rw_commit = sub {
-- 
EW


* Re: [PATCH] search: implement subject summarization
  2015-08-25  2:03 [PATCH] search: implement subject summarization Eric Wong
@ 2015-08-25  3:56 ` Eric Wong
  0 siblings, 0 replies; 2+ messages in thread
From: Eric Wong @ 2015-08-25  3:56 UTC
  To: meta

Note: no schema version change yet, since this doesn't really affect
anything other than visual display.


Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
