user/dev discussion of public-inbox itself
* [PATCH 1/5] view: Email::Address cache purge is optional
@ 2014-08-28  2:47 Eric Wong
  2014-08-28  2:47 ` [PATCH 2/5] redo main HTML index to show nested messages Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Eric Wong @ 2014-08-28  2:47 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong

We will reuse the html_footer function in a nested index.
---
 lib/PublicInbox/View.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 8bc28cd..ab607a0 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -30,7 +30,7 @@ sub msg_html {
 	headers_to_html_header($mime, $full_pfx) .
 		multipart_text_as_html($mime, $full_pfx) .
 		'</pre><hr />' . PRE_WRAP .
-		html_footer($mime) . $footer .
+		html_footer($mime, 1) . $footer .
 		'</pre></body></html>';
 }
 
@@ -204,7 +204,7 @@ sub headers_to_html_header {
 }
 
 sub html_footer {
-	my ($mime) = @_;
+	my ($mime, $purge) = @_;
 	my %cc; # everyone else
 	my $to; # this is the From address
 
@@ -219,7 +219,7 @@ sub html_footer {
 			$to ||= $dst;
 		}
 	}
-	Email::Address->purge_cache;
+	Email::Address->purge_cache if $purge;
 
 	my $subj = $mime->header('Subject') || '';
 	$subj = "Re: $subj" unless $subj =~ /\bRe:/;
-- 
EW


* [PATCH 2/5] redo main HTML index to show nested messages
  2014-08-28  2:47 [PATCH 1/5] view: Email::Address cache purge is optional Eric Wong
@ 2014-08-28  2:47 ` Eric Wong
  2014-08-28  2:47 ` [PATCH 3/5] view: increase MAX_INLINE_QUOTED threshold to 12 Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2014-08-28  2:47 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong

This reduces the need for page reloads in common cases and should
improve reading speed, so users do not need to open many browser
tabs.  This will hopefully increase and encourage readership.

The downsides are increased server processing overhead and
easier address scraping by spam bots.
---
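A minimal sketch of two ideas the diff below relies on: hashing a
Message-ID into an anchor name that replies on the same page can link
to, and picking a coarser timestamp format for older messages.  The
names anchor_sketch and ts_format_sketch and the example Message-ID
are hypothetical; only Digest::SHA and POSIX (both core modules) are
assumed.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use POSIX qw(strftime);

# Hypothetical stand-in for anchor_for() in the diff: strip optional
# angle brackets and surrounding whitespace, then hash the Message-ID
# so the result is safe to use as an HTML anchor name.
sub anchor_sketch {
	my ($msgid) = @_;
	$msgid =~ s/\A\s*<?//;
	$msgid =~ s/>?\s*\z//;
	sha1_hex($msgid);
}

# Hypothetical helper mirroring the timestamp logic in index_entry():
# the older the message, the coarser the format.
sub ts_format_sketch {
	my ($now, $ts) = @_;
	return '%Y/%m/%d' if $now > $ts + 365 * 24 * 60 * 60;
	return '%m/%d' if $now > $ts + 24 * 60 * 60;
	'%H:%M';
}

my %seen;
my $name = anchor_sketch('<20140828024700.GA1@example.com>');
$seen{$name} = "#$name"; # same-page link for later replies

# A reply whose parent is already on the page can use the "#..." anchor;
# otherwise the patch falls back to a link to the parent's own page.
my $parent = anchor_sketch(' 20140828024700.GA1@example.com ');
print exists $seen{$parent} ? "same-page\n" : "separate page\n";
print strftime(ts_format_sketch(86400 * 400, 0), gmtime(0)), "\n";
```

Hashing keeps arbitrary Message-ID characters out of the name
attribute, at the cost of anchors that are not human-readable.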
 lib/PublicInbox/Feed.pm | 37 +++++--------------
 lib/PublicInbox/View.pm | 97 ++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 105 insertions(+), 29 deletions(-)

diff --git a/lib/PublicInbox/Feed.pm b/lib/PublicInbox/Feed.pm
index 4ec8e97..cf64517 100644
--- a/lib/PublicInbox/Feed.pm
+++ b/lib/PublicInbox/Feed.pm
@@ -8,6 +8,7 @@ use Email::MIME;
 use Date::Parse qw(strptime str2time);
 use PublicInbox::Hval;
 use PublicInbox::GitCatFile;
+use PublicInbox::View;
 use constant {
 	DATEFMT => '%Y-%m-%dT%H:%M:%SZ', # atom standard
 	MAX_PER_PAGE => 25, # this needs to be tunable
@@ -18,7 +19,6 @@ use constant {
 sub generate {
 	my ($class, $args) = @_;
 	require XML::Atom::SimpleFeed;
-	require PublicInbox::View;
 	require POSIX;
 	my $max = $args->{max} || MAX_PER_PAGE;
 
@@ -61,7 +61,6 @@ sub generate_html_index {
 	my $git = PublicInbox::GitCatFile->new($args->{git_dir});
 	my $last = each_recent_blob($args, sub {
 		my $mime = do_cat_mail($git, $_[0]) or return 0;
-		$mime->body_set(''); # save some memory
 
 		my $t = eval { str2time($mime->header('Date')) };
 		defined($t) or $t = 0;
@@ -85,7 +84,8 @@ sub generate_html_index {
 			$a->topmost->message->header('X-PI-Date')
 		} @_;
 	});
-	dump_html_line($_, 0, \$html, time) for $th->rootset;
+	my %seen;
+	dump_msg($_, 0, \$html, time, \%seen) for $th->rootset;
 
 	Email::Address->purge_cache;
 
@@ -277,34 +277,15 @@ sub add_to_feed {
 	1;
 }
 
-sub dump_html_line {
-	my ($self, $level, $html, $now) = @_;
+sub dump_msg {
+	my ($self, $level, $html, $now, $seen) = @_;
 	if ($self->message) {
 		my $mime = $self->message;
-		my $subj = $mime->header('Subject');
-		my $ts = $mime->header('X-PI-Date');
-		my $mid = $mime->header_obj->header_raw('Message-ID');
-		$mid = PublicInbox::Hval->new_msgid($mid);
-		my $href = 'm/' . $mid->as_href . '.html';
-		my $from = mime_header($mime, 'From');
-
-		my @from = Email::Address->parse($from);
-		$from = $from[0]->name;
-		(defined($from) && length($from)) or $from = $from[0]->address;
-
-		$from = PublicInbox::Hval->new_oneline($from)->as_html;
-		$subj = PublicInbox::Hval->new_oneline($subj)->as_html;
-		if ($now > ($ts + (24 * 60 * 60))) {
-			$ts = POSIX::strftime('%m/%d ', gmtime($ts));
-		} else {
-			$ts = POSIX::strftime('%H:%M ', gmtime($ts));
-		}
-
-		$$html .= $ts . (' ' x $level);
-		$$html .= "<a href=\"$href\">$subj</a> $from\n";
+		$$html .=
+		    PublicInbox::View->index_entry($mime, $now, $level, $seen);
 	}
-	dump_html_line($self->child, $level+1, $html, $now) if $self->child;
-	dump_html_line($self->next, $level, $html, $now) if $self->next;
+	dump_msg($self->child, $level+1, $html, $now, $seen) if $self->child;
+	dump_msg($self->next, $level, $html, $now, $seen) if $self->next;
 }
 
 sub do_cat_mail {
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index ab607a0..0d97428 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -8,6 +8,7 @@ use URI::Escape qw/uri_escape_utf8/;
 use Encode qw/find_encoding/;
 use Encode::MIME::Header;
 use Email::MIME::ContentType qw/parse_content_type/;
+require POSIX;
 
 # TODO: make these constants tunable
 use constant MAX_INLINE_QUOTED => 5;
@@ -40,6 +41,92 @@ sub feed_entry {
 	PRE_WRAP . multipart_text_as_html($mime, $full_pfx) . '</pre>';
 }
 
+# this is already inside a <pre>
+sub index_entry {
+	my ($class, $mime, $now, $level, $seen) = @_;
+	my $rv = "";
+	my $part_nr = 0;
+	my $enc_msg = enc_for($mime->header("Content-Type"));
+	my $subj = $mime->header('Subject');
+	my $header_obj = $mime->header_obj;
+
+	my $mid_raw = $header_obj->header_raw('Message-ID');
+	my $name = anchor_for($mid_raw);
+	$seen->{$name} = "#$name"; # save the anchor for later
+
+	my $mid = PublicInbox::Hval->new_msgid($mid_raw);
+	my $from = PublicInbox::Hval->new_oneline($mime->header('From'))->raw;
+	my @from = Email::Address->parse($from);
+	$from = $from[0]->name;
+	(defined($from) && length($from)) or $from = $from[0]->address;
+
+	$from = PublicInbox::Hval->new_oneline($from)->as_html;
+	$subj = PublicInbox::Hval->new_oneline($subj)->as_html;
+	my $pfx = ('  ' x $level);
+
+	my $ts = $mime->header('X-PI-Date');
+	my $fmt = '%H:%M';
+	if ($now > ($ts + (365 * 24 * 60 * 60))) {
+		# doesn't have to be exactly 1 year
+		$fmt = '%Y/%m/%d';
+	} elsif ($now > ($ts + (24 * 60 * 60))) {
+		$fmt = '%m/%d';
+	}
+	$ts = POSIX::strftime($fmt, gmtime($ts));
+
+	$rv .= "$pfx<a name=\"$name\"><b>$subj</b> $from - $ts</a>\n\n";
+
+	# scan through all parts, looking for displayable text
+	$mime->walk_parts(sub {
+		my ($part) = @_;
+		return if $part->subparts; # walk_parts already recurses
+		my $enc = enc_for($part->content_type) || $enc_msg || $enc_utf8;
+
+		if ($part_nr > 0) {
+			my $fn = $part->filename;
+			defined($fn) or $fn = "part #" . ($part_nr + 1);
+			$rv .= $pfx . add_filename_line($enc->decode($fn));
+		}
+
+		my $s = ascii_html($enc->decode($part->body));
+
+		# drop quotes, including the "so-and-so wrote:" line
+		$s =~ s/(?:^[^\n]*:\s*\n)?(?:^&gt;[^\n]*\n)+(?:^\s*\n)?//mg;
+
+		# Drop signatures
+		$s =~ s/\n*-- \n.*\z//s;
+
+		# kill any trailing whitespace
+		$s =~ s/\s+\z//s;
+
+		# add prefix:
+		$s =~ s/^/$pfx/sgm;
+
+		$rv .= $s . "\n";
+		++$part_nr;
+	});
+
+	my $href = 'm/' . $mid->as_href . '.html';
+	$rv .= "$pfx<a\nhref=\"$href\">more</a> ";
+	my $txt = 'm/' . $mid->as_href . '.txt';
+	$rv .= "<a\nhref=\"$txt\">raw</a> ";
+	$rv .= html_footer($mime, 0);
+
+	my $irp = $header_obj->header_raw('In-Reply-To');
+	if (defined $irp) {
+		my $anchor_idx = anchor_for($irp);
+		my $anchor = $seen->{$anchor_idx};
+		unless (defined $anchor) {
+			my $v = PublicInbox::Hval->new_msgid($irp);
+			my $html = $v->as_html;
+			$anchor = 'm/' . $v->as_href . '.html';
+			$seen->{$anchor_idx} = $anchor;
+		}
+		$rv .= " <a\nhref=\"$anchor\">parent</a>";
+	}
+
+	$rv . "\n\n";
+}
 
 # only private functions below.
 
@@ -232,7 +319,7 @@ sub html_footer {
 	my $cc = uri_escape_utf8(join(',', values %cc));
 	my $href = "mailto:$to?In-Reply-To=$irp&Cc=${cc}&Subject=$subj";
 
-	'<a href="' . ascii_html($href) . '">reply</a>';
+	"<a\nhref=\"" . ascii_html($href) . '">reply</a>';
 }
 
 sub linkify_refs {
@@ -244,4 +331,12 @@ sub linkify_refs {
 	} @_);
 }
 
+require Digest::SHA;
+sub anchor_for {
+	my ($msgid) = @_;
+	$msgid =~ s/\A\s*<?//;
+	$msgid =~ s/>?\s*\z//;
+	Digest::SHA::sha1_hex($msgid);
+}
+
 1;
-- 
EW


* [PATCH 3/5] view: increase MAX_INLINE_QUOTED threshold to 12
  2014-08-28  2:47 [PATCH 1/5] view: Email::Address cache purge is optional Eric Wong
  2014-08-28  2:47 ` [PATCH 2/5] redo main HTML index to show nested messages Eric Wong
@ 2014-08-28  2:47 ` Eric Wong
  2014-08-28  2:47 ` [PATCH 4/5] feed: show permalink to home page Eric Wong
  2014-08-28  2:47 ` [PATCH 5/5] feed: deal with removed files Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2014-08-28  2:47 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong

12 lines is half an 80x24 terminal, so it is probably a reasonable
amount to quote.  Often 5 lines were not enough for context.  This
feature is mainly to reduce the scrolling necessary to view pages.
---
 lib/PublicInbox/View.pm | 2 +-
 t/feed.t                | 9 +++++++++
 t/view.t                | 9 ++++++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 0d97428..2794339 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -11,7 +11,7 @@ use Email::MIME::ContentType qw/parse_content_type/;
 require POSIX;
 
 # TODO: make these constants tunable
-use constant MAX_INLINE_QUOTED => 5;
+use constant MAX_INLINE_QUOTED => 12; # half an 80x24 terminal
 use constant MAX_TRUNC_LEN => 72;
 use constant PRE_WRAP => '<pre style="white-space:pre-wrap">';
 
diff --git a/t/feed.t b/t/feed.t
index 880716c..978e215 100644
--- a/t/feed.t
+++ b/t/feed.t
@@ -31,6 +31,15 @@ Date: Thu, 01 Jan 1970 00:00:00 +0000
 > I quote to much
 > I quote to much
 > I quote to much
+> I quote to much
+> I quote to much
+> I quote to much
+> I quote to much
+> I quote to much
+> I quote to much
+> I quote to much
+> I quote to much
+> I quote to much
 
 msg $i
 
diff --git a/t/view.t b/t/view.t
index bc6fbed..91ba168 100644
--- a/t/view.t
+++ b/t/view.t
@@ -18,7 +18,14 @@ OK
 > We generate links to a separate full page where quoted-text is inline.
 > This is
 >
-> Currently 5 lines
+> Currently 12 lines
+> See MAX_INLINE_QUOTED
+> See MAX_INLINE_QUOTED
+> See MAX_INLINE_QUOTED
+> See MAX_INLINE_QUOTED
+> See MAX_INLINE_QUOTED
+> See MAX_INLINE_QUOTED
+> See MAX_INLINE_QUOTED
 > See MAX_INLINE_QUOTED
 
 hello world
-- 
EW


* [PATCH 4/5] feed: show permalink to home page
  2014-08-28  2:47 [PATCH 1/5] view: Email::Address cache purge is optional Eric Wong
  2014-08-28  2:47 ` [PATCH 2/5] redo main HTML index to show nested messages Eric Wong
  2014-08-28  2:47 ` [PATCH 3/5] view: increase MAX_INLINE_QUOTED threshold to 12 Eric Wong
@ 2014-08-28  2:47 ` Eric Wong
  2014-08-28  2:47 ` [PATCH 5/5] feed: deal with removed files Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2014-08-28  2:47 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong

This will make it easier to bookmark an index page with threading
context.
---
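A rough sketch of the pagination idea, assuming a page is just an
ordered list of commit IDs: "next" continues after the last commit
and "permalink" pins the page to its first.  The sub names and the
commit IDs are hypothetical, and the sketch does not handle an empty
page.

```perl
use strict;
use warnings;

# Hypothetical stand-in for each_recent_blob(): return the first and
# last commit IDs of the page in list context.
sub page_bounds_sketch {
	my (@commits) = @_;
	($commits[0], $commits[-1]);
}

# Hypothetical footer builder: "next" resumes from the last commit,
# "permalink" pins the page to its first commit.
sub nav_links_sketch {
	my ($first, $last) = @_;
	my $next = $last ? qq!<a\nhref="?r=$last">next</a>! : '    ';
	my $permalink = qq!<a\nhref="?r=$first">permalink</a>!;
	"$next $permalink";
}

my ($first, $last) = page_bounds_sketch(qw(c0ffee1 deadbee 0ddba11));
print nav_links_sketch($first, $last), "\n";
```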
 lib/PublicInbox/Feed.pm | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/Feed.pm b/lib/PublicInbox/Feed.pm
index cf64517..1eaba6c 100644
--- a/lib/PublicInbox/Feed.pm
+++ b/lib/PublicInbox/Feed.pm
@@ -59,7 +59,7 @@ sub generate_html_index {
 
 	my @messages;
 	my $git = PublicInbox::GitCatFile->new($args->{git_dir});
-	my $last = each_recent_blob($args, sub {
+	my ($first, $last) = each_recent_blob($args, sub {
 		my $mime = do_cat_mail($git, $_[0]) or return 0;
 
 		my $t = eval { str2time($mime->header('Date')) };
@@ -89,7 +89,7 @@ sub generate_html_index {
 
 	Email::Address->purge_cache;
 
-	my $footer = nav_footer($args->{cgi}, $last, $feed_opts);
+	my $footer = nav_footer($args->{cgi}, $first, $last, $feed_opts);
 	my $list_footer = $args->{footer};
 	$footer .= "\n" . $list_footer if ($footer && $list_footer);
 	$footer = "<hr />" . PRE_WRAP . "$footer</pre>" if $footer;
@@ -99,21 +99,22 @@ sub generate_html_index {
 # private subs
 
 sub nav_footer {
-	my ($cgi, $last, $feed_opts) = @_;
+	my ($cgi, $first, $last, $feed_opts) = @_;
 	$cgi or return '';
 	my $old_r = $cgi->param('r');
 	my $head = '    ';
 	my $next = '    ';
 
 	if ($last) {
-		$next = qq!<a href="?r=$last">next</a>!;
+		$next = qq!<a\nhref="?r=$last">next</a>!;
 	}
 	if ($old_r) {
 		$head = $cgi->path_info;
-		$head = qq!<a href="$head">head</a>!;
+		$head = qq!<a\nhref="$head">head</a>!;
 	}
-	my $atom = "<a href=\"$feed_opts->{atomurl}\">atom</a>";
-	"$next $head $atom";
+	my $atom = "<a\nhref=\"$feed_opts->{atomurl}\">atom</a>";
+	my $permalink = "<a\nhref=\"?r=$first\">permalink</a>";
+	"$next $head $atom $permalink";
 }
 
 sub each_recent_blob {
@@ -174,7 +175,7 @@ sub each_recent_blob {
 
 	close $log; # we may EPIPE here
 	# for pagination
-	$commits[-1];
+	($commits[0], $commits[-1]);
 }
 
 # private functions below
-- 
EW


* [PATCH 5/5] feed: deal with removed files
  2014-08-28  2:47 [PATCH 1/5] view: Email::Address cache purge is optional Eric Wong
                   ` (2 preceding siblings ...)
  2014-08-28  2:47 ` [PATCH 4/5] feed: show permalink to home page Eric Wong
@ 2014-08-28  2:47 ` Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2014-08-28  2:47 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong

Sometimes we get spam and need to delete messages, so we must
prevent errors on missing messages from propagating.
---
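A small sketch of the error-trapping pattern used in the diff below.
load_sketch is a hypothetical stand-in for $git->cat_file plus
Email::MIME->new that dies when a path has been deleted; the wrapper
returns undef so the caller can skip the message.

```perl
use strict;
use warnings;

# Hypothetical blob store: one message present, one deleted as spam.
my %blobs = ('a/1' => "Subject: hi\n\nbody\n");

# Hypothetical loader that dies when the path no longer exists.
sub load_sketch {
	my ($path) = @_;
	exists $blobs{$path} or die "missing: $path\n";
	$blobs{$path};
}

# Same shape as the patched do_cat_mail(): trap the exception and
# return undef instead of letting it propagate.
sub safe_load_sketch {
	my ($path) = @_;
	my $msg = eval { load_sketch($path) };
	$@ ? undef : $msg;
}

for my $path (qw(a/1 spam/2)) {
	my $msg = safe_load_sketch($path) or next; # deleted message: skip
	print "loaded $path\n";
}
```

This pairs with the existing "or return 0" in the each_recent_blob
callback, which already treats a false return as "skip this blob".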
 lib/PublicInbox/Feed.pm | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Feed.pm b/lib/PublicInbox/Feed.pm
index 1eaba6c..646c85c 100644
--- a/lib/PublicInbox/Feed.pm
+++ b/lib/PublicInbox/Feed.pm
@@ -291,8 +291,11 @@ sub dump_msg {
 
 sub do_cat_mail {
 	my ($git, $path) = @_;
-	my $str = $git->cat_file("HEAD:$path");
-	Email::MIME->new($str);
+	my $mime = eval {
+		my $str = $git->cat_file("HEAD:$path");
+		Email::MIME->new($str);
+	};
+	$@ ? undef : $mime;
 }
 
 1;
-- 
EW


Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git