* [PATCH 00/37] viewvcs: diff highlighting and more
@ 2019-01-21 20:52 7% Eric Wong
2019-01-21 20:52 4% ` [PATCH 35/37] highlight: initial wrapper and PSGI service Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2019-01-21 20:52 UTC (permalink / raw)
To: meta
I'm still working on VCS integration, and I'm not yet comfortable
deploying this on the main public-inbox.org because of
performance/fairness concerns.
But perfect is the enemy of good, and I figure it's worth
publishing now. It's also on a Tor mirror:
http://hjrcffqmbrq6wope.onion/meta/
http://hjrcffqmbrq6wope.onion/git/
It looks great to me in Netsurf and dillo :>
People with machines powerful enough to run Firefox
(or Tor Browser Bundle) can use "View -> Page Style" to adjust
colors.
Performance considerations:
* diff highlighting alone adds 10-20% overhead to message rendering
Maybe I can speed it up with some less-readable Perl...
* blob reconstruction is horribly unfair to other clients at the
moment. Fixing this is a priority for me.
I haven't hooked up highlight to blob viewing yet, but that's
coming, too.
Thinking about it more, the blob lookups are so specific to git
that I'm not sure other VCSes can be supported...
The following changes since commit 55db8a2a51c13aec813ac56bbaac1505791fd262:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TODO: autolinkify that(!)
t/git.t: do not pass "-b" to git-repack(1) (2019-01-18 22:00:33 +0000)
are available in the Git repository at:
https://public-inbox.org/ viewvcs
for you to fetch changes up to c440c879d38e67f62bdbb74f616dc84d20899c33:
t/check-www-inbox: trap SIGINT for File::Temp destruction (2019-01-21 06:53:35 +0000)
----------------------------------------------------------------
Eric Wong (37):
view: disable bold in topic display
hval: force monospace for <form> elements, too
t/perf-msgview: add test to check msg_html performance
solver: initial Perl implementation
git: support multiple URL endpoints
git: add git_quote
git: check saves error on disambiguation
solver: various bugfixes and cleanups
view: wire up diff and vcs viewers with solver
git: disable abbreviations with cat-file hints
solver: operate directly on git index
view: enable naming hints for raw blob downloads
git: support 'ambiguous' result from --batch-check
solver: more verbose blob resolution
solver: break up patch application steps
solver: switch patch application to use a callback
solver: simplify control flow for initial loop
solver: break @todo loop into a callback
solver: note the synchronous nature of index preparation
solver: add a TODO note about making this fully evented
view: enforce trailing slash for /$INBOX/$OID/s/ endpoints
solver: restore diagnostics and deal with CRLF
www: admin-configurable CSS via "publicinbox.css"
$INBOX/_/text/color/ and sample user-side CSS
viewdiff: support diff-highlighting w/o coderepo
viewdiff: cleanup state transitions a bit
viewdiff: quote attributes for Atom feed
t/check-www-inbox: use xmlstarlet to validate Atom if available
viewdiff: do not link to 0{7,40} blobs (again)
viewvcs: disable white-space prewrap in blob view
solver: force quoted-printable bodies to LF
solver: remove extra "^index $OID..$OID" line
config: each_inbox iteration preserves config order
t/check-www-inbox: warn on missing Content-Type
highlight: initial wrapper and PSGI service
hval: split out escape sequences to a separate table
t/check-www-inbox: trap SIGINT for File::Temp destruction
Documentation/design_www.txt | 6 +-
MANIFEST | 15 +
Makefile.PL | 3 +
TODO | 2 -
contrib/css/216dark.css | 26 ++
contrib/css/216light.css | 25 ++
contrib/css/README | 41 +++
examples/highlight.psgi | 13 +
examples/public-inbox.psgi | 2 +-
lib/PublicInbox/Config.pm | 96 +++++-
lib/PublicInbox/Git.pm | 87 ++++-
lib/PublicInbox/HlMod.pm | 126 ++++++++
lib/PublicInbox/Hval.pm | 38 +--
lib/PublicInbox/SolverGit.pm | 454 +++++++++++++++++++++++++++
lib/PublicInbox/UserContent.pm | 78 +++++
lib/PublicInbox/View.pm | 51 ++-
lib/PublicInbox/ViewDiff.pm | 161 ++++++++++
lib/PublicInbox/ViewVCS.pm | 110 +++++++
lib/PublicInbox/WWW.pm | 152 ++++++++-
lib/PublicInbox/WwwHighlight.pm | 73 +++++
lib/PublicInbox/WwwStream.pm | 4 +-
lib/PublicInbox/WwwText.pm | 35 +++
script/public-inbox-httpd | 2 +-
t/check-www-inbox.perl | 26 +-
t/config.t | 19 ++
t/git.t | 7 +-
t/hl_mod.t | 54 ++++
t/perf-msgview.t | 50 +++
t/solve/0001-simple-mod.patch | 20 ++
t/solve/0002-rename-with-modifications.patch | 37 +++
t/solver_git.t | 91 ++++++
t/view.t | 2 +
32 files changed, 1841 insertions(+), 65 deletions(-)
create mode 100644 contrib/css/216dark.css
create mode 100644 contrib/css/216light.css
create mode 100644 contrib/css/README
create mode 100644 examples/highlight.psgi
create mode 100644 lib/PublicInbox/HlMod.pm
create mode 100644 lib/PublicInbox/SolverGit.pm
create mode 100644 lib/PublicInbox/UserContent.pm
create mode 100644 lib/PublicInbox/ViewDiff.pm
create mode 100644 lib/PublicInbox/ViewVCS.pm
create mode 100644 lib/PublicInbox/WwwHighlight.pm
create mode 100644 t/hl_mod.t
create mode 100644 t/perf-msgview.t
create mode 100644 t/solve/0001-simple-mod.patch
create mode 100644 t/solve/0002-rename-with-modifications.patch
create mode 100644 t/solver_git.t
^ permalink raw reply [relevance 7%]
* [PATCH 35/37] highlight: initial wrapper and PSGI service
2019-01-21 20:52 7% [PATCH 00/37] viewvcs: diff highlighting and more Eric Wong
@ 2019-01-21 20:52 4% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2019-01-21 20:52 UTC (permalink / raw)
To: meta
I'll probably expose the PSGI service for cgit,
but it could be useful to others as well.
---
MANIFEST | 4 +
examples/highlight.psgi | 13 ++++
lib/PublicInbox/HlMod.pm | 126 ++++++++++++++++++++++++++++++++
lib/PublicInbox/WwwHighlight.pm | 73 ++++++++++++++++++
t/hl_mod.t | 54 ++++++++++++++
5 files changed, 270 insertions(+)
create mode 100644 examples/highlight.psgi
create mode 100644 lib/PublicInbox/HlMod.pm
create mode 100644 lib/PublicInbox/WwwHighlight.pm
create mode 100644 t/hl_mod.t
diff --git a/MANIFEST b/MANIFEST
index 53d51b2..e627206 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -38,6 +38,7 @@ examples/apache2_perl.conf
examples/apache2_perl_old.conf
examples/cgi-webrick.rb
examples/cgit-commit-filter.lua
+examples/highlight.psgi
examples/logrotate.conf
examples/public-inbox-config
examples/public-inbox-httpd.socket
@@ -74,6 +75,7 @@ lib/PublicInbox/GitHTTPBackend.pm
lib/PublicInbox/HTTP.pm
lib/PublicInbox/HTTPD.pm
lib/PublicInbox/HTTPD/Async.pm
+lib/PublicInbox/HlMod.pm
lib/PublicInbox/Hval.pm
lib/PublicInbox/Import.pm
lib/PublicInbox/Inbox.pm
@@ -120,6 +122,7 @@ lib/PublicInbox/WWW.pod
lib/PublicInbox/WatchMaildir.pm
lib/PublicInbox/WwwAtomStream.pm
lib/PublicInbox/WwwAttach.pm
+lib/PublicInbox/WwwHighlight.pm
lib/PublicInbox/WwwStream.pm
lib/PublicInbox/WwwText.pm
sa_config/Makefile
@@ -170,6 +173,7 @@ t/git-http-backend.psgi
t/git-http-backend.t
t/git.fast-import-data
t/git.t
+t/hl_mod.t
t/html_index.t
t/httpd-corner.psgi
t/httpd-corner.t
diff --git a/examples/highlight.psgi b/examples/highlight.psgi
new file mode 100644
index 0000000..244b128
--- /dev/null
+++ b/examples/highlight.psgi
@@ -0,0 +1,13 @@
+#!/usr/bin/perl -w
+# Copyright (C) 2019 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+#
+# Usage: plackup [OPTIONS] /path/to/this/file
+# A startup command for development which monitors changes:
+# plackup -I lib -o 127.0.0.1 -R lib -r examples/highlight.psgi
+use strict;
+use warnings;
+use PublicInbox::WwwHighlight;
+use Plack::Builder;
+my $hl = PublicInbox::WwwHighlight->new;
+builder { sub { $hl->call(@_) }; }
diff --git a/lib/PublicInbox/HlMod.pm b/lib/PublicInbox/HlMod.pm
new file mode 100644
index 0000000..5cbfb29
--- /dev/null
+++ b/lib/PublicInbox/HlMod.pm
@@ -0,0 +1,126 @@
+# Copyright (C) 2019 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# I have no idea how stable or safe this is for handling untrusted
+# input, but it seems to have been around for a while, and the
+# highlight(1) executable is supported by gitweb and cgit.
+#
+# I'm also unsure about API stability, but highlight 3.x seems to
+# have been around a few years and ikiwiki (apparently the only
+# user of the SWIG/Perl bindings, at least in Debian) hasn't needed
+# major changes to support it in recent years.
+#
+# Some code stolen from ikiwiki (GPL-2.0+)
+# wrapper for SWIG-generated highlight.pm bindings
+package PublicInbox::HlMod;
+use strict;
+use warnings;
+use highlight; # SWIG-generated stuff
+
+sub _parse_filetypes ($) {
+ my $ft_conf = $_[0]->searchFile('filetypes.conf') or
+ die 'filetypes.conf not found by highlight';
+ open my $fh, '<', $ft_conf or die "failed to open($ft_conf): $!";
+ local $/;
+ my $cfg = <$fh>;
+ my %ext2lang;
+ my @shebang; # order matters
+
+ # Hrm... why isn't this exposed by the highlight API?
+ # highlight >= 3.2 format (bind-style) (from ikiwiki)
+ while ($cfg =~ /\bLang\s*=\s*\"([^"]+)\"[,\s]+
+ Extensions\s*=\s*{([^}]+)}/sgx) {
+ my $lang = $1;
+ foreach my $bit (split(/,/, $2)) {
+ $bit =~ s/.*"(.*)".*/$1/s;
+ $ext2lang{$bit} = $lang;
+ }
+ }
+ # AFAIK, all the regexps used in the filetypes.conf distributed
+ # by highlight work as Perl REs
+ while ($cfg =~ /\bLang\s*=\s*\"([^"]+)\"[,\s]+
+ Shebang\s*=\s*\[\s*\[([^}]+)\s*\]\s*\]\s*}\s*,/sgx) {
+ my ($lang, $re) = ($1, $2);
+ eval {
+ my $perl_re = qr/$re/;
+ push @shebang, [ $lang, $perl_re ];
+ };
+ if ($@) {
+ warn "$lang shebang=[[$re]] did not work in Perl: $@";
+ }
+ }
+ (\%ext2lang, \@shebang);
+}
+
+sub new {
+ my ($class) = @_;
+ my $dir = highlight::DataDir->new;
+ $dir->initSearchDirectories('');
+ my ($ext2lang, $shebang) = _parse_filetypes($dir);
+ bless {
+ -dir => $dir,
+ -ext2lang => $ext2lang,
+ -shebang => $shebang,
+ }, $class;
+}
+
+sub _shebang2lang ($$) {
+ my ($self, $str) = @_;
+ my $shebang = $self->{-shebang};
+ foreach my $s (@$shebang) {
+ return $s->[0] if $$str =~ $s->[1];
+ }
+ undef;
+}
+
+sub _path2lang ($$) {
+ my ($self, $path) = @_;
+ my ($ext) = ($path =~ m!([^\\/\.]+)\z!);
+ $ext = lc($ext);
+ $self->{-ext2lang}->{$ext} || $ext;
+}
+
+sub do_hl {
+ my ($self, $str, $path) = @_;
+ my $lang = defined($path) ? _path2lang($self, $path) : undef;
+ my $dir = $self->{-dir};
+ my $langpath;
+ if (defined $lang) {
+ $langpath = $dir->getLangPath("$lang.lang") or return;
+ $langpath = undef unless -f $langpath;
+ }
+ unless (defined $langpath) {
+ $lang = _shebang2lang($self, $str) or return;
+ $langpath = $dir->getLangPath("$lang.lang") or return;
+ $langpath = undef unless -f $langpath;
+ }
+ return unless defined $langpath;
+
+ my $gen = $self->{$langpath} ||= do {
+ my $g = highlight::CodeGenerator::getInstance($highlight::HTML);
+ $g->setFragmentCode(1); # generate html fragment
+ $g->setHTMLEnclosePreTag(1); # include <pre>
+
+ # whatever theme works
+ my $themepath = $dir->getThemePath('print.theme');
+ $g->initTheme($themepath);
+ $g->loadLanguage($langpath);
+ $g->setEncoding('utf-8');
+ $g;
+ };
+ \($gen->generateString($$str))
+}
+
+# SWIG instances aren't reference-counted, but $self is;
+# so we need to delete all the CodeGenerator instances manually
+# at our own destruction
+sub DESTROY {
+ my ($self) = @_;
+ foreach my $gen (values %$self) {
+ if (ref($gen) eq 'highlight::CodeGenerator') {
+ highlight::CodeGenerator::deleteInstance($gen);
+ }
+ }
+}
+
+1;
diff --git a/lib/PublicInbox/WwwHighlight.pm b/lib/PublicInbox/WwwHighlight.pm
new file mode 100644
index 0000000..3d6ca03
--- /dev/null
+++ b/lib/PublicInbox/WwwHighlight.pm
@@ -0,0 +1,73 @@
+# Copyright (C) 2019 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Standalone PSGI app to provide syntax highlighting as-a-service
+# via "highlight" Perl module ("libhighlight-perl" in Debian).
+#
+# This allows exposing highlight as a persistent HTTP service for
+# other scripts via HTTP PUT requests. PATH_INFO will be used
+# as a hint for detecting the language for highlight.
+#
+# The following example using curl(1) will do the right thing
+# regarding the file extension:
+#
+# curl -HExpect: -T /path/to/file http://example.com/
+#
+# You can also force a file extension by giving a path
+# (in this case, "c") via:
+#
+# curl -HExpect: -T /path/to/file http://example.com/x.c
+
+package PublicInbox::WwwHighlight;
+use strict;
+use warnings;
+use HTTP::Status qw(status_message);
+use parent qw(PublicInbox::HlMod);
+
+# TODO: support highlight(1) for distros which don't package the
+# SWIG extension. Also, there may be admins who don't want to
+# have ugly SWIG-generated code in a long-lived Perl process.
+
+sub r ($) {
+ my ($code) = @_;
+ my $msg = status_message($code);
+ my $len = length($msg);
+ [ $code, [qw(Content-Type text/plain Content-Length), $len], [$msg] ]
+}
+
+# another slurp API hogging up all my memory :<
+# This is capped by whatever the PSGI server allows,
+# $ENV{GIT_HTTP_MAX_REQUEST_BUFFER} for PublicInbox::HTTP (10 MB)
+sub read_in_full ($) {
+ my ($env) = @_;
+
+ my $in = $env->{'psgi.input'};
+ my $off = 0;
+ my $buf = '';
+ my $len = $env->{CONTENT_LENGTH} || 8192;
+ while (1) {
+ my $r = $in->read($buf, $len, $off);
+ last unless defined $r;
+ return \$buf if $r == 0;
+ $off += $r;
+ }
+ $env->{'psgi.errors'}->print("input read error: $!\n");
+}
+
+# entry point for PSGI
+sub call {
+ my ($self, $env) = @_;
+ my $req_method = $env->{REQUEST_METHOD};
+
+ return r(405) if $req_method ne 'PUT';
+
+ my $bref = read_in_full($env) or return r(500);
+ $bref = $self->do_hl($bref, $env->{PATH_INFO});
+
+ my $h = [ 'Content-Type', 'text/html; charset=UTF-8' ];
+ push @$h, 'Content-Length', bytes::length($$bref);
+
+ [ 200, $h, [ $$bref ] ]
+}
+
+1;
diff --git a/t/hl_mod.t b/t/hl_mod.t
new file mode 100644
index 0000000..b8b8eb9
--- /dev/null
+++ b/t/hl_mod.t
@@ -0,0 +1,54 @@
+#!/usr/bin/perl -w
+# Copyright (C) 2019 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use warnings;
+use Test::More;
+eval { require highlight } or
+ plan skip_all => 'failed to load highlight.pm';
+use_ok 'PublicInbox::HlMod';
+my $hls = PublicInbox::HlMod->new;
+ok($hls, 'initialized OK');
+is($hls->_shebang2lang(\"#!/usr/bin/perl -w\n"), 'perl', 'perl shebang OK');
+is($hls->{-ext2lang}->{'pm'}, 'perl', '.pm suffix OK');
+is($hls->{-ext2lang}->{'pl'}, 'perl', '.pl suffix OK');
+is($hls->_path2lang('Makefile'), 'make', 'Makefile OK');
+my $str = do { local $/; open(my $fh, __FILE__); <$fh> };
+my $orig = $str;
+
+{
+ my $ref = $hls->do_hl(\$str, 'foo.perl');
+ is(ref($ref), 'SCALAR', 'got a scalar reference back');
+ like($$ref, qr/I can see you!/, 'we can see ourselves in output');
+
+ use PublicInbox::Spawn qw(which);
+ if (eval { require IPC::Run } && which('w3m')) {
+ require File::Temp;
+ my $cmd = [ qw(w3m -T text/html -dump -config /dev/null) ];
+ my ($out, $err) = ('', '');
+ IPC::Run::run($cmd, $ref, \$out, \$err);
+ # expand tabs and normalize whitespace,
+ # w3m doesn't preserve tabs
+ $orig =~ s/\t/ /gs;
+ $out =~ s/\s*\z//sg;
+ $orig =~ s/\s*\z//sg;
+ is($out, $orig, 'w3m output matches');
+ }
+}
+
+my $nr = $ENV{TEST_MEMLEAK};
+if ($nr && -r "/proc/$$/status") {
+ my $fh;
+ open $fh, '<', "/proc/$$/status";
+ diag "starting at memtest at ".join('', grep(/VmRSS:/, <$fh>));
+ PublicInbox::HlMod->new->do_hl(\$orig) for (1..$nr);
+ open $fh, '<', "/proc/$$/status";
+ diag "creating $nr instances: ".join('', grep(/VmRSS:/, <$fh>));
+ my $hls = PublicInbox::HlMod->new;
+ $hls->do_hl(\$orig) for (1..$nr);
+ $hls = undef;
+ open $fh, '<', "/proc/$$/status";
+ diag "reused instance $nr times: ".join('', grep(/VmRSS:/, <$fh>));
+}
+
+done_testing;
--
EW
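To see what the Extensions parsing in _parse_filetypes does without
installing the highlight bindings, here is a minimal sketch that runs
the same regexp over an inline sample. The sample text is an assumption
modelled on the bind-style format mentioned in the patch comments; it is
not copied from a real filetypes.conf. parse_extensions() is a
hypothetical name, and the braces are backslash-escaped here, which
should match the same text as the unescaped form in the patch.

```perl
#!/usr/bin/perl -w
use strict;
use warnings;

# Hypothetical sample in the "highlight >= 3.2" style; a real
# filetypes.conf may differ in details.
my $sample = <<'EOF';
FileMapping = {
  { Lang="perl", Extensions={"pl", "pm", "perl"} },
  { Lang="make", Extensions={"makefile", "mk"} },
}
EOF

# parse_extensions() is an illustrative name, not part of the patch
sub parse_extensions {
	my ($cfg) = @_;
	my %ext2lang;
	while ($cfg =~ /\bLang\s*=\s*\"([^"]+)\"[,\s]+
			Extensions\s*=\s*\{([^}]+)\}/sgx) {
		my $lang = $1;
		foreach my $bit (split(/,/, $2)) {
			# keep only what is between the double quotes
			$bit =~ s/.*"(.*)".*/$1/s;
			$ext2lang{$bit} = $lang;
		}
	}
	\%ext2lang;
}

my $ext2lang = parse_extensions($sample);
print "$_ => $ext2lang->{$_}\n" for sort keys %$ext2lang;
```

The resulting hash is what _path2lang consults after lowercasing the
last path component, so a lookup miss falls back to using the
extension itself as the language name.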
^ permalink raw reply related [relevance 4%]
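For scripts that would rather not shell out to curl(1), the PUT
interface described in the WwwHighlight.pm header could be driven from
Perl roughly as follows. This is only a sketch: highlight_via_service()
is a hypothetical helper, and the base URL and port in the usage
comment are assumptions about where highlight.psgi happens to be
running. The file name is only passed along as the PATH_INFO language
hint and is not URL-escaped here.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# highlight_via_service() is a hypothetical helper, not part of the patch.
# $ua:   any object with an HTTP::Tiny-compatible put() method
# $base: base URL of a running highlight.psgi (assumed)
# $name: file name, used only as a language hint via PATH_INFO
# $src:  scalar ref to the source text to highlight
sub highlight_via_service {
	my ($ua, $base, $name, $src) = @_;
	$base =~ s!/+\z!!; # avoid a double slash before $name
	my $res = $ua->put("$base/$name", { content => $$src });
	return \($res->{content}) if $res->{success};
	warn "highlight failed: $res->{status} $res->{reason}\n";
	undef;
}

# Usage (needs a running service; URL and port are assumptions):
# my $html = highlight_via_service(HTTP::Tiny->new,
#	'http://127.0.0.1:5000', 'hl_mod.t', \$source);
# print $$html if $html;
```

Taking the user agent as a parameter keeps the helper easy to exercise
with a stub object. As posted, the service answers 405 to anything
other than PUT and 500 when reading the request body fails, so callers
should be prepared for an undef return.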
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git