* [PATCH 1/6] viewvcs: cleanup utf8 handling
2019-02-05 11:10 7% [PATCH 0/6] highlighting cleanups + help update Eric Wong
@ 2019-02-05 11:10 6% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2019-02-05 11:10 UTC
To: meta
Favor in-place utf8::decode since it's a bit faster without
method dispatch overhead; we don't care about validity just
yet.
HlMod->do_hl itself should return "utf8" strings, since other
parts of our code can use it, so it's not the job of ViewVCS to
post-process HlMod output.
---
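For readers comparing the two decoding styles mentioned above,
here is a minimal standalone sketch. It is not code from this
series: the sample octets and all variable names are made up
for illustration. It only shows that, for valid input, the
core utf8::decode builtin (in-place) and the Encode object's
decode method (returns a new string) yield the same characters,
and that utf8::decode leaves invalid input untouched.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical sample: raw UTF-8 octets for "caf" + e-acute, as a
# binding with no Unicode awareness might hand them back.
my $octets = "caf\xc3\xa9";

# Method-dispatch style (what ViewVCS did before this patch).
# Decode a copy so the original octets stay available below.
my $enc_utf8 = find_encoding('UTF-8');
my $copy = $octets;
my $decoded = $enc_utf8->decode($copy);

# In-place style (what this patch switches to).
my $inplace = $octets;
my $ok = utf8::decode($inplace);

printf("octets=%d chars=%d same=%s\n",
	length($octets), length($inplace),
	$decoded eq $inplace ? 'yes' : 'no');

# Invalid UTF-8 (a lone Latin-1 byte) is not converted:
# utf8::decode returns false and the string stays as it was.
my $bad = "caf\xe9";
my $bad_ok = utf8::decode($bad);
print "bad input decoded: ", ($bad_ok ? 'yes' : 'no'), "\n";
```

The false return value on invalid input is what a later
"detect + convert" step could key on; this sketch does not
attempt that.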
lib/PublicInbox/HlMod.pm | 7 ++++++-
lib/PublicInbox/ViewVCS.pm | 6 ++----
t/hl_mod.t | 1 +
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/lib/PublicInbox/HlMod.pm b/lib/PublicInbox/HlMod.pm
index 237ffac..decfd71 100644
--- a/lib/PublicInbox/HlMod.pm
+++ b/lib/PublicInbox/HlMod.pm
@@ -107,7 +107,12 @@ sub do_hl {
$g->setEncoding('utf-8');
$g;
};
- \($gen->generateString($$str))
+
+ # we assume $$str is valid UTF-8, but the SWIG binding doesn't
+ # know that, so ensure it's marked as UTF-8 even if it isnt...
+ my $out = $gen->generateString($$str);
+ utf8::decode($out);
+ \$out;
}
# SWIG instances aren't reference-counted, but $self is;
diff --git a/lib/PublicInbox/ViewVCS.pm b/lib/PublicInbox/ViewVCS.pm
index d67b5eb..acdd822 100644
--- a/lib/PublicInbox/ViewVCS.pm
+++ b/lib/PublicInbox/ViewVCS.pm
@@ -16,7 +16,6 @@
package PublicInbox::ViewVCS;
use strict;
use warnings;
-use Encode qw(find_encoding);
use PublicInbox::SolverGit;
use PublicInbox::WwwStream;
use PublicInbox::Linkify;
@@ -33,7 +32,6 @@ END { $hl = undef };
my %QP_MAP = ( A => 'oid_a', B => 'oid_b', a => 'path_a', b => 'path_b' );
my $max_size = 1024 * 1024; # TODO: configurable
-my $enc_utf8 = find_encoding('UTF-8');
my $BIN_DETECT = 8000; # same as git
sub html_page ($$$) {
@@ -122,14 +120,14 @@ sub solve_result {
return html_page($ctx, 200, \$log);
}
- $$blob = $enc_utf8->decode($$blob);
+ # TODO: detect + convert to ensure validity
+ utf8::decode($$blob);
my $nl = ($$blob =~ tr/\n/\n/);
my $pad = length($nl);
$l->linkify_1($$blob);
my $ok = $hl->do_hl($blob, $path) if $hl;
if ($ok) {
- $$ok = $enc_utf8->decode($$ok);
src_escape($$ok);
$blob = $ok;
} else {
diff --git a/t/hl_mod.t b/t/hl_mod.t
index 80f8890..c402f1f 100644
--- a/t/hl_mod.t
+++ b/t/hl_mod.t
@@ -19,6 +19,7 @@ my $orig = $str;
{
my $ref = $hls->do_hl(\$str, 'foo.perl');
is(ref($ref), 'SCALAR', 'got a scalar reference back');
+ ok(utf8::valid($$ref), 'resulting string is utf8::valid');
like($$ref, qr/I can see you!/, 'we can see ourselves in output');
like($$ref, qr/&&/, 'escaped');
--
EW
* [PATCH 0/6] highlighting cleanups + help update
@ 2019-02-05 11:10 7% Eric Wong
2019-02-05 11:10 6% ` [PATCH 1/6] viewvcs: cleanup utf8 handling Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2019-02-05 11:10 UTC
To: meta
Experimenting with Markdown-style "```$LANG" block support
for our user documentation. It makes the CSS example easier
to follow when the CSS source is in front of the user.
Markdown-style blocks definitely won't be enabled by default for
emails. I don't want to encourage people to use Markdown in
emails (as that inevitably ends with "which flavor?") and leads
to the same mess of privacy/compatibility problems we have with
HTML mail.
But preserving the WYSIWYG nature of plain-text while allowing a
tiny subset of Markdown (we already respect it for
linkification) might be useful for mailing lists which forward
messages from Markdown-supporting forums/trackers (e.g. Redmine
and the ruby-core mailing list).
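To make the "```$LANG" idea concrete, here is a rough sketch of
how such blocks could be picked out of plain text while leaving
everything else untouched. This is an assumption-laden
illustration, not the HlMod code in this series: split_fenced
is a hypothetical helper, and the rule that $LANG consists of
word characters only is my own simplification.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into segments.  Each segment is
# [ $lang, $body ]; $lang is undef for ordinary text and the tag
# after the opening fence for a fenced block.
sub split_fenced {
	my ($text) = @_;
	my @segs;
	my $pos = 0;
	# opening fence with a language tag, lazily matched body,
	# then a closing fence on a line of its own
	while ($text =~ /^```(\w+)\n(.*?)^```(?:\n|\z)/msg) {
		my ($start, $end) = ($-[0], $+[0]);
		if ($start > $pos) {
			my $plain = substr($text, $pos, $start - $pos);
			push @segs, [ undef, $plain ];
		}
		push @segs, [ $1, $2 ];
		$pos = $end;
	}
	# whatever follows the last fenced block stays plain text
	push @segs, [ undef, substr($text, $pos) ] if $pos < length($text);
	\@segs;
}

my $msg = "See below:\n```css\nbody { color: red }\n```\nThanks\n";
my $segs = split_fenced($msg);
for my $seg (@$segs) {
	my ($lang, $body) = @$seg;
	printf("[%s] %s", defined $lang ? $lang : 'plain', $body);
}
```

A caller could hand the fenced segments to a highlighter keyed
on the language tag and pass plain segments through the normal
escaping and linkification path, so text without fences renders
exactly as before.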
Eric Wong (6):
viewvcs: cleanup utf8 handling
hlmod: hoist out do_hl_lang sub
hlmod: make into a singleton
hlmod: do_hl* performs src_escape immediately
hlmod: support "```$LANG" blocks in text
wwwtext: inline sample CSS and use highlight
contrib/css/216dark.css | 30 ++++++++++--------
lib/PublicInbox/HlMod.pm | 70 ++++++++++++++++++++++++++++--------------
lib/PublicInbox/UserContent.pm | 16 +++++-----
lib/PublicInbox/ViewVCS.pm | 14 ++-------
lib/PublicInbox/WwwText.pm | 35 ++++++++++-----------
t/hl_mod.t | 34 ++++++++++++--------
6 files changed, 114 insertions(+), 85 deletions(-)
--
EW
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git