* [PATCH 0/2] hval: handle wide characters properly
@ 2020-01-19 9:40 7% Eric Wong
2020-01-19 9:40 6% ` [PATCH 1/2] hval: to_attr: support wide characters Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2020-01-19 9:40 UTC (permalink / raw)
To: meta
We don't see wide characters frequently in filenames and rarely
generate attributes with them for diffstat links, so this wasn't
noticed until now.
FWIW, I'm still confused by Perl's Unicode APIs despite reading
and rereading the perluni* manpages over the years, so there
could still be bugs :x
Eric Wong (2):
hval: to_attr: support wide characters
hval: from_attr: move to unit test
lib/PublicInbox/Hval.pm | 16 +++-------------
t/hval.t | 15 ++++++++++++++-
t/solve/0001-simple-mod.patch | 2 ++
t/solver_git.t | 11 ++++++++++-
4 files changed, 29 insertions(+), 15 deletions(-)
* [PATCH 1/2] hval: to_attr: support wide characters
2020-01-19 9:40 7% [PATCH 0/2] hval: handle wide characters properly Eric Wong
@ 2020-01-19 9:40 6% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-01-19 9:40 UTC (permalink / raw)
To: meta
We need to escape wide characters when making attribute names from
filename-looking things in diffstats.
---
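For illustration only, here is a minimal standalone sketch of the
encode, escape, decode order used in the to_attr hunk. The names
%ESC and attr_sketch are hypothetical, and the escape table is an
assumption inferred from how from_attr reverses it ("::" plus two hex
digits per octet, ":" for "/"); the real %ESCAPES table is not shown
in this patch and may differ.

```perl
use strict;
use warnings;

# Assumed escape table (not taken from the patch): every octet maps to
# "::" followed by two hex digits, and "/" maps to a single ":".
my %ESC = map { chr($_) => sprintf('::%02x', $_) } (0..255);
$ESC{'/'} = ':';

# Hypothetical stand-in for to_attr, using the same order of operations.
sub attr_sketch {
    my ($str) = @_;
    return if index($str, '//') >= 0;
    my $first = '';
    utf8::encode($str); # characters to octets, so every lookup key is 0..255
    if ($str =~ s/\A([^A-Ya-z])//ms) { # "Z" is reserved as the prefix marker
        $first = sprintf('Z%02x', ord($1));
    }
    $str =~ s/([^A-Za-z0-9_\.\-])/$ESC{$1}/egms;
    utf8::decode($str); # back to a character string
    $first . $str;
}

# U+03A9 (Greek capital omega) is the octets 0xCE 0xA9 in UTF-8.
print attr_sketch("\x{3a9}"), "\n";   # prints "Zce::a9"
print attr_sketch("a/\x{3a9}"), "\n"; # prints "a:::ce::a9"
```

Without the utf8::encode step, a wide character would have no key in
a 0..255 table like the one assumed here, so the substitution would
interpolate undef and warn.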
lib/PublicInbox/Hval.pm | 3 +++
t/solve/0001-simple-mod.patch | 2 ++
t/solver_git.t | 11 ++++++++++-
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index 7e007027..39256ee0 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -139,10 +139,12 @@ sub to_attr ($) {
return if index($str, '//') >= 0;
my $first = '';
+ utf8::encode($str); # to octets
if ($str =~ s/\A([^A-Ya-z])//ms) { # start with a letter
$first = sprintf('Z%02x', ord($1));
}
$str =~ s/([^A-Za-z0-9_\.\-])/$ESCAPES{$1}/egms;
+ utf8::decode($str); # allow wide chars
$first . $str;
}
@@ -155,6 +157,7 @@ sub from_attr ($) {
}
$str =~ s!::([a-f0-9]{2})!chr(hex($1))!egms;
$str =~ tr!:!/!;
+ utf8::decode($str);
$first . $str;
}
diff --git a/t/solve/0001-simple-mod.patch b/t/solve/0001-simple-mod.patch
index c6bb1575..c55fe310 100644
--- a/t/solve/0001-simple-mod.patch
+++ b/t/solve/0001-simple-mod.patch
@@ -3,9 +3,11 @@ To: meta@public-inbox.org
Subject: [PATCH] TODO: take expert web design advice
Date: Mon, 1 Apr 2019 08:15:20 +0000
Message-Id: <20190401081523.16213-1-BOFH@YHBT.net>
+Content-Type: text/plain; charset=utf-8
---
TODO | 2 ++
+ Ω | 5 --
1 file changed, 2 insertions(+)
diff --git a/TODO b/TODO
diff --git a/t/solver_git.t b/t/solver_git.t
index 92402c3a..92c07334 100644
--- a/t/solver_git.t
+++ b/t/solver_git.t
@@ -154,7 +154,16 @@ EOF
my $non_existent = 'ee5e32211bf62ab6531bdf39b84b6920d0b6775a';
my $client = sub {
my ($cb) = @_;
- my $res = $cb->(GET("/$name/3435775/s/"));
+ my $mid = '20190401081523.16213-1-BOFH@YHBT.net';
+ my @warn;
+ my $res = do {
+ local $SIG{__WARN__} = sub { push @warn, @_ };
+ $cb->(GET("/$name/$mid/"));
+ };
+ is_deeply(\@warn, [], 'no warnings from rendering diff');
+ like($res->content, qr!>Ω</a>!, 'omega escaped');
+
+ $res = $cb->(GET("/$name/3435775/s/"));
is($res->code, 200, 'success with existing blob');
$res = $cb->(GET("/$name/".('0'x40).'/s/'));
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git