user/dev discussion of public-inbox itself
From: Eric Wong <e@yhbt.net>
To: meta@public-inbox.org
Subject: [PATCH 1/2] hval: to_attr: support wide characters
Date: Sun, 19 Jan 2020 09:40:51 +0000
Message-ID: <20200119094052.11772-2-e@yhbt.net>
In-Reply-To: <20200119094052.11772-1-e@yhbt.net>

to_attr needs to escape wide characters when it builds attribute
names from the filename-like strings found in diffstats.
---
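
For readers following along, the encode-then-escape step described
above can be sketched as a standalone script. This is only a sketch
under stated assumptions: sketch_to_attr and %ESC are hypothetical
names, and the contents of %ESC are a guess at what a byte-indexed
escape table such as %ESCAPES might hold, since the table itself is
not part of this diff. With a table of that shape, a character above
0xff would have no entry, so converting to octets first gives every
character to be escaped a defined replacement.

```perl
use strict;
use warnings;

# Assumption: one escape entry per octet value (0..255), with '/'
# mapped to a single ':'.  The real %ESCAPES is not shown in this patch.
my %ESC = map { (chr($_) => sprintf('::%02x', $_)) } (0..255);
$ESC{'/'} = ':';

# Hypothetical stand-in for to_attr, mirroring the hunk's escaping steps.
sub sketch_to_attr {
	my ($str) = @_;
	return if index($str, '//') >= 0;

	# characters -> UTF-8 octets, so every value is <= 0xff
	utf8::encode($str);

	my $first = '';
	# same character class as the hunk; 'Z' is outside it, presumably
	# because 'Z' is the marker for an escaped first character
	if ($str =~ s/\A([^A-Ya-z])//ms) {
		$first = sprintf('Z%02x', ord($1));
	}
	$str =~ s/([^A-Za-z0-9_\.\-])/$ESC{$1}/egms;
	$first . $str;
}

print sketch_to_attr("\x{3a9}"), "\n";        # prints Zce::a9
print sketch_to_attr("t/\x{3a9} file"), "\n"; # prints t:::ce::a9::20file
```

U+03A9 (the omega used in the test fixture) is 0xce 0xa9 in UTF-8,
which is where the "ce" and "a9" in the output come from.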
 lib/PublicInbox/Hval.pm       |  3 +++
 t/solve/0001-simple-mod.patch |  2 ++
 t/solver_git.t                | 11 ++++++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index 7e007027..39256ee0 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -139,10 +139,12 @@ sub to_attr ($) {
 	return if index($str, '//') >= 0;
 
 	my $first = '';
+	utf8::encode($str); # to octets
 	if ($str =~ s/\A([^A-Ya-z])//ms) { # start with a letter
 		  $first = sprintf('Z%02x', ord($1));
 	}
 	$str =~ s/([^A-Za-z0-9_\.\-])/$ESCAPES{$1}/egms;
+	utf8::decode($str); # allow wide chars
 	$first . $str;
 }
 
@@ -155,6 +157,7 @@ sub from_attr ($) {
 	}
 	$str =~ s!::([a-f0-9]{2})!chr(hex($1))!egms;
 	$str =~ tr!:!/!;
+	utf8::decode($str);
 	$first . $str;
 }
 
diff --git a/t/solve/0001-simple-mod.patch b/t/solve/0001-simple-mod.patch
index c6bb1575..c55fe310 100644
--- a/t/solve/0001-simple-mod.patch
+++ b/t/solve/0001-simple-mod.patch
@@ -3,9 +3,11 @@ To: meta@public-inbox.org
 Subject: [PATCH] TODO: take expert web design advice
 Date: Mon, 1 Apr 2019 08:15:20 +0000
 Message-Id: <20190401081523.16213-1-BOFH@YHBT.net>
+Content-Type: text/plain; charset=utf-8
 
 ---
  TODO | 2 ++
+ Ω    | 5 --
  1 file changed, 2 insertions(+)
 
 diff --git a/TODO b/TODO
diff --git a/t/solver_git.t b/t/solver_git.t
index 92402c3a..92c07334 100644
--- a/t/solver_git.t
+++ b/t/solver_git.t
@@ -154,7 +154,16 @@ EOF
 	my $non_existent = 'ee5e32211bf62ab6531bdf39b84b6920d0b6775a';
 	my $client = sub {
 		my ($cb) = @_;
-		my $res = $cb->(GET("/$name/3435775/s/"));
+		my $mid = '20190401081523.16213-1-BOFH@YHBT.net';
+		my @warn;
+		my $res = do {
+			local $SIG{__WARN__} = sub { push @warn, @_ };
+			$cb->(GET("/$name/$mid/"));
+		};
+		is_deeply(\@warn, [], 'no warnings from rendering diff');
+		like($res->content, qr!>&#937;</a>!, 'omega escaped');
+
+		$res = $cb->(GET("/$name/3435775/s/"));
 		is($res->code, 200, 'success with existing blob');
 
 		$res = $cb->(GET("/$name/".('0'x40).'/s/'));

Thread overview: 3+ messages
2020-01-19  9:40 [PATCH 0/2] hval: handle wide characters properly Eric Wong
2020-01-19  9:40 ` Eric Wong [this message]
2020-01-19  9:40 ` [PATCH 2/2] hval: from_attr: move to unit test Eric Wong
