user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 09/24] www: only emit ASCII chars in attachment filenames
Date: Tue,  4 Jun 2019 11:27:33 +0000
Message-ID: <20190604112748.23598-10-e@80x24.org>
In-Reply-To: <20190604112748.23598-1-e@80x24.org>

We don't want to emit funky URLs which can be lost in
translation or cause problems with non-Unicode-aware
clients.

Then, don't accept non-ASCII filenames in URLs, since
a manually-generated URL/filename in attachment downloads
could use Unicode homographs to confuse folks who
download the attachment.
---
 lib/PublicInbox/Hval.pm | 3 +++
 lib/PublicInbox/View.pm | 2 +-
 lib/PublicInbox/WWW.pm  | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index 95a0f70..2b44397 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -13,6 +13,9 @@ our @EXPORT_OK = qw/ascii_html obfuscate_addrs to_filename src_escape
 		to_attr from_attr/;
 my $enc_ascii = find_encoding('us-ascii');
 
+# safe-ish acceptable filename pattern for portability
+our $FN = '[a-zA-Z0-9][a-zA-Z0-9_\-\.]+[a-zA-Z0-9]'; # needs \z anchor
+
 sub new {
 	my ($class, $raw, $href) = @_;
 
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 09afdaf..83ae99b 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -528,7 +528,7 @@ sub attach_link ($$$$;$) {
 	$desc = $fn unless defined $desc;
 	$desc = '' unless defined $desc;
 	my $sfn;
-	if (defined $fn && $fn =~ /\A[[:alnum:]][\w\.-]+[[:alnum:]]\z/) {
+	if (defined $fn && $fn =~ /\A$PublicInbox::Hval::FN\z/o) {
 		$sfn = $fn;
 	} elsif ($ct eq 'text/plain') {
 		$sfn = 'a.txt';
diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index b6f18f8..50b6950 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -28,7 +28,7 @@ use PublicInbox::UserContent;
 our $INBOX_RE = qr!\A/([\w\-][\w\.\-]*)!;
 our $MID_RE = qr!([^/]+)!;
 our $END_RE = qr!(T/|t/|t\.mbox(?:\.gz)?|t\.atom|raw|)!;
-our $ATTACH_RE = qr!(\d[\.\d]*)-([[:alnum:]][\w\.-]+[[:alnum:]])!i;
+our $ATTACH_RE = qr!([0-9][0-9\.]*)-($PublicInbox::Hval::FN)!;
 our $OID_RE = qr![a-f0-9]{7,40}!;
 
 sub new {
-- 
EW
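
A minimal sketch of what the tightened pattern changes, assuming the
filename arrives as a decoded Perl character string. The $FN value is
copied from the Hval.pm hunk above; fn_ok and old_fn_ok are
hypothetical helpers written for this illustration and are not part of
public-inbox.

```perl
use strict;
use warnings;

# Pattern copied from the Hval.pm hunk; callers add the \A and \z anchors.
my $FN = '[a-zA-Z0-9][a-zA-Z0-9_\-\.]+[a-zA-Z0-9]';

# Hypothetical helper: the post-patch check, returns 1 or 0.
sub fn_ok {
	my ($fn) = @_;
	(defined $fn && $fn =~ /\A$FN\z/) ? 1 : 0;
}

# Hypothetical helper: the pre-patch check from View.pm, for comparison.
sub old_fn_ok {
	my ($fn) = @_;
	(defined $fn && $fn =~ /\A[[:alnum:]][\w\.-]+[[:alnum:]]\z/) ? 1 : 0;
}

my @cases = (
	[ 'plain ASCII', 'patch.diff' ],
	[ 'Cyrillic U+0430 for "a"', "p\x{430}tch.diff" ],
	[ 'leading dot', '.hidden' ],
	[ 'two chars (pattern needs 3+)', 'ab' ],
	[ 'trailing newline', "patch.diff\n" ],
);
for my $c (@cases) {
	my ($label, $fn) = @$c;
	printf "%-30s old=%d new=%d\n", $label, old_fn_ok($fn), fn_ok($fn);
}

# On character strings, \d matches any Unicode decimal digit;
# the explicit [0-9] range used in the new $ATTACH_RE does not.
my $idx = "\x{661}"; # ARABIC-INDIC DIGIT ONE
printf "%-30s \\d=%d [0-9]=%d\n", 'non-ASCII digit',
	($idx =~ /\A\d\z/ ? 1 : 0), ($idx =~ /\A[0-9]\z/ ? 1 : 0);
```

The Cyrillic lookalike is the homograph case from the commit message:
the old \w-based check accepts it, while the explicit ASCII ranges
reject it. The trailing-newline case is why the comment on $FN asks
for a \z anchor rather than $.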


Thread overview: 26+ messages
2019-06-04 11:27 [PATCH 00/24] fix IDN linkification, add paranoia Eric Wong
2019-06-04 11:27 ` [PATCH 01/24] linkify: support Internationalized Domain Names in URLs Eric Wong
2019-06-04 11:27 ` [PATCH 02/24] nntp: be explicit about ASCII digit matches Eric Wong
2019-06-04 11:27 ` [PATCH 03/24] nntp: ensure we only handle ASCII whitespace Eric Wong
2019-06-04 11:27 ` [PATCH 04/24] mid: id_compress requires ASCII-clean words Eric Wong
2019-06-04 11:27 ` [PATCH 05/24] feed: only accept ASCII digits for ref~$N Eric Wong
2019-06-04 11:27 ` [PATCH 06/24] http: require SERVER_PORT to be ASCII digit Eric Wong
2019-06-04 11:27 ` [PATCH 07/24] wwwlisting: require ASCII digit for port number Eric Wong
2019-06-04 11:27 ` [PATCH 08/24] wwwattach: only pass the charset through if ASCII Eric Wong
2019-06-04 11:27 ` Eric Wong [this message]
2019-06-04 11:27 ` [PATCH 10/24] www: require ASCII filenames in git blob downloads Eric Wong
2019-06-04 11:27 ` [PATCH 11/24] config: do not accept non-ASCII digits in cgitrc params Eric Wong
2019-06-04 11:27 ` [PATCH 12/24] newswww: only accept ASCII digits as article numbers Eric Wong
2019-06-04 11:27 ` [PATCH 13/24] view: require YYYYmmDD(HHMMSS) timestamps to be ASCII Eric Wong
2019-06-04 11:27 ` [PATCH 14/24] githttpbackend: require Range:, Status: to be ASCII digits Eric Wong
2019-06-04 11:27 ` [PATCH 15/24] searchview: do not allow non-ASCII offsets and limits Eric Wong
2019-06-04 11:27 ` [PATCH 16/24] msgtime: require ASCII digits for parsing dates Eric Wong
2019-06-04 11:27 ` [PATCH 17/24] filter/rubylang: require ASCII digit for mailcount Eric Wong
2019-06-04 11:27 ` [PATCH 18/24] inbox: require ASCII digits for feedmax var Eric Wong
2019-06-04 11:27 ` [PATCH 19/24] solver|viewdiff: restrict digit matches to ASCII Eric Wong
2019-06-04 11:27 ` [PATCH 20/24] www: require ASCII digit for git epoch Eric Wong
2019-06-04 11:27 ` [PATCH 21/24] require ASCII digits for local FS items Eric Wong
2019-06-04 11:27 ` [PATCH 22/24] githttpbackend: require ASCII in path Eric Wong
2019-06-04 11:27 ` [PATCH 23/24] www: require ASCII range for mbox downloads Eric Wong
2019-06-04 11:27 ` [PATCH 24/24] www: require ASCII word characters for CSS filenames Eric Wong
2019-06-05  2:18 ` [PATCH 25/24] tighten up digit matches to ASCII for git output Eric Wong
