author | Eric Wong <e@80x24.org> | 2021-04-11 05:32:55 +0000 |
---|---|---|
committer | Eric Wong <e@80x24.org> | 2021-04-11 06:40:21 +0000 |
commit | e98c3f01267c810ee214be87d0ee1bd575b23b88 (patch) | |
tree | 938f62dce4d4faa792b9f2813b0c6f155d10695b /lib/PublicInbox/Hval.pm | |
parent | ea4e9025dd14f251996baf724e04fc478375b6a2 (diff) | |
download | public-inbox-e98c3f01267c810ee214be87d0ee1bd575b23b88.tar.gz |
As they are likely Message-IDs. If an email address ends up in a URL, then it's likely public, so there's even less reason to obfuscate that particular address.

[km: add xt/perf-obfuscate.t]
[ew: modernize perf test (5.10.1), use diag instead of print]

This version of the patch avoids the massive slowdown noted by Kyle in <https://public-inbox.org/meta/87wnt9or6t.fsf@kyleam.com/>.

Performance remains roughly the same, if not slightly faster (which may be due to me testing this on a busy server). Results from xt/perf-obfuscate.t against 6078 messages on a local mirror of <https://public-inbox.org/meta/>:

before: 6.67 usr + 0.04 sys = 6.71 CPU
after: 6.64 usr + 0.04 sys = 6.68 CPU

Reported-by: Kyle Meyer <kyle@kyleam.com>
Helped-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87a6q8p5qa.fsf@kyleam.com/
Diffstat (limited to 'lib/PublicInbox/Hval.pm')
-rw-r--r-- | lib/PublicInbox/Hval.pm | 21 |
1 files changed, 14 insertions, 7 deletions
diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index d20f70ae..eab4738e 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -82,15 +82,22 @@ sub obfuscate_addrs ($$;$) {
 	my $repl = $_[2] // '•';
 	my $re = $ibx->{-no_obfuscate_re}; # regex of domains
 	my $addrs = $ibx->{-no_obfuscate}; # { $address => 1 }
-	$_[1] =~ s/(([\w\.\+=\-]+)\@([\w\-]+\.[\w\.\-]+))/
-		my ($addr, $user, $domain) = ($1, $2, $3);
-		if ($addrs->{$addr} || ((defined $re && $domain =~ $re))) {
-			$addr;
+	$_[1] =~ s#(\S+)\@([\w\-]+\.[\w\.\-]+)#
+		my ($pfx, $domain) = ($1, $2);
+		if (index($pfx, '://') > 0 || $pfx !~ s/([\w\.\+=\-]+)\z//) {
+			"$pfx\@$domain";
 		} else {
-			$domain =~ s!([^\.]+)\.!$1$repl!;
-			$user . '@' . $domain
+			my $user = $1;
+			my $addr = "$user\@$domain";
+			if ($addrs->{$addr} || ((defined($re) &&
+						$domain =~ $re))) {
+				$pfx.$addr;
+			} else {
+				$domain =~ s!([^\.]+)\.!$1$repl!;
+				$pfx . $user . '@' . $domain
+			}
 		}
-	/sge;
+	#sge;
 }
 
 # like format_sanitized_subject in git.git pretty.c with '%f' format string
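To see what the new substitution does without a public-inbox checkout, it can be copied into a standalone script. This is only a minimal sketch: `obfuscate_sketch` and its `$opt` hash (keys `addrs` and `re`) are hypothetical stand-ins for `obfuscate_addrs` and the `$ibx->{-no_obfuscate}` / `$ibx->{-no_obfuscate_re}` fields, and it returns a copy rather than modifying `$_[1]` in place as the real sub does.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical standalone copy of the post-patch logic; not the
# PublicInbox::Hval API.  $opt->{addrs} is { $address => 1 } and
# $opt->{re} is an optional regex of domains to leave alone.
sub obfuscate_sketch {
	my ($str, $opt, $repl) = @_;
	$repl //= '•';
	my $re = $opt->{re};
	my $addrs = $opt->{addrs} // {};
	$str =~ s#(\S+)\@([\w\-]+\.[\w\.\-]+)#
		my ($pfx, $domain) = ($1, $2);
		if (index($pfx, '://') > 0 || $pfx !~ s/([\w\.\+=\-]+)\z//) {
			"$pfx\@$domain";
		} else {
			my $user = $1;
			my $addr = "$user\@$domain";
			if ($addrs->{$addr} || (defined($re) && $domain =~ $re)) {
				$pfx.$addr;
			} else {
				$domain =~ s!([^\.]+)\.!$1$repl!;
				$pfx . $user . '@' . $domain
			}
		}
	#sge;
	$str;
}

binmode(STDOUT, ':encoding(UTF-8)');
my $opt = { addrs => { 'meta@public-inbox.org' => 1 } };
for my $in ('contact user@example.com now',
		'<user@example.com>',
		'https://example.com/meta/87a6q8p5qa.fsf@kyleam.com/',
		'meta@public-inbox.org') {
	print obfuscate_sketch($in, $opt), "\n";
}
```

In this sketch, the first two inputs have the first dot of the domain replaced, while the URL and the allow-listed address pass through unchanged. The `\S+` prefix capture is what lets the callback see the `://` earlier in the same token and skip it.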