author | Eric Wong <e@yhbt.net> | 2020-04-03 21:06:20 +0000 |
---|---|---|
committer | Eric Wong <e@yhbt.net> | 2020-04-03 21:46:55 +0000 |
commit | 1a02e2d367b71eca9fc8093ce83fcae50873003d (patch) | |
tree | 99012da5753e87dca4293258d5e160d87b217b07 /lib/PublicInbox/MsgIter.pm | |
parent | fc92ce8845ac5f09939722537624fa48441f7c0b (diff) | |
download | public-inbox-1a02e2d367b71eca9fc8093ce83fcae50873003d.tar.gz |
These warnings seem mostly harmless since Perl will just truncate the match and start a new one on a newline boundary in our case. The only downside is we'd end up with redundant <span> tags in HTML.

Limiting the number of lines matched ourselves with `{1,$NUM}' doesn't seem prudent since lines vary in length, so we continue to defer the job of limiting matches to the Perl regexp engine.

I've noticed this warning in practice on 100K+ line patches to locale data.
Diffstat (limited to 'lib/PublicInbox/MsgIter.pm')
-rw-r--r-- | lib/PublicInbox/MsgIter.pm | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/lib/PublicInbox/MsgIter.pm b/lib/PublicInbox/MsgIter.pm
index 6c18d2bf..fa25564a 100644
--- a/lib/PublicInbox/MsgIter.pm
+++ b/lib/PublicInbox/MsgIter.pm
@@ -71,4 +71,14 @@ sub msg_part_text ($$) {
 	($s, $err);
 }
 
+# returns an array of quoted or unquoted sections
+sub split_quotes {
+	# Quiet "Complex regular subexpression recursion limit" warning
+	# in case an inconsiderate sender quotes 32K of text at once.
+	# The warning from Perl is harmless for us since our callers can
+	# tolerate less-than-ideal matches which work within Perl limits.
+	no warnings 'regexp';
+	split(/((?:^>[^\n]*\n)+)/sm, shift);
+}
+
 1;
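To illustrate what the new sub returns, here is a minimal standalone sketch. It copies the body of split_quotes from the diff and runs it on a made-up message body; `$body` and the printing loop are assumptions for illustration only and are not part of public-inbox.

```perl
use strict;
use warnings;

# Same body as the split_quotes added in the diff above.
sub split_quotes {
	# Lexically scoped: only silences 'regexp' warnings inside this sub.
	no warnings 'regexp';
	# The capturing group makes split() return the quoted runs too,
	# interleaved with the unquoted text between them.
	split(/((?:^>[^\n]*\n)+)/sm, shift);
}

# Hypothetical message body, invented for this example.
my $body = "Hi,\n" .
	"> first quoted line\n" .
	"> second quoted line\n" .
	"reply text\n" .
	"> more quote\n";

my @sections = split_quotes($body);

# Label each section by whether it starts with a quote marker.
for my $s (@sections) {
	my $kind = $s =~ /\A>/ ? 'quoted' : 'unquoted';
	print "[$kind] $s";
}
```

Consecutive ">" lines come back as a single quoted section, and joining the sections reproduces the input. If Perl cuts a very long match short, as the commit message describes, one quoted run would presumably come back as several adjacent quoted sections, which is where the redundant <span> tags would come from.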