author: Eric Wong <e@80x24.org> 2021-01-03 02:06:15 +0000
committer: Eric Wong <e@80x24.org> 2021-01-03 18:30:31 +0000
commit: 71461c67fee940b05309baa8c67bac10c8c51ac6
tree: 07ab30ed55e4bd62ab2022167e14e0ae09bb43ad /lib/PublicInbox/Smsg.pm
parent: 323d8bac125e89a76c904a54a7ae0b2e36f05cc6

We don't need to be keeping the raw message around after it hits
git.  Shard work now relies on Storable (or Sereal) and all of
the indexing code relies on the Email::MIME-like API of Eml to
access interesting parts of the message.

Similarly, smsg->{raw_bytes} is no longer carried around and we
do the CRLF adjustment when setting smsg->{bytes}.

There's also a small simplification to t/import.t while
we're in the area to use xqx instead of spawn/popen_rd.

Diffstat (limited to 'lib/PublicInbox/Smsg.pm')
-rw-r--r-- lib/PublicInbox/Smsg.pm | 13
1 file changed, 13 insertions, 0 deletions
diff --git a/lib/PublicInbox/Smsg.pm b/lib/PublicInbox/Smsg.pm
index 571cbb6f..c6ff7f52 100644
--- a/lib/PublicInbox/Smsg.pm
+++ b/lib/PublicInbox/Smsg.pm
@@ -135,4 +135,17 @@ sub subject_normalized ($) {
         $subj;
 }
 
+# returns the number of bytes to add if given a non-CRLF arg
+sub crlf_adjust ($) {
+        if (index($_[0], "\r\n") < 0) {
+                # common case is LF-only, every \n needs an \r;
+                # so favor a cheap tr// over an expensive m//g
+                $_[0] =~ tr/\n/\n/;
+        } else { # count number of '\n' w/o '\r', expensive:
+                scalar(my @n = ($_[0] =~ m/(?<!\r)\n/g));
+        }
+}
+
+sub set_bytes { $_[0]->{bytes} = $_[2] + crlf_adjust($_[1]) }
+
 1;
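
A minimal standalone sketch of how the two new subs behave, for readers who want to try them outside of public-inbox. crlf_adjust is copied from the hunk above. set_bytes has the same body but is called here as a plain function on an unblessed hashref, which is only a stand-in for a real PublicInbox::Smsg object. The sample strings and variable names are illustrative assumptions, not taken from the test suite.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copied from the hunk above so this example runs on its own.
sub crlf_adjust ($) {
	if (index($_[0], "\r\n") < 0) {
		# LF-only input: each "\n" will need one "\r" added
		$_[0] =~ tr/\n/\n/;
	} else {
		# some CRLF present: count only "\n" not preceded by "\r"
		scalar(my @n = ($_[0] =~ m/(?<!\r)\n/g));
	}
}

# Same body as set_bytes in the hunk; args are (object, raw message, byte length).
sub set_bytes { $_[0]->{bytes} = $_[2] + crlf_adjust($_[1]) }

# Illustrative inputs (assumptions for this sketch):
my $lf    = "a\nb\n";        # two bare LFs, no CRLF
my $crlf  = "a\r\nb\r\n";    # already CRLF throughout
my $mixed = "a\r\nb\nc\n";   # one CRLF, two bare LFs

my @adj = (crlf_adjust($lf), crlf_adjust($crlf), crlf_adjust($mixed));
print "@adj\n";              # prints "2 0 2"

my $smsg = {};               # hypothetical stand-in for an Smsg object
set_bytes($smsg, $mixed, length($mixed));
print "$smsg->{bytes}\n";    # 7 bytes as stored + 2 missing CRs: prints "9"
```

The stored {bytes} value is therefore the size the message would have if every line ended in CRLF, computed without keeping a converted copy of the message around.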