author Eric Wong <e@yhbt.net> 2020-05-09 09:09:00 +0000
committer Eric Wong <e@yhbt.net> 2020-05-10 07:00:16 +0000
commit cc5d9ec286f758de07b57087cfd537759b93dabe
tree 3ee066360df6335df8000efb144895b703db3dcb
parent 8b44e99ec009508d7e050ee44d34a1cf0f111dd5
-rw-r--r-- Documentation/RelNotes/v1.5.0.eml           | 40
-rw-r--r-- Documentation/technical/data_structures.txt | 17
-rw-r--r-- TODO                                        |  7
3 files changed, 53 insertions, 11 deletions
diff --git a/Documentation/RelNotes/v1.5.0.eml b/Documentation/RelNotes/v1.5.0.eml
index c9108c15..a9d8b241 100644
--- a/Documentation/RelNotes/v1.5.0.eml
+++ b/Documentation/RelNotes/v1.5.0.eml
@@ -5,21 +5,57 @@ MIME-Version: 1.0
 Content-Type: text/plain; charset=utf-8
 Content-Disposition: inline
 
+This release introduces a new pure-Perl lazy email parser,
+PublicInbox::Eml, which uses roughly 10% less memory and
+is up to 2x faster than Email::MIME.  This is a major
+internal change.
+
+Limits commonly enforced by MTAs are also enforced in the
+new parser, as messages may bypass MTA transports.
+
+Email::MIME and other Email::* modules are no longer
+dependencies nor used at all outside of maintainer validation
+tests.
+
 * public-inbox-index
 
   - `--max-size=SIZE' CLI switch and `publicinbox.indexMaxSize'
-     config file option added
+    config file option added to prevent indexing of overly
+    large messages.
+
+  - List-Id headers are indexed in new messages, old messages
+    can be found after `--reindex'.
 
 * public-inbox-watch
 
   - multiple values of `publicinbox.<name>.watchheader' are
-    supported, thanks to Kyle Meyer
+    now supported, thanks to Kyle Meyer
+
+  - List-Id headers are matched case-insensitively as specified
+    by RFC 2919
 
 * PublicInbox::WWW
 
   - $INBOX_DIR/description and $INBOX_DIR/cloneurl are not
     memoized if missing
 
+  - improved display of threads, thanks to Kyle Meyer
+
+  - search for List-Id is available via `l:' prefix if indexed
+
+  - all encodings are preloaded at startup to reduce fragmentation
+
+  - diffstat linkification and highlighting are stricter and
+    less likely to linkify tables in cover letters
+
+  - fix hunk header links to solver which were off-by-one line,
+    thanks again to Kyle Meyer
+
+Release tarball available for download over HTTPS or Tor .onion:
+
+https://yhbt.net/public-inbox.git/snapshot/public-inbox-1.5.0.tar.gz
+http://ou63pmih66umazou.onion/public-inbox.git/snapshot/public-inbox-1.5.0.tar.gz
+
 Please report bugs via plain-text mail to: meta@public-inbox.org
 
 See archives at https://public-inbox.org/meta/ for all history.
diff --git a/Documentation/technical/data_structures.txt b/Documentation/technical/data_structures.txt
index 46d5acff..8776a67b 100644
--- a/Documentation/technical/data_structures.txt
+++ b/Documentation/technical/data_structures.txt
@@ -28,14 +28,13 @@ Outside of tests, this is typically a singleton.
 Per-message classes
 -------------------
 
-* PublicInbox::MIME - Email::MIME subclass
-  Common abbreviation: $mime
+* PublicInbox::Eml - Email::MIME-like class
+  Common abbreviation: $mime, $eml
   Used by: PublicInbox::WWW, PublicInbox::SearchIdx
 
-  An representation of an entire email, multipart or not.  It's
-  a subclass of Email::MIME to workaround bugs in old
-  Email::MIME versions.  An option to use libgmime or libmailutils
-  may be supported in the future for performance and memory use.
+  A representation of an entire email, multipart or not.
+  An option to use libgmime or libmailutils may be supported
+  in the future for performance and memory use.
 
   This can be a memory hog with big messages and giant
   attachments, so our PublicInbox::WWW interface only keeps
@@ -47,6 +46,12 @@ Per-message classes
   Our PublicInbox::V2Writable class may have two objects of this
   type in memory at-a-time for deduplication.
 
+  In public-inbox 1.4 and earlier, Email::MIME and its subclass,
+  PublicInbox::MIME were used.  Despite still slurping,
+  PublicInbox::Eml is faster and uses less memory due to
+  lazy header parsing and lazy subpart instantiation with
+  shorter object lifetimes.
+
 * PublicInbox::Smsg - small message skeleton
   Used by: PublicInbox::{NNTP,WWW,SearchIdx}
   Common abbreviation: $smsg
diff --git a/TODO b/TODO
index 4c4e8e00..16de36bf 100644
--- a/TODO
+++ b/TODO
@@ -42,6 +42,7 @@ all need to be considered for everything we introduce)
   while retaining compatibility with old versions.
 
 * Support more of RFC 3977 (NNTP)
+  Is there anything left for read-only support?
 
 * Combined "super server" for NNTP/HTTP/POP3 to reduce memory overhead
 
@@ -75,9 +76,9 @@ all need to be considered for everything we introduce)
 * linkify thread skeletons better
   https://public-inbox.org/git/6E3699DEA672430CAEA6DEFEDE6918F4@PhilipOakley/
 
-* low-memory Email::MIME replacement: currently we generate many
-  allocations/strings for headers we never look at and slurp
-  entire message bodies into memory.  GMime+Inline::C could work.
+* Further lower mail parser memory usage.  We still slurp entire
+  message bodies into memory and incur 2-3x overhead on
+  multipart messages.  Inline::C (and maybe gmime) could work.
 
 * use REQUEST_URI properly for CGI / mod_perl2 compatibility
   with Message-IDs which include '%' (done?)