user/dev discussion of public-inbox itself
* [PATCH] various doc updates ahead of 1.5.0
@ 2020-05-10  6:59 Eric Wong
From: Eric Wong @ 2020-05-10  6:59 UTC
  To: meta

---
 Documentation/RelNotes/v1.5.0.eml           | 40 +++++++++++++++++++--
 Documentation/technical/data_structures.txt | 17 +++++----
 TODO                                        |  7 ++--
 3 files changed, 53 insertions(+), 11 deletions(-)
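
Note on the List-Id items in the release notes below: RFC 2919
list identifiers sit inside angle brackets and compare
case-insensitively.  The following is only a rough Python sketch
of that comparison, not the public-inbox code (which is Perl);
the function names list_ids() and matches_list_id() are made up
for illustration.

```python
import re
from email import message_from_string


def list_ids(raw):
    """Return the lowercased list identifiers found in List-Id headers."""
    msg = message_from_string(raw)
    found = []
    # Header *names* are already looked up case-insensitively by the
    # email package; the identifier itself is normalized here.
    for value in msg.get_all("List-Id", []):
        m = re.search(r"<([^<>]+)>", value)
        if m:
            found.append(m.group(1).strip().lower())
    return found


def matches_list_id(raw, wanted):
    """True if `wanted` equals any List-Id in `raw`, ignoring case."""
    return wanted.strip("<> ").lower() in list_ids(raw)
```

With this sketch, a message carrying
"List-Id: Meta <Meta.Public-Inbox.ORG>" matches a configured value
of "<meta.public-inbox.org>".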

diff --git a/Documentation/RelNotes/v1.5.0.eml b/Documentation/RelNotes/v1.5.0.eml
index c9108c15..a9d8b241 100644
--- a/Documentation/RelNotes/v1.5.0.eml
+++ b/Documentation/RelNotes/v1.5.0.eml
@@ -5,21 +5,57 @@ MIME-Version: 1.0
 Content-Type: text/plain; charset=utf-8
 Content-Disposition: inline
 
+This release introduces a new pure-Perl lazy email parser,
+PublicInbox::Eml, which uses roughly 10% less memory and
+is up to 2x faster than Email::MIME.  This is a major
+internal change.
+
+Limits commonly enforced by MTAs are also enforced in the
+new parser, as messages may bypass MTA transports.
+
+Email::MIME and other Email::* modules are no longer
+dependencies and are not used at all outside of maintainer
+validation tests.
+
 * public-inbox-index
 
   - `--max-size=SIZE' CLI switch and `publicinbox.indexMaxSize'
-     config file option added
+    config file option added to prevent indexing of overly
+    large messages.
+
+  - List-Id headers are indexed in new messages; old messages
+    can be found after `--reindex'.
 
 * public-inbox-watch
 
   - multiple values of `publicinbox.<name>.watchheader' are
-    supported, thanks to Kyle Meyer
+    now supported, thanks to Kyle Meyer
+
+  - List-Id headers are matched case-insensitively as specified
+    by RFC 2919
 
 * PublicInbox::WWW
 
   - $INBOX_DIR/description and $INBOX_DIR/cloneurl are not
     memoized if missing
 
+  - improved display of threads, thanks to Kyle Meyer
+
+  - search for List-Id is available via `l:' prefix if indexed
+
+  - all encodings are preloaded at startup to reduce fragmentation
+
+  - diffstat linkification and highlighting are stricter and
+    less likely to linkify tables in cover letters
+
+  - fix hunk header links to solver which were off-by-one line,
+    thanks again to Kyle Meyer
+
+Release tarball available for download over HTTPS or Tor .onion:
+
+https://yhbt.net/public-inbox.git/snapshot/public-inbox-1.5.0.tar.gz
+http://ou63pmih66umazou.onion/public-inbox.git/snapshot/public-inbox-1.5.0.tar.gz
+
 Please report bugs via plain-text mail to: meta@public-inbox.org
 
 See archives at https://public-inbox.org/meta/ for all history.
diff --git a/Documentation/technical/data_structures.txt b/Documentation/technical/data_structures.txt
index 46d5acff..8776a67b 100644
--- a/Documentation/technical/data_structures.txt
+++ b/Documentation/technical/data_structures.txt
@@ -28,14 +28,13 @@ Outside of tests, this is typically a singleton.
 Per-message classes
 -------------------
 
-* PublicInbox::MIME - Email::MIME subclass
-  Common abbreviation: $mime
+* PublicInbox::Eml - Email::MIME-like class
+  Common abbreviation: $mime, $eml
   Used by: PublicInbox::WWW, PublicInbox::SearchIdx
 
-  An representation of an entire email, multipart or not.  It's
-  a subclass of Email::MIME to workaround bugs in old
-  Email::MIME versions.  An option to use libgmime or libmailutils
-  may be supported in the future for performance and memory use.
+  A representation of an entire email, multipart or not.
+  An option to use libgmime or libmailutils may be supported
+  in the future for performance and memory use.
 
   This can be a memory hog with big messages and giant
   attachments, so our PublicInbox::WWW interface only keeps
@@ -47,6 +46,12 @@ Per-message classes
   Our PublicInbox::V2Writable class may have two objects of this
   type in memory at-a-time for deduplication.
 
+  In public-inbox 1.4 and earlier, Email::MIME and its subclass,
+  PublicInbox::MIME, were used.  Despite still slurping,
+  PublicInbox::Eml is faster and uses less memory due to
+  lazy header parsing and lazy subpart instantiation with
+  shorter object lifetimes.
+
 * PublicInbox::Smsg - small message skeleton
   Used by: PublicInbox::{NNTP,WWW,SearchIdx}
   Common abbreviation: $smsg
diff --git a/TODO b/TODO
index 4c4e8e00..16de36bf 100644
--- a/TODO
+++ b/TODO
@@ -42,6 +42,7 @@ all need to be considered for everything we introduce)
   while retaining compatibility with old versions.
 
 * Support more of RFC 3977 (NNTP)
+  Is there anything left for read-only support?
 
 * Combined "super server" for NNTP/HTTP/POP3 to reduce memory overhead
 
@@ -75,9 +76,9 @@ all need to be considered for everything we introduce)
 * linkify thread skeletons better
   https://public-inbox.org/git/6E3699DEA672430CAEA6DEFEDE6918F4@PhilipOakley/
 
-* low-memory Email::MIME replacement: currently we generate many
-  allocations/strings for headers we never look at and slurp
-  entire message bodies into memory.  GMime+Inline::C could work.
+* Further lower mail parser memory usage.  We still slurp entire
+  message bodies into memory and incur 2-3x overhead on
+  multipart messages.  Inline::C (and maybe gmime) could work.
 
 * use REQUEST_URI properly for CGI / mod_perl2 compatibility
   with Message-IDs which include '%' (done?)
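
The data_structures.txt hunk above credits lazy header parsing for
part of the speed and memory gains.  As a rough illustration of the
idea only (this is not PublicInbox::Eml, which is Perl; the class
name LazyMessage and its parse_count counter are hypothetical), a
message object can hold the raw header block and parse it on first
access:

```python
from email.parser import HeaderParser


class LazyMessage:
    """Keeps the raw header block and parses it only when first needed."""

    def __init__(self, raw):
        # Like the parser described above, this still slurps the whole
        # message; only the header parsing is deferred.
        # Assumes LF line endings; CRLF input is not handled here.
        head, _sep, body = raw.partition("\n\n")
        self._raw_head = head
        self.body = body
        self._headers = None
        self.parse_count = 0  # illustration only: counts real parses

    def _parsed(self):
        if self._headers is None:
            self._headers = HeaderParser().parsestr(self._raw_head + "\n\n")
            self.parse_count += 1
        return self._headers

    def header(self, name):
        """Return the first value of header `name`, or None if absent."""
        return self._parsed().get(name)
```

A caller that only ever touches .body never pays for header
parsing, and repeated header() calls reuse the one parsed result.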


Archives are clonable:
	git clone --mirror http://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git