user/dev discussion of public-inbox itself
From: Eric Wong <e@yhbt.net>
To: meta@public-inbox.org
Subject: [PATCH] doc: start writeup on semi-automatic memory management
Date: Fri, 17 Apr 2020 10:24:45 +0000
Message-ID: <20200417102445.16161-1-e@yhbt.net>

I don't consider Perl's memory management "automatic".  Instead,
having an extra bit of control as a hacker is nice and there's
no need to burden ordinary users with GC tuning knobs.
---
 Documentation/technical/memory.txt | 50 ++++++++++++++++++++++++++++++
 MANIFEST                           |  1 +
 2 files changed, 51 insertions(+)
 create mode 100644 Documentation/technical/memory.txt

diff --git a/Documentation/technical/memory.txt b/Documentation/technical/memory.txt
new file mode 100644
index 00000000..bb1c92fd
--- /dev/null
+++ b/Documentation/technical/memory.txt
@@ -0,0 +1,50 @@
+semi-automatic memory management in public-inbox
+------------------------------------------------
+
+The majority of public-inbox is implemented in Perl 5, a
+language and interpreter not particularly known for being
+memory-efficient.
+
+We strive to keep processes small to improve locality, allow
+the kernel to cache more files, and to be a good neighbor to
+other processes running on the machine.  Taking advantage of
+automatic reference counting (ARC) in Perl allows us to
+deterministically release memory back to the heap.
+
+We start with a simple data model with few circular
+references.  This both eases human understanding and reduces
+the likelihood of bugs.
+
+Knowing the relative sizes and quantities of our data
+structures, we limit the scope of allocations as much as
+possible and keep large allocations shortest-lived.  This
+reduces both the cognitive overhead on humans and the
+memory pressure on the machine.
+
+Short-lived non-immortal closures (aka "anonymous subs") are
+avoided in long-running daemons unless required for
+compatibility with PSGI.  Closures are memory-intensive and
+may make allocation lifetimes less obvious to humans.  They
+are also the source of memory leaks in older versions of
+Perl, including 5.16.3 found in enterprise distros.
+
+We also use Perl's `delete' and `undef' built-ins to drop
+reference counts sooner than scope allows.  These functions
+are required to break the few reference cycles we have that
+would otherwise lead to leaks.
+
+Of note, `undef' may be used in two ways:
+
+1. to free(3) the underlying buffer:
+
+	undef $scalar;
+
+2. to reset a buffer but reduce realloc(3) on subsequent growth:
+
+	$scalar = "";		# useful when appending repeatedly
+	$scalar = undef;	# usually not needed
+
+In the future, our internal data model will be further
+flattened and simplified to reduce the overhead imposed by
+small objects.  Large allocations may also be avoided by
+optionally using Inline::C.
diff --git a/MANIFEST b/MANIFEST
index ba5cc6a4..c007f7d4 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -41,6 +41,7 @@ Documentation/reproducibility.txt
 Documentation/standards.perl
 Documentation/technical/data_structures.txt
 Documentation/technical/ds.txt
+Documentation/technical/memory.txt
 Documentation/technical/whyperl.txt
 Documentation/txt2pre
 HACKING
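
For illustration only, and not part of the patch: a minimal Perl
sketch of the behavior memory.txt describes.  The `Tracked' class
and the variable names are hypothetical and do not exist in
public-inbox; the sketch only assumes stock Perl 5 reference
counting.  It shows an object freed at scope exit, a reference
cycle that outlives its scope until `delete' breaks it, and the
two ways of resetting a scalar.

```perl
use strict;
use warnings;

my @freed;	# names of objects whose DESTROY has run, in order

{
	package Tracked;	# hypothetical class, for illustration only
	sub new { my ($class, $name) = @_; bless { name => $name }, $class }
	sub DESTROY { push @freed, $_[0]->{name} }
}

# refcount hits zero at scope exit, so the object is freed right there
{
	my $tmp = Tracked->new('scoped');
}
my $after_scope = "@freed";	# "scoped"

# a cycle keeps the child alive after its scope ends
my $parent = Tracked->new('parent');
{
	my $child = Tracked->new('child');
	$parent->{child} = $child;
	$child->{parent} = $parent;	# parent <-> child cycle
}
my $in_cycle = "@freed";	# still "scoped", child was not freed

delete $parent->{child};	# break the cycle, child is freed now
my $after_delete = "@freed";	# "scoped child"

undef $parent;			# drop the last reference before scope ends
my $after_undef = "@freed";	# "scoped child parent"

# the two forms of resetting a scalar described in memory.txt
my $buf = 'x' x 65536;
$buf = '';	# empty again; Perl typically keeps the buffer for reuse
$buf .= 'appended';
undef $buf;	# gives up the buffer; $buf is undefined afterwards
```

Whether Perl actually retains the buffer after `$buf = ""' depends
on the interpreter version and build, so the sketch does not
measure it.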
