* [PATCH 5/5] Fix some typos/grammar/errors in docs and comments
From: Štěpán Němec @ 2023-08-28 10:42 UTC
To: meta
---
Please note the FIXME added in this patch (in
Documentation/public-inbox-daemon.pod, the paragraph on well-known TCP
ports): I lacked the confidence to repair that paragraph on my own.
Documentation/RelNotes/v2.0.0.wip | 2 +-
Documentation/dc-dlvr-spam-flow.txt | 2 +-
Documentation/design_notes.txt | 10 ++++----
Documentation/design_www.txt | 12 ++++-----
Documentation/lei.pod | 2 +-
Documentation/public-inbox-config.pod | 10 ++++----
Documentation/public-inbox-daemon.pod | 20 ++++++++-------
Documentation/public-inbox-glossary.pod | 6 ++---
Documentation/public-inbox-learn.pod | 4 +--
Documentation/public-inbox-purge.pod | 4 +--
Documentation/public-inbox-tuning.pod | 12 ++++-----
Documentation/public-inbox-v2-format.pod | 6 ++---
Documentation/public-inbox-watch.pod | 4 +--
Documentation/reproducibility.txt | 4 +--
Documentation/standards.perl | 4 +--
Documentation/technical/data_structures.txt | 28 ++++++++++-----------
Documentation/technical/ds.txt | 6 ++---
Documentation/technical/memory.txt | 2 +-
Documentation/technical/whyperl.txt | 20 +++++++--------
HACKING | 14 +++++------
INSTALL | 4 +--
README | 16 ++++++------
TODO | 6 ++---
ci/README | 2 +-
ci/profiles.sh | 2 +-
devel/README | 2 +-
examples/varnish-4.vcl | 2 +-
lib/PublicInbox/DS.pm | 4 +--
lib/PublicInbox/Daemon.pm | 2 +-
sa_config/README | 4 +--
script/public-inbox-mda | 4 +--
scripts/README | 2 +-
32 files changed, 111 insertions(+), 111 deletions(-)
diff --git a/Documentation/RelNotes/v2.0.0.wip b/Documentation/RelNotes/v2.0.0.wip
index cccf11ae587d..40c87169ccd9 100644
--- a/Documentation/RelNotes/v2.0.0.wip
+++ b/Documentation/RelNotes/v2.0.0.wip
@@ -60,7 +60,7 @@
* fix `lei q -tt' on locally-indexed messages (still broken for remotes:
https://public-inbox.org/meta/20230226170931.M947721@dcvr/ )
- * `lei import' now set labels+keywords consistently on all
+ * `lei import' now sets labels+keywords consistently on all
already-imported messages
solver (used by lei (rediff|blob), and PublicInbox::WWW)
diff --git a/Documentation/dc-dlvr-spam-flow.txt b/Documentation/dc-dlvr-spam-flow.txt
index d151d272d0ae..6210fc7dcff4 100644
--- a/Documentation/dc-dlvr-spam-flow.txt
+++ b/Documentation/dc-dlvr-spam-flow.txt
@@ -39,7 +39,7 @@ delivery path as well as removing the message from the git tree.
* incron - run commands based on filesystem events: http://incron.aiken.cz/
-* sendmail / MTA - we use and recommend use postfix, which includes a
+* sendmail / MTA - we use and recommend postfix, which includes a
sendmail-compatible wrapper: http://www.postfix.org/
* spamc / spamd - SpamAssassin: http://spamassassin.apache.org/
diff --git a/Documentation/design_notes.txt b/Documentation/design_notes.txt
index 3df5af3e3cf2..95f025560c9e 100644
--- a/Documentation/design_notes.txt
+++ b/Documentation/design_notes.txt
@@ -52,15 +52,15 @@ Why email?
There is no need to ask the NSA for backups of your mail archives :)
* git, one of the most widely-used version control systems, includes many
- tools for for email, including: git-format-patch(1), git-send-email(1),
+ tools for email, including: git-format-patch(1), git-send-email(1),
git-am(1), git-imap-send(1). Furthermore, the development of git itself
is based on the git mailing list: https://public-inbox.org/git/
(or
http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/git/
- for Tor users)
+ for Tor users).
* Email is already the de-facto form of communication in many Free Software
- communities..
+ communities.
* Fallback/transition to private email and other lists, in case the
public-inbox host becomes unavailable, users may still directly email
@@ -76,13 +76,13 @@ Why git?
* As of 2016, git is widely used and known to nearly all Free Software
developers. For non-developers it is packaged for all major GNU/Linux
- and *BSD distributions. NNTP is not as widely-used nowadays, and
+ and *BSD distributions. NNTP is not as widely used nowadays, and
most IMAP clients do not have good support for read-only mailboxes.
Why perl 5?
-----------
-* Perl 5 is widely available on modern *nix systems with good a history
+* Perl 5 is widely available on modern *nix systems, with a good history
of backwards and forward compatibility.
* git and SpamAssassin both use it, so it should be one less thing for
diff --git a/Documentation/design_www.txt b/Documentation/design_www.txt
index b1f916ddb369..68488b1fa253 100644
--- a/Documentation/design_www.txt
+++ b/Documentation/design_www.txt
@@ -102,7 +102,7 @@ We also set <title> to make window management easier.
We favor <pre>-formatted text since public-inbox is intended as a place
to share and discuss patches and code. Unfortunately, long paragraphs
-tends to be less readable with fixed-width serif fonts which GUI
+tend to be less readable with fixed-width serif fonts which GUI
browsers default to.
* No graphics, images, or icons at all. We tolerate, but do not
@@ -122,12 +122,12 @@ browsers default to.
avoided as they do not render well with some displays or user-chosen
fonts.
-* No JavaScript. JS is historically too buggy and insecure, and we will
+* No JavaScript. JS is historically too buggy and insecure, and we will
never expect our readers to do either of the following:
- a) read and audit all our code for on every single page load
- b) trust us and and run code without reading it
+ a) read and audit all our code on every single page load
+ b) trust us and run code without reading it
-* We only use CSS for one reason: wrapping pre-formatted text
+* We only use CSS for one reason: wrapping pre-formatted text.
This is necessary because unfortunate GUI browsers tend to be
prone to layout widening from unwrapped mailers.
Do not expect CSS to be enabled, especially with scary things like:
@@ -141,4 +141,4 @@ CSS classes (for user-supplied CSS)
-----------------------------------
See examples in contrib/css/ and lib/PublicInbox/WwwText.pm
-(or https://public-inbox.org/meta/_/text/color/ soon)
+(or <https://public-inbox.org/meta/_/text/color/>)
diff --git a/Documentation/lei.pod b/Documentation/lei.pod
index f01f506af359..2b10f4906e1a 100644
--- a/Documentation/lei.pod
+++ b/Documentation/lei.pod
@@ -126,7 +126,7 @@ Other subcommands include
=head1 FILES
-By default storage is located at C<$XDG_DATA_HOME/lei/store>. The
+By default, storage is located at C<$XDG_DATA_HOME/lei/store>. The
configuration for lei resides at C<$XDG_CONFIG_HOME/lei/config>.
=head1 ERRORS
diff --git a/Documentation/public-inbox-config.pod b/Documentation/public-inbox-config.pod
index d175d2d74726..d2389abceb0e 100644
--- a/Documentation/public-inbox-config.pod
+++ b/Documentation/public-inbox-config.pod
@@ -191,7 +191,7 @@ Default: :all
The local path name of a CSS file for the PSGI web interface.
May contain the attributes "media", "title" and "href" which match
the associated attributes of the HTML <style> tag.
-"href" may be specified to point to the URL of an remote CSS file
+"href" may be specified to point to the URL of a remote CSS file
and the path may be "/dev/null" or any empty file.
Multiple files may be specified and will be included in the
order specified.
@@ -291,10 +291,10 @@ Default: /var/www/htdocs/cgit/cgit.cgi or /usr/lib/cgit/cgit.cgi
=item publicinbox.cgitdata
A path to the data directory used by cgit for storing static files.
-Typically guessed based the location of C<cgit.cgi> (from
-C<publicinbox.cgitbin>, but may be overridden.
+Typically guessed based on the location of C<cgit.cgi> (from
+C<publicinbox.cgitbin>), but may be overridden.
-Default: basename of C<publicinbox.cgitbin>, /var/www/htdocs/cgit/
+Default: dirname of C<publicinbox.cgitbin>, /var/www/htdocs/cgit/
or /usr/share/cgit/
=item publicinbox.cgit
@@ -311,7 +311,7 @@ Try using C<cgit> as the first choice, this is the default.
=item * fallback
Fall back to using C<cgit> only if our native, inbox-aware
-git code repository viewer doesn't recognized the URL.
+git code repository viewer doesn't recognize the URL.
=item * rewrite
diff --git a/Documentation/public-inbox-daemon.pod b/Documentation/public-inbox-daemon.pod
index 7121683325c7..c5c88bdd04fa 100644
--- a/Documentation/public-inbox-daemon.pod
+++ b/Documentation/public-inbox-daemon.pod
@@ -101,6 +101,8 @@ Default: 1
The default TLS certificate for HTTPS, IMAPS, NNTPS, POP3S and/or STARTTLS
support if the C<cert> option is not given with C<--listen>.
+=for comment FIXME this paragraph needs repair
+
Well-known TCP ports automatically get TLS or STARTTLS support
If using systemd-compatible socket activation and a TCP listener
on port well-known ports (563 is inherited, it is automatically
@@ -112,15 +114,15 @@ STARTTLS support.
The default TLS certificate key for the default C<--cert> or
per-listener C<cert=> option. The private key may be
-concatenated into the path used by the cert, in which case this
+concatenated into the cert file itself, in which case this
option is not needed.
=item --multi-accept INTEGER
-By default, each worker accepts one connection at-a-time to maximize
+By default, each worker accepts one connection at a time to maximize
fairness and minimize contention across multiple processes on a
shared listen socket. Accepting multiple connections at once may be
-useful in constrained deployments with few, heavily-loaded workers.
+useful in constrained deployments with few, heavily loaded workers.
Negative values enables a worker to accept all available clients at
once, possibly starving others in the process. C<-1> behaves like
C<multi_accept yes> in nginx; while C<0> (the default) is
@@ -137,7 +139,7 @@ Default: 0
=head1 SIGNALS
Most of our signal handling behavior is copied from L<nginx(8)>
-and/or L<starman(1)>; so it is possible to reuse common scripts
+and/or L<starman(1)>, so it is possible to reuse common scripts
for managing them.
=over 8
@@ -158,7 +160,7 @@ Reload config files associated with the process.
=item SIGTTIN
-Increase the number of running workers processes by one.
+Increase the number of running worker processes by one.
=item SIGTTOU
@@ -166,7 +168,7 @@ Decrease the number of running worker processes by one.
=item SIGWINCH
-Stop all running worker processes. SIGHUP or SIGTTIN
+Stop all running worker processes. SIGHUP or SIGTTIN
may be used to restart workers.
=item SIGQUIT
@@ -194,7 +196,7 @@ activation. See L<systemd.socket(5)> and L<sd_listen_fds(3)>.
=item PERL_INLINE_DIRECTORY
-Pointing this to point to a writable directory enables the use
+Pointing this to a writable directory enables the use
of L<Inline> and L<Inline::C> extensions which may provide
platform-specific performance improvements. Currently, this
enables the use of L<vfork(2)> which speeds up subprocess
@@ -211,8 +213,8 @@ created by a user. See L<Inline> and L<Inline::C> for more details.
There are two ways to upgrade a running process.
Users of process management systems with socket activation
-(L<systemd(1)> or similar) may rely on multiple instances For
-systemd, this means using two (or more) '@' instances for each
+(L<systemd(1)> or similar) may rely on multiple daemon instances.
+For systemd, this means using two (or more) '@' instances for each
service (e.g. C<SERVICENAME@INSTANCE>) as documented in
L<systemd.unit(5)>.
diff --git a/Documentation/public-inbox-glossary.pod b/Documentation/public-inbox-glossary.pod
index 3c9e2bd21283..d88539c8b0fb 100644
--- a/Documentation/public-inbox-glossary.pod
+++ b/Documentation/public-inbox-glossary.pod
@@ -25,7 +25,7 @@ C<over.sqlite3>
=item tid, THREADID
-A sequentially-assigned positive integer. These integers are
+A sequentially assigned positive integer. These integers are
per-inbox or per-extindex. In the future, this may be prefixed
with C<T> for JMAP (RFC 8621) and RFC 8474. This may not be
strictly compliant with RFC 8621 since inboxes and extindices
@@ -40,7 +40,7 @@ RFC-(822|2822|5322) email message.
=item IMAP EMAILID, JMAP Email Id
-To-be-decided. This will likely be the git blob ID prefixed with C<g>
+To be decided. This will likely be the git blob ID prefixed with C<g>
rather than the numeric UID to accommodate the same blob showing
up in both an extindex and inbox (or multiple extindices).
@@ -87,7 +87,7 @@ but it imports drafts.
For L<lei(1)> users only. This will allow lei users to place
the same email into one or more virtual folders for
-ease-of-filtering. This is NOT tied to public-inbox names, as
+ease of filtering. This is NOT tied to public-inbox names, as
messages stored by lei may not be public.
These are similar in spirit to arbitrary freeform "tags"
diff --git a/Documentation/public-inbox-learn.pod b/Documentation/public-inbox-learn.pod
index 3c92b1cc698b..f776df6b2bb0 100644
--- a/Documentation/public-inbox-learn.pod
+++ b/Documentation/public-inbox-learn.pod
@@ -54,7 +54,7 @@ This is similar to the C<spam> command above, but does
not feed the message to L<spamc(1)> and only removes messages
which match on any of the C<To:>, C<Cc:>, and C<List-ID:> headers.
-The C<--all> option may be used match C<spam> semantics in removing
+The C<--all> option may be used to match C<spam> semantics in removing
the message from all configured inboxes. C<--all> is only
available in public-inbox 1.6.0+.
@@ -82,7 +82,7 @@ L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
=head1 COPYRIGHT
-Copyright 2019-2021 all contributors L<mailto:meta@public-inbox.org>
+Copyright all contributors L<mailto:meta@public-inbox.org>
License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
diff --git a/Documentation/public-inbox-purge.pod b/Documentation/public-inbox-purge.pod
index 945286c69f97..1223b5775828 100644
--- a/Documentation/public-inbox-purge.pod
+++ b/Documentation/public-inbox-purge.pod
@@ -31,7 +31,7 @@ leads to discontiguous git history.
=item --all
Purge the message in all inboxes configured in ~/.public-inbox/config.
-This is an alternative to specifying individual inboxes directories
+This is an alternative to specifying individual inbox directories
on the command-line.
=back
@@ -74,7 +74,7 @@ L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
=head1 COPYRIGHT
-Copyright 2019-2021 all contributors L<mailto:meta@public-inbox.org>
+Copyright all contributors L<mailto:meta@public-inbox.org>
License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
diff --git a/Documentation/public-inbox-tuning.pod b/Documentation/public-inbox-tuning.pod
index 53668eccb7cb..58a4d9bcbabd 100644
--- a/Documentation/public-inbox-tuning.pod
+++ b/Documentation/public-inbox-tuning.pod
@@ -79,8 +79,8 @@ RAM. Attempts to parallelize random I/O on HDDs leads to pathological
slowdowns as inboxes grow.
While C<-V2> introduced Xapian shards as a parallelization
-mechanism for SSDs; enabling C<publicInbox.indexSequentialShard>
-repurposes sharding as mechanism to reduce the kernel page cache
+mechanism for SSDs, enabling C<publicInbox.indexSequentialShard>
+repurposes sharding as a mechanism to reduce the kernel page cache
footprint when indexing on HDDs.
Initializing a mirror with a high C<--jobs> count to create more
@@ -108,7 +108,7 @@ indices on btrfs to achieve acceptable performance (even on SSD).
Disabling copy-on-write also disables checksumming, thus C<raid1>
(or higher) configurations may be corrupt after unsafe shutdowns.
-Fortunately, these SQLite and Xapian indices are designed to
+Fortunately, these SQLite and Xapian indices are designed to be
recoverable from git if missing.
Disabling CoW does not prevent all fragmentation. Large values
@@ -125,7 +125,7 @@ C<btrfs filesystem defragment -fr $INBOX_DIR> may be necessary.
Large filesystems benefit significantly from the C<space_cache=v2>
mount option documented in L<btrfs(5)>.
-Older, non-CoW filesystems are generally work well out-of-the-box
+Older, non-CoW filesystems generally work well out of the box
for our Xapian and SQLite indices.
=head2 Performance on solid state drives
@@ -152,7 +152,7 @@ C<LimitNOFILE=> in L<systemd.exec(5)>) may need to be raised to
accommodate many concurrent clients.
Transport Layer Security (IMAPS, NNTPS, or via STARTTLS) significantly
-increases memory use of client sockets, sure to account for that in
+increases memory use of client sockets, be sure to account for that in
capacity planning.
=head2 Other OS tuning knobs
@@ -168,7 +168,7 @@ Other OSes may have similar tuning knobs (patches appreciated).
L<public-inbox-extindex(1)> allows any number of public-inboxes
to share the same Xapian indices.
-git 2.33+ startup time is orders-of-magnitude faster and uses
+git 2.33+ startup time is orders of magnitude faster and uses
less memory when dealing with thousands of alternates required
for thousands of inboxes with L<public-inbox-extindex(1)>.
diff --git a/Documentation/public-inbox-v2-format.pod b/Documentation/public-inbox-v2-format.pod
index e93d7fc701d9..de3b0bfd390f 100644
--- a/Documentation/public-inbox-v2-format.pod
+++ b/Documentation/public-inbox-v2-format.pod
@@ -30,7 +30,7 @@ databases for parallelism by "shards".
- all.git # empty, alternates to $EPOCH.git
- xap$SCHEMA_VERSION/$SHARD # per-shard Xapian DB
- xap$SCHEMA_VERSION/over.sqlite3 # OVER-view DB for NNTP, threading
- - msgmap.sqlite3 # same the v1 msgmap
+ - msgmap.sqlite3 # same as the v1 msgmap
For blob lookups, the reader only needs to open the "all.git"
repository with $GIT_DIR/objects/info/alternates which references
@@ -89,7 +89,7 @@ After-the-fact invocations of L<public-inbox-index> will ignore
messages written to 'd' after they are written to 'm'.
Deltafication is not significantly improved over v1, but overall
-storage for trees is made as as small as possible. Initial
+storage for trees is made as small as possible. Initial
statistics and benchmarks showing the benefits of this approach
are documented at:
@@ -97,7 +97,7 @@ L<https://public-inbox.org/meta/20180209205140.GA11047@dcvr/>
=head2 XAPIAN SHARDS
-Another second scalability problem in v1 was the inability to
+Another scalability problem in v1 was the inability to
utilize multiple CPU cores for Xapian indexing. This is
addressed by using shards in Xapian to perform import
indexing in parallel.
diff --git a/Documentation/public-inbox-watch.pod b/Documentation/public-inbox-watch.pod
index e8f97c8088c9..febda0b13df4 100644
--- a/Documentation/public-inbox-watch.pod
+++ b/Documentation/public-inbox-watch.pod
@@ -41,7 +41,7 @@ importing them into public-inbox git repositories and indices.
public-inbox-watch is useful in situations when a user wishes to
mirror an existing mailing list, but has no access to run
L<public-inbox-mda(1)> on a server. Unlike public-inbox-mda
-which is invoked once per-message, public-inbox-watch is a
+which is invoked once per message, public-inbox-watch is a
persistent process, making it faster for after-the-fact imports
of large Maildirs.
@@ -62,7 +62,7 @@ public-inbox-watch takes no command-line options.
=head1 CONFIGURATION
These configuration knobs should be used in the
-L<public-inbox-config(5)> file
+L<public-inbox-config(5)> file.
=over 8
diff --git a/Documentation/reproducibility.txt b/Documentation/reproducibility.txt
index 4e56ada48bb2..3336de731a4d 100644
--- a/Documentation/reproducibility.txt
+++ b/Documentation/reproducibility.txt
@@ -12,7 +12,7 @@ reproducible.
Keeping all communications as email ensures the full history
of the entire project can be mirrored by anyone with the
resources to do so. Compact, low-complexity data requires
-less resources to mirror, so sticking with plain-text
+less resources to mirror, so sticking with plain text
ensures more parties can mirror and potentially fork the
project with all its data.
@@ -26,4 +26,4 @@ If these things make power hungry project leaders and admins
uncomfortable, good. That was the point. It's how checks
and balances ought to work.
-Comments, corrections, etc welcome: meta@public-inbox.org
+Comments, corrections, etc. welcome: meta@public-inbox.org
diff --git a/Documentation/standards.perl b/Documentation/standards.perl
index c36afb5d718b..743cdee1ce24 100755
--- a/Documentation/standards.perl
+++ b/Documentation/standards.perl
@@ -11,11 +11,11 @@ Non-exhaustive list of standards public-inbox software attempts or
intends to implement. This list is intended to be a quick reference
for hackers and users.
-Given the goals of interoperability and accessibility; strict
+Given the goals of interoperability and accessibility, strict
conformance to standards is not always possible, but rather
best-effort taking into account real-world cases. In particular,
"obsolete" standards remain relevant as long as clients and
-data exists.
+data using them exist.
IETF RFCs
---------
diff --git a/Documentation/technical/data_structures.txt b/Documentation/technical/data_structures.txt
index 4dcf9ce609be..5ed21882b9f8 100644
--- a/Documentation/technical/data_structures.txt
+++ b/Documentation/technical/data_structures.txt
@@ -32,19 +32,19 @@ Per-message classes
Common abbreviation: $mime, $eml
Used by: PublicInbox::WWW, PublicInbox::SearchIdx
- An representation of an entire email, multipart or not.
+ A representation of an entire email, multipart or not.
An option to use libgmime or libmailutils may be supported
in the future for performance and memory use.
This can be a memory hog with big messages and giant
attachments, so our PublicInbox::WWW interface only keeps
- one object of this class in memory at-a-time.
+ one object of this class in memory at a time.
In other words, this is the "meat" of the message, whereas
$smsg (below) is just the "skeleton".
Our PublicInbox::V2Writable class may have two objects of this
- type in memory at-a-time for deduplication.
+ type in memory at a time for deduplication.
In public-inbox 1.4 and earlier, Email::MIME and its subclass,
PublicInbox::MIME were used. Despite still slurping,
@@ -61,10 +61,10 @@ Per-message classes
This is loaded from either the overview DB (over.sqlite3) or
the Xapian DB (docdata.glass), though the Xapian docdata
- is won't hold NNTP-only fields (Cc:/To:)
+ won't hold NNTP-only fields (Cc:/To:).
There may be hundreds or thousands of these objects in memory
- at-a-time, so fields are pruned if unneeded.
+ at a time, so fields are pruned if unneeded.
* PublicInbox::SearchThread::Msg - subclass of Smsg
Common abbreviation: $cont or $node
@@ -75,9 +75,9 @@ Per-message classes
Nowadays, this is a re-blessed $smsg with additional fields.
As with $smsg objects, there may be hundreds or thousands
- of these objects in memory at-a-time.
+ of these objects in memory at a time.
- We also do not use a linked-list for storing children as JWZ
+ We also do not use a linked list for storing children as JWZ
describes, but instead a Perl hashref for {children} which
becomes an arrayref upon sorting.
@@ -88,7 +88,7 @@ Per-inbox classes
* PublicInbox::Inbox - represents a single public-inbox
Common abbreviation: $ibx
- Used everywhere
+ Used everywhere.
This represents a "publicinbox" section in the config
file, see public-inbox-config(5) for details.
@@ -152,7 +152,7 @@ ad-hoc structures shared across packages
This holds the PSGI $env as well as any internal variables
used by various modules of PublicInbox::WWW.
- As with the PSGI $env, there is one per-active WWW
+ As with the PSGI $env, there is one per active WWW
request+response cycle. It does not exist for idle HTTP
clients.
@@ -174,8 +174,8 @@ daemon classes
Common abbreviation: $http
Used by: PublicInbox::DS, public-inbox-httpd
- Unlike PublicInbox::NNTP, this class no knowledge of any of
- the email or git-specific parts of public-inbox, only PSGI.
+ Unlike PublicInbox::NNTP, this class has no knowledge of any of
+ the email- or git-specific parts of public-inbox, only PSGI.
However, it supports APIs and behaviors (e.g. streaming large
responses) which PublicInbox::WWW may take advantage of.
@@ -188,7 +188,7 @@ daemon classes
This class calls non-blocking accept(2) or accept4(2) on a
listen socket to create new PublicInbox::HTTP and
- PublicInbox::HTTP instances.
+ PublicInbox::NNTP instances.
* PublicInbox::HTTPD
Common abbreviation: $httpd
@@ -197,9 +197,9 @@ daemon classes
wrappers around client sockets accepted from
PublicInbox::Listener.
- Since the SERVER_NAME and SERVER_PORT PSGI variables needs to be
+ Since the SERVER_NAME and SERVER_PORT PSGI variables need to be
exposed for HTTP/1.0 requests when Host: headers are missing,
- this is per-Listener socket.
+ this is per Listener socket.
* PublicInbox::HTTPD::Async
Common abbreviation: $async
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
index 4cfb62fe44c8..afead2f155e0 100644
--- a/Documentation/technical/ds.txt
+++ b/Documentation/technical/ds.txt
@@ -19,7 +19,7 @@ Most notably:
triggers a call.
The lack of read/write callback distinction is driven by the
- fact TLS libraries (e.g. OpenSSL via IO::Socket::SSL) may
+ fact that TLS libraries (e.g. OpenSSL via IO::Socket::SSL) may
declare SSL_WANT_READ on SSL_write(), and SSL_WANT_READ on
SSL_read(). So we end up having to let each user object decide
whether it wants to make read or write calls depending on its
@@ -35,7 +35,7 @@ Most notably:
Reducing the user-supplied code down to a single callback allows
subclasses to keep their logic self-contained. The combination
of this change and one-shot wakeups (see below) for bidirectional
- data flows make asynchronous code easier to reason about.
+ data flows makes asynchronous code easier to reason about.
Other divergences:
@@ -53,7 +53,7 @@ Other divergences:
Augmented features:
-* obj->write(CODEREF) passes the object itself to the CODEREF
+* obj->write(CODEREF) passes the object itself to the CODEREF.
Being able to enqueue subroutine calls is a powerful feature in
Danga::Socket for keeping linear logic in an asynchronous environment.
Unfortunately, each subroutine takes several kilobytes of memory.
diff --git a/Documentation/technical/memory.txt b/Documentation/technical/memory.txt
index a35b2c734409..039694c33441 100644
--- a/Documentation/technical/memory.txt
+++ b/Documentation/technical/memory.txt
@@ -8,7 +8,7 @@ memory-efficient.
We strive to keep processes small to improve locality, allow
the kernel to cache more files, and to be a good neighbor to
other processes running on the machine. Taking advantage of
-automatic reference counting (ARC) in Perl allows us
+automatic reference counting (ARC) in Perl allows us to
deterministically release memory back to the heap.
We start with a simple data model with few circular
diff --git a/Documentation/technical/whyperl.txt b/Documentation/technical/whyperl.txt
index fbe2e1b16e06..db1d9793a76a 100644
--- a/Documentation/technical/whyperl.txt
+++ b/Documentation/technical/whyperl.txt
@@ -21,7 +21,7 @@ Good Things
Perl 5 is installed on many, if not most GNU/Linux and
BSD-based servers and workstations. It is likely the most
- widely-installed programming environment that offers a
+ widely installed programming environment that offers a
significant amount of POSIX functionality. Users won't
have to waste bandwidth or space with giant toolchains or
architecture-specific binaries.
@@ -47,8 +47,8 @@ Good Things
* Predictable performance
- While Perl is neither fast or memory-efficient, its
- performance and memory use are predictable and does not
+ While Perl is neither fast nor memory-efficient, its
+ performance and memory use are predictable and do not
require GC tuning by the user.
public-inbox is developed for (and mostly on) old
@@ -56,7 +56,7 @@ Good Things
late 1990s, and any cheap VPS today has more than enough
RAM and CPU for handling plain-text email.
- Low hardware requirements increases the reach of our software
+ Low hardware requirements increase the reach of our software
to more users, improving centralization resistance.
* Compatibility
@@ -86,7 +86,7 @@ Good Things
There should be no need to rely on language-specific
package managers such as cpan(1), those systems increase
- the learning curve for users and systems administrators.
+ the learning curve for users and system administrators.
* Compactness and terseness
@@ -98,7 +98,7 @@ Good Things
* Performance ceiling and escape hatch
With optional Inline::C, we can be "as fast as C" in some
- cases. Inline::C is widely-packaged by distros and it
+ cases. Inline::C is widely packaged by distros and it
gives us an escape hatch for dealing with missing bindings
or performance problems should they arise. Inline::C use
(as opposed to XS) also preserves the software freedom and
@@ -135,7 +135,7 @@ Bad Things
(m//, substr(), index(), etc.) still require memory copies
into userspace, negating a benefit of zero-copy.
-* The XS/C API make it difficult to improve internals while
+* The XS/C API makes it difficult to improve internals while
preserving compatibility.
* Lack of optional type checking. This may be a blessing in
@@ -161,14 +161,14 @@ Red herrings to ignore when evaluating other runtimes
-----------------------------------------------------
These don't discount a language or runtime from being
-being used, they're just not interesting.
+used, they're just not interesting.
* Lightweight threading
While lightweight threading implementations are
- convenient, they tend to be significantly heavier than a
+ convenient, they tend to be significantly heavier than
pure event-loop systems (or multi-threaded event-loop
- systems)
+ systems).
Lightweight threading implementations have stack overhead
and growth typically measured in kilobytes. The userspace
diff --git a/HACKING b/HACKING
index df68b54d0f40..18ec74206c45 100644
--- a/HACKING
+++ b/HACKING
@@ -7,7 +7,7 @@ It is archived at: https://public-inbox.org/meta/
and http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/ (using Tor)
Contributions are email-driven, just like contributing to git
-itself or the Linux kernel; however anonymous and pseudonymous
+itself or the Linux kernel; nevertheless, anonymous and pseudonymous
contributions will always be welcome.
Please consider our goals in mind:
@@ -15,17 +15,17 @@ Please consider our goals in mind:
Decentralization, Accessibility, Compatibility, Performance
These goals apply to everyone: users viewing over the web or NNTP,
-sysadmins running public-inbox, and other hackers working public-inbox.
+sysadmins running public-inbox, and other hackers working on public-inbox.
We will reject any feature which advocates or contributes to any
-particular instance of a public-inbox becoming a single point of failure.
+particular instance of public-inbox becoming a single point of failure.
Things we've considered but rejected include:
* exposing article serial numbers outside of NNTP
* allowing readers to inject metadata (e.g. votes)
We care about being accessible to folks with vision problems and/or
-lack the computing resources to view so-called "modern" websites.
+lacking the computing resources to view so-called "modern" websites.
This includes folks on slow connections and ancient browsers which
may be too difficult to upgrade due to resource demands.
@@ -45,7 +45,7 @@ Just-Ahead-of-Time-compiled C (via Inline::C)
Do not recurse on user-supplied data. Neither Perl or C handle
deep recursion gracefully. See lib/PublicInbox/SearchThread.pm
and lib/PublicInbox/MsgIter.pm for examples of non-recursive
-alternatives to previously-recursive algorithms.
+alternatives to previously recursive algorithms.
Performance should be reasonably good for server administrators, too,
and we will sacrifice features to achieve predictable performance.
@@ -61,8 +61,6 @@ on specific topics, in particular data_structures.txt
Optional packages for testing and development
---------------------------------------------
-Optional packages testing and development:
-
- Plack::Test deb: libplack-test-perl
pkg: p5-Plack
rpm: perl-Plack-Test
@@ -107,6 +105,6 @@ Perl notes
----------
* \w, \s, \d character classes all match Unicode characters;
- so write out class ranges (e.g "[0-9]") if you only intend to
+ so write out class ranges (e.g., "[0-9]") if you only intend to
match ASCII. Do not use the "/a" (ASCII) modifier, that requires
Perl 5.14 and we're only depending on 5.10.1 at the moment.
diff --git a/INSTALL b/INSTALL
index 91e590ce3318..f5e14ebe73d4 100644
--- a/INSTALL
+++ b/INSTALL
@@ -1,7 +1,7 @@
public-inbox (server-side) installation
---------------------------------------
-This is for folks who want to setup their own public-inbox instance.
+This is for folks who want to set up their own public-inbox instance.
Clients should use normal git-clone/git-fetch, IMAP or NNTP clients
if they want to import mail into their personal inboxes.
@@ -135,7 +135,7 @@ Numerous optional modules are likely to be useful as well:
foreground servers)
The following module is typically pulled in by dependencies listed
-above, so there is no need to explicitly install them:
+above, so there is no need to explicitly install it:
- DBI deb: libdbi-perl
pkg: p5-DBI
diff --git a/README b/README
index abe8ddc0075f..a9aa0e864ca2 100644
--- a/README
+++ b/README
@@ -17,7 +17,7 @@ public-inbox spawned around three main ideas:
communication. Users may have broken graphics drivers, limited
eyesight, or be unable to afford modern hardware.
-public-inbox aims to be easy-to-deploy and manage; encouraging projects
+public-inbox aims to be easy to deploy and manage, encouraging projects
to run their own instances with minimal overhead.
Implementation
@@ -27,7 +27,7 @@ public-inbox stores mail in git repositories as documented
in https://public-inbox.org/public-inbox-v2-format.txt and
https://public-inbox.org/public-inbox-v1-format.txt
-By storing (and optionally) exposing an inbox via git, it is
+By storing and (optionally) exposing an inbox via git, it is
fast and efficient to host and mirror public-inboxes.
Traditional mailing lists use the "push" model. For readers,
@@ -42,11 +42,11 @@ follow the list via NNTP, IMAP, POP3, Atom feed or HTML archives.
If a reader loses interest, they simply stop following.
-Since we use git, mirrors are easy-to-setup, and lists are
-easy-to-relocate to different mail addresses without losing
+Since we use git, mirrors are easy to set up, and lists are
+easy to relocate to different mail addresses without losing
or splitting archives.
-_Anybody_ may also setup a delivery-only mailing list server to
+_Anybody_ may also set up a delivery-only mailing list server to
replay a public-inbox git archive to subscribers via SMTP.
Features
@@ -111,7 +111,7 @@ and pull requests to our public-inbox address at:
Please Cc: all recipients when replying as we do not require
subscription. This also makes it easier to rope in folks of
-tangentially related projects we depend on (e.g. git developers
+tangentially related projects we depend on (e.g., git developers
on git@vger.kernel.org).
The archives are readable via IMAP, NNTP or HTTP:
@@ -155,8 +155,8 @@ This improves accessibility, and saves bandwidth and storage
as mail is archived forever.
As of the 2010s, successful online social networks and forums are the
-ones which heavily restrict users formatting options; so public-inbox
-aims to preserve the focus on content, and not presentation.
+ones which heavily restrict users' formatting options; public-inbox
+aims to preserve the focus on content, not presentation.
Copyright
---------
diff --git a/TODO b/TODO
index 77453eba27ac..de628e2e310a 100644
--- a/TODO
+++ b/TODO
@@ -1,8 +1,8 @@
TODO items for public-inbox
(Not in any particular order, and
-performance, ease-of-setup, installation, maintainability, etc
-all need to be considered for everything we introduce)
+performance, ease of setup, installation, maintainability, etc.
+all need to be considered for everything we introduce.)
* general performance improvements, but without relying on
XS or pre-built modules any more than we currently do.
@@ -32,7 +32,7 @@ all need to be considered for everything we introduce)
portability to older Linux, free BSDs and maybe Hurd).
* dogfood latest Xapian, Perl5, SQLite, git and various modules to
- ensure things continue working as they should (or more better)
+ ensure things continue working as they should (or better)
while retaining compatibility with old versions.
* Support more of RFC 3977 (NNTP)
diff --git a/ci/README b/ci/README
index 4687fbc57059..728d82a0052c 100644
--- a/ci/README
+++ b/ci/README
@@ -27,7 +27,7 @@ run in the top-level source tree, that is, as `./ci/run.sh'.
or doing development. However, it can be convenient to for
users to mass-install several packages.
-* ci/profiles.sh - prints to-be tested package profile for the current OS
+* ci/profiles.sh - prints to-be-tested package profile for the current OS
Called automatically by ci/run.sh
The output is read by ci/run.sh
diff --git a/ci/profiles.sh b/ci/profiles.sh
index e58b61d50a13..55b998d73633 100755
--- a/ci/profiles.sh
+++ b/ci/profiles.sh
@@ -2,7 +2,7 @@
# Copyright (C) 2019-2021 all contributors <meta@public-inbox.org>
# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-# Prints OS-specific package profiles to stdout (one per-newline) to use
+# Prints OS-specific package profiles to stdout (one per line) to use
# as command-line args for ci/deps.perl. Called automatically by ci/run.sh
# set by os-release(5) or similar
diff --git a/devel/README b/devel/README
index 8f9a0485ec3f..c4be51415d34 100644
--- a/devel/README
+++ b/devel/README
@@ -1 +1 @@
-scripts use for public-inbox development that don't belong in t/
+scripts used for public-inbox development that don't belong in t/
diff --git a/examples/varnish-4.vcl b/examples/varnish-4.vcl
index 5fc202ed4f36..624f60133599 100644
--- a/examples/varnish-4.vcl
+++ b/examples/varnish-4.vcl
@@ -28,7 +28,7 @@ sub vcl_recv {
}
sub vcl_pipe {
- # By default Connection: close is set on all piped requests by varnish,
+ # By default, Connection: close is set on all piped requests by varnish,
# but public-inbox-httpd supports persistent connections well :)
unset bereq.http.connection;
return (pipe);
diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 98084b5c8a0a..e89dc4306c7b 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -209,8 +209,8 @@ sub await_cb ($;@) {
warn "E: awaitpid($pid): $@" if $@;
}
-# This relies on our Perl process is single-threaded, or at least
-# no threads are spawning and waiting on processes (``, system(), etc...)
+# This relies on our Perl process being single-threaded, or at least
+# no threads spawning and waiting on processes (``, system(), etc...)
# Threads are officially discouraged by the Perl5 team, and I expect
# that to remain the case.
sub reap_pids {
diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index 30442227bdf8..88b0fa45bbb6 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -155,7 +155,7 @@ options:
-l ADDRESS address to listen on$dh
--cert=FILE default SSL/TLS certificate
- --key=FILE default SSL/TLS certificate
+ --key=FILE default SSL/TLS certificate key
-W WORKERS number of worker processes to spawn (default: 1)
See public-inbox-daemon(8) and $prog(1) man pages for more.
diff --git a/sa_config/README b/sa_config/README
index 6703c38fe1ae..3705e1e85d1b 100644
--- a/sa_config/README
+++ b/sa_config/README
@@ -4,9 +4,9 @@ SpamAssassin configs for public-inbox.org
root/ - files for system-wide use (plugins, rule definitions,
new rules should have a zero score which should be overridden)
user/ - per-user config (keep as much in here as possible)
- These files go into the users home directory
+ These files go into the user's home directory.
-All files in these example directory are CC0:
+All files in these example directories are CC0:
To the extent possible under law, Eric Wong has waived all copyright and
related or neighboring rights to these examples.
diff --git a/script/public-inbox-mda b/script/public-inbox-mda
index 7e2bee92096e..ba4989569e25 100755
--- a/script/public-inbox-mda
+++ b/script/public-inbox-mda
@@ -33,8 +33,8 @@ use PublicInbox::Filter::Base;
use PublicInbox::InboxWritable;
use PublicInbox::Spamcheck;
-# n.b: hopefully we can setup the emergency path without bailing due to
-# user error, we really want to setup the emergency destination ASAP
+# n.b.: Hopefully we can set up the emergency path without bailing due to
+# user error, we really want to set up the emergency destination ASAP
# in case there's bugs in our code or user error.
my $emergency = $ENV{PI_EMERGENCY} || "$ENV{HOME}/.public-inbox/emergency/";
$ems = PublicInbox::Emergency->new($emergency);
diff --git a/scripts/README b/scripts/README
index 3b9c37da8787..7ffbd93cb994 100644
--- a/scripts/README
+++ b/scripts/README
@@ -1,5 +1,5 @@
This directory contains informal scripts and random tools used
-in the development of public-inbox. Some only exist only for
+in the development of public-inbox. Some only exist for
historical purposes, and some may not work anymore.
See the "script/" directory (not "scripts/") for supported and
--
2.42.0
* [PATCH 2/6] doc: technical/ds: update blurb to note more daemons
@ 2023-03-09 19:28 99% ` Eric Wong
From: Eric Wong @ 2023-03-09 19:28 UTC (permalink / raw)
To: meta
And add a note about the various wakeup modes of kqueue|epoll
while we're at it; we use all of them!
---
Documentation/technical/ds.txt | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
index 89cc05af..4cfb62fe 100644
--- a/Documentation/technical/ds.txt
+++ b/Documentation/technical/ds.txt
@@ -1,9 +1,14 @@
PublicInbox::DS - event loop and async I/O base class
-Our PublicInbox::DS event loop which powers public-inbox-nntpd
-and public-inbox-httpd diverges significantly from the
-unmaintained Danga::Socket package we forked from. In fact,
-it's probably different from most other event loops out there.
+Our PublicInbox::DS event loop which powers most of our long-lived
+processes(*) diverges significantly from the unmaintained Danga::Socket
+package we forked from. In fact, it's probably different from most
+other event loops out there.
+
+Most notably, it uses one-shot, level-trigger, and edge-trigger
+modes of kqueue|epoll depending on the situation.
+
+(*) public-inbox-netd,(-httpd,-imapd,-nntpd,-pop3d,-watch) + lei-daemon
Most notably:
* [PATCH 12/12] ds: drop dwaitpid, switch to waitpid(-1)
@ 2023-01-17 7:19 83% ` Eric Wong
From: Eric Wong @ 2023-01-17 7:19 UTC (permalink / raw)
To: meta
With no remaining users, we can drop dwaitpid and switch
awaitpid to rely on waitpid(-1) to save syscalls.
---
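For illustration only, the waitpid(-1) approach can be sketched
outside of PublicInbox::DS roughly as follows.  %await, spawn_child
and reap_ready are hypothetical names made up for this example; the
real code is the reap_pids change in the diff below.  The sketch
assumes a single-threaded process on a POSIX system where nothing
else is waiting on children.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# hypothetical registry: pid => [ $callback, @args ]
my %await;

# fork a child which exits immediately with $code, and remember
# which callback to run once it is reaped
sub spawn_child {
	my ($code, $cb, @args) = @_;
	my $pid = fork() // die "fork: $!";
	POSIX::_exit($code) if $pid == 0; # child: skip END blocks and buffers
	$await{$pid} = [ $cb, @args ];
	$pid;
}

# reap every child which has already exited, without blocking;
# returns the number of children reaped
sub reap_ready {
	my $n = 0;
	while (1) {
		my $pid = waitpid(-1, WNOHANG);
		last if $pid <= 0; # 0: none exited yet, -1: no children left
		my $status = $?; # save before a callback can clobber it
		if (my $cb_args = delete $await{$pid}) {
			my ($cb, @args) = @$cb_args;
			$cb->($pid, $status, @args);
		} else {
			warn "W: reaped unknown PID=$pid: \$?=$status\n";
		}
		$n++;
	}
	$n;
}

# usage: two children, one shared callback
my %exit_code;
my $done = sub {
	my ($pid, $status, $name) = @_;
	$exit_code{$name} = $status >> 8;
};
spawn_child(0, $done, 'ok');
spawn_child(3, $done, 'fail');
while (%await) {
	reap_ready() or select(undef, undef, undef, 0.01); # brief nap
}
print "$_ exited with $exit_code{$_}\n" for sort keys %exit_code;
```

One waitpid(-1) call per exited child replaces one waitpid($pid) call
per registered PID on every wakeup, which is where the syscall
savings would come from.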
Documentation/technical/ds.txt | 2 +-
lib/PublicInbox/DS.pm | 68 +++++++---------------------------
2 files changed, 15 insertions(+), 55 deletions(-)
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
index 5a1655a1..89cc05af 100644
--- a/Documentation/technical/ds.txt
+++ b/Documentation/technical/ds.txt
@@ -81,7 +81,7 @@ New features
* IO::Socket::SSL support (for NNTPS, STARTTLS+NNTP, HTTPS)
-* dwaitpid (waitpid wrapper) support for reaping dead children
+* awaitpid (waitpid wrapper) support for reaping dead children
* reliable signal wakeups are supported via signalfd on Linux,
EVFILT_SIGNAL on *BSDs via IO::KQueue.
diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 9563a1cb..c849f515 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -32,11 +32,10 @@ use PublicInbox::Syscall qw(:epoll);
use PublicInbox::Tmpfile;
use Errno qw(EAGAIN EINVAL);
use Carp qw(carp croak);
-our @EXPORT_OK = qw(now msg_more dwaitpid awaitpid add_timer add_uniq_timer);
+our @EXPORT_OK = qw(now msg_more awaitpid add_timer add_uniq_timer);
my %Stack;
my $nextq; # queue for next_tick
-my $wait_pids; # list of [ pid, callback, callback_arg ]
my $AWAIT_PIDS; # pid => [ $callback, @args ]
my $reap_armed;
my $ToClose; # sockets to close when event loop is done
@@ -75,11 +74,11 @@ sub Reset {
# we may be iterating inside one of these on our stack
my @q = delete @Stack{keys %Stack};
for my $q (@q) { @$q = () }
- $AWAIT_PIDS = $wait_pids = $nextq = $ToClose = undef;
+ $AWAIT_PIDS = $nextq = $ToClose = undef;
$ep_io = undef; # closes real $Epoll FD
$Epoll = undef; # may call DSKQXS::DESTROY
- } while (@Timers || keys(%Stack) || $nextq || $wait_pids ||
- $ToClose || keys(%DescriptorMap) || $AWAIT_PIDS ||
+ } while (@Timers || keys(%Stack) || $nextq || $AWAIT_PIDS ||
+ $ToClose || keys(%DescriptorMap) ||
$PostLoopCallback || keys(%UniqTimer));
$reap_armed = undef;
@@ -209,43 +208,23 @@ sub await_cb ($;@) {
warn "E: awaitpid($pid): $@" if $@;
}
-# We can't use waitpid(-1) safely here since it can hit ``, system(),
-# and other things. So we scan the $wait_pids list, which is hopefully
-# not too big. We keep $wait_pids small by not calling dwaitpid()
-# until we've hit EOF when reading the stdout of the child.
-
+# This relies on our Perl process is single-threaded, or at least
+# no threads are spawning and waiting on processes (``, system(), etc...)
+# Threads are officially discouraged by the Perl5 team, and I expect
+# that to remain the case.
sub reap_pids {
$reap_armed = undef;
- my $tmp = $wait_pids // [];
- $wait_pids = undef;
- $Stack{reap_runq} = $tmp;
my $oldset = block_signals();
-
- # old API
- foreach my $ary (@$tmp) {
- my ($pid, $cb, $arg) = @$ary;
- my $ret = waitpid($pid, WNOHANG);
- if ($ret == 0) {
- push @$wait_pids, $ary; # autovivifies @$wait_pids
- } elsif ($ret == $pid) {
- if ($cb) {
- eval { $cb->($arg, $pid) };
- warn "E: dwaitpid($pid) in_loop: $@" if $@;
- }
+ while (1) {
+ my $pid = waitpid(-1, WNOHANG) // last;
+ last if $pid <= 0;
+ if (defined(my $cb_args = delete $AWAIT_PIDS->{$pid})) {
+ await_cb($pid, @$cb_args) if $cb_args;
} else {
- warn "waitpid($pid, WNOHANG) = $ret, \$!=$!, \$?=$?";
+ warn "W: reaped unknown PID=$pid: \$?=$?\n";
}
}
-
- # new API TODO: convert to waitpid(-1) in the future as long
- # as we don't use threads
- for my $pid (keys %$AWAIT_PIDS) {
- my $wpid = waitpid($pid, WNOHANG) // next;
- my $cb_args = delete $AWAIT_PIDS->{$wpid} or next;
- await_cb($pid, @$cb_args);
- }
sig_setmask($oldset);
- delete $Stack{reap_runq};
}
# reentrant SIGCHLD handler (since reap_pids is not reentrant)
@@ -719,25 +698,6 @@ sub long_response ($$;@) {
undef;
}
-sub dwaitpid ($;$$) {
- my ($pid, $cb, $arg) = @_;
- if ($in_loop) {
- push @$wait_pids, [ $pid, $cb, $arg ];
- # We could've just missed our SIGCHLD, cover it, here:
- enqueue_reap();
- } else {
- my $ret = waitpid($pid, 0);
- if ($ret == $pid) {
- if ($cb) {
- eval { $cb->($arg, $pid) };
- carp "E: dwaitpid($pid) !in_loop: $@" if $@;
- }
- } else {
- carp "waitpid($pid, 0) = $ret, \$!=$!, \$?=$?";
- }
- }
-}
-
sub awaitpid {
my ($pid, @cb_args) = @_;
$AWAIT_PIDS->{$pid} //= @cb_args ? \@cb_args : 0;
* [PATCH 12/12] httpd/async: switch to level-triggered epoll
@ 2021-10-16 1:01 85% ` Eric Wong
From: Eric Wong @ 2021-10-16 1:01 UTC (permalink / raw)
To: meta
We'll save ourselves some code here and let the kernel do more
work, instead.
---
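A small standalone demonstration of why level-triggering lets us
drop the ->requeue call: readiness is reported again for as long as
unread data remains, so one short read per wakeup is enough.  This
sketch uses select(2) via IO::Select as a stand-in for
level-triggered epoll; the step() helper and the pipe are
hypothetical and not part of PublicInbox::HTTPD::Async.

```perl
use strict;
use warnings;
use IO::Select;

# read at most $len bytes if $fh is readable right now;
# returns undef when the level-triggered poll reports nothing pending
sub step {
	my ($sel, $fh, $len) = @_;
	my @ready = $sel->can_read(0); # zero timeout: poll, never block
	return unless @ready;
	sysread($fh, my $buf, $len) // die "sysread: $!";
	$buf;
}

pipe(my $r, my $w) or die "pipe: $!";
my $sel = IO::Select->new($r);
syswrite($w, 'abcdefgh') // die "syswrite: $!";

# one small read per wakeup; no loop-until-EAGAIN and no requeue,
# because readiness is reported again while unread bytes remain
my @chunks;
while (defined(my $buf = step($sel, $r, 3))) {
	push @chunks, $buf;
}
print join('|', @chunks), "\n"; # prints "abc|def|gh"
```

With edge-triggering, the second and third reads would not be
announced by the kernel, which is why the old code had to requeue
itself or read until EAGAIN.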
Documentation/technical/ds.txt | 3 +--
lib/PublicInbox/HTTPD/Async.pm | 16 +++++-----------
lib/PublicInbox/Qspawn.pm | 1 -
3 files changed, 6 insertions(+), 14 deletions(-)
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
index 7bc1ad79ce0c..5a1655a1450e 100644
--- a/Documentation/technical/ds.txt
+++ b/Documentation/technical/ds.txt
@@ -77,8 +77,7 @@ New features
which (if any) events it's interested in for the next loop iteration.
* Edge-triggering available via EPOLLET or EV_CLEAR. These reduce wakeups
- for unidirectional classes (e.g. PublicInbox::Listener sockets,
- and pipes via PublicInbox::HTTPD::Async).
+ for unidirectional classes when throughput is more important than fairness.
* IO::Socket::SSL support (for NNTPS, STARTTLS+NNTP, HTTPS)
diff --git a/lib/PublicInbox/HTTPD/Async.pm b/lib/PublicInbox/HTTPD/Async.pm
index 7238650aff97..1651da88ac03 100644
--- a/lib/PublicInbox/HTTPD/Async.pm
+++ b/lib/PublicInbox/HTTPD/Async.pm
@@ -17,7 +17,7 @@ package PublicInbox::HTTPD::Async;
use strict;
use parent qw(PublicInbox::DS);
use Errno qw(EAGAIN);
-use PublicInbox::Syscall qw(EPOLLIN EPOLLET);
+use PublicInbox::Syscall qw(EPOLLIN);
# This is called via: $env->{'pi-httpd.async'}->()
# $io is a read-only pipe ($rpipe) for now, but may be a
@@ -39,7 +39,7 @@ sub new {
}, $class;
my $pp = tied *$io;
$pp->{fh}->blocking(0) // die "$io->blocking(0): $!";
- $self->SUPER::new($io, EPOLLIN | EPOLLET);
+ $self->SUPER::new($io, EPOLLIN);
}
sub event_step {
@@ -54,15 +54,12 @@ sub event_step {
my $r = sysread($sock, my $buf, 65536);
if ($r) {
$self->{fh}->write($buf); # may call $http->close
- if ($http->{sock}) { # !closed
- $self->requeue;
- # let other clients get some work done, too
- return;
- }
+ # let other clients get some work done, too
+ return if $http->{sock}; # !closed
# else: fall through to close below...
} elsif (!defined $r && $! == EAGAIN) {
- return; # EPOLLET means we'll be notified
+ return; # EPOLLIN means we'll be notified
}
# Done! Error handling will happen in $self->{fh}->close
@@ -89,9 +86,6 @@ sub async_pass {
$self->{http} = $http;
$self->{fh} = $fh;
-
- # either hit EAGAIN or ->requeue to keep EPOLLET happy
- event_step($self);
}
# may be called as $forward->close in PublicInbox::HTTP or EOF (event_step)
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index a1ff65b65324..53d0ad55ee84 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -192,7 +192,6 @@ sub event_step {
sub rd_hdr ($) {
my ($self) = @_;
# typically used for reading CGI headers
- # we must loop until EAGAIN for EPOLLET in HTTPD/Async.pm
# We also need to check EINTR for generic PSGI servers.
my $ret;
my $total_rd = 0;
* [PATCH] favor git(1) rather than libgit2 for ExtSearch
@ 2021-06-24 5:50 65% Eric Wong
From: Eric Wong @ 2021-06-24 5:50 UTC (permalink / raw)
To: meta
While both git and libgit2 take around 16 minutes to load 100K
alternates, there's already a proposed patch to make git faster:
<https://lore.kernel.org/git/20210624005806.12079-1-e@80x24.org/>
It's also easier to patch and install git locally since the
git.git build system defaults to prefix=$HOME and dealing with
dynamic linking with libgit2 is more difficult for end users
relying on Inline::C.
libgit2 remains in use for the non-ALL.git case, but maybe it's
not necessary (libgit2 is significantly slower than git in
Debian 10 due to SHA-1 collision checking).
---
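The "//= eval { ... } // 0" idiom used for $GCF2C in the diff below
may be unfamiliar, so here is a minimal standalone sketch of it.
Hypothetical::Fast::Backend is a made-up module name which is
assumed NOT to be installed, and optional_backend, pick_reader and
$tries exist only for this example.

```perl
use strict;
use warnings;

my $BACKEND; # undef: not tried yet, 0: unavailable, object: ready
my $tries = 0; # only here to show how often we attempt the load

sub optional_backend {
	$BACKEND //= eval {
		$tries++;
		require Hypothetical::Fast::Backend; # assumed not installed
		Hypothetical::Fast::Backend->new;
	} // 0; # 0 is defined but false, so //= never retries
}

# same shape as the condition in the patch: anything with {topdir}
# always takes the git(1) path, the rest try the optional backend
sub pick_reader {
	my ($ibx) = @_;
	(!defined($ibx->{topdir}) && optional_backend()) ? 'backend' : 'git';
}

print pick_reader({}), "\n" for 1..3;
print pick_reader({ topdir => '/hypothetical/extindex' }), "\n";
print "load attempts: $tries\n";
```

The failed "require" is attempted only once no matter how many
times pick_reader is called, and a {topdir} inbox short-circuits
before the optional backend is even considered.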
Documentation/technical/ds.txt | 2 +-
lib/PublicInbox/GitAsyncCat.pm | 21 +++++++++++++--------
lib/PublicInbox/GzipFilter.pm | 3 +--
lib/PublicInbox/HTTPD.pm | 2 +-
lib/PublicInbox/IMAP.pm | 10 +++++-----
lib/PublicInbox/NNTP.pm | 4 ++--
lib/PublicInbox/SolverGit.pm | 3 +--
7 files changed, 24 insertions(+), 21 deletions(-)
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
index a0793ca2..7bc1ad79 100644
--- a/Documentation/technical/ds.txt
+++ b/Documentation/technical/ds.txt
@@ -64,7 +64,7 @@ Augmented features:
* ->requeue support. An optimization of the AddTimer(0, ...) idiom
for immediately dispatching code at the next event loop iteration.
public-inbox uses this for fairly generating large responses
- iteratively (see PublicInbox::NNTP::long_response or git_async_cat
+ iteratively (see PublicInbox::NNTP::long_response or ibx_async_cat
for blob retrievals).
New features
diff --git a/lib/PublicInbox/GitAsyncCat.pm b/lib/PublicInbox/GitAsyncCat.pm
index 7d1a13db..57c194d9 100644
--- a/lib/PublicInbox/GitAsyncCat.pm
+++ b/lib/PublicInbox/GitAsyncCat.pm
@@ -8,7 +8,7 @@ use strict;
use parent qw(PublicInbox::DS Exporter);
use POSIX qw(WNOHANG);
use PublicInbox::Syscall qw(EPOLLIN EPOLLET);
-our @EXPORT = qw(git_async_cat git_async_prefetch);
+our @EXPORT = qw(ibx_async_cat ibx_async_prefetch);
use PublicInbox::Git ();
our $GCF2C; # singleton PublicInbox::Gcf2Client
@@ -45,12 +45,16 @@ sub event_step {
}
}
-sub git_async_cat ($$$$) {
- my ($git, $oid, $cb, $arg) = @_;
- if ($GCF2C //= eval {
+sub ibx_async_cat ($$$$) {
+ my ($ibx, $oid, $cb, $arg) = @_;
+ my $git = $ibx->git;
+ # {topdir} means ExtSearch (likely [extindex "all"]) with potentially
+ # 100K alternates. git(1) has a proposed patch for 100K alternates:
+ # <https://lore.kernel.org/git/20210624005806.12079-1-e@80x24.org/>
+ if (!defined($ibx->{topdir}) && ($GCF2C //= eval {
require PublicInbox::Gcf2Client;
PublicInbox::Gcf2Client::new();
- } // 0) { # 0: do not retry if libgit2 or Inline::C are missing
+ } // 0)) { # 0: do not retry if libgit2 or Inline::C are missing
$GCF2C->gcf2_async(\"$oid $git->{git_dir}\n", $cb, $arg);
\undef;
} else { # read-only end of git-cat-file pipe
@@ -66,9 +70,10 @@ sub git_async_cat ($$$$) {
# this is safe to call inside $cb, but not guaranteed to enqueue
# returns true if successful, undef if not.
-sub git_async_prefetch {
- my ($git, $oid, $cb, $arg) = @_;
- if ($GCF2C) {
+sub ibx_async_prefetch {
+ my ($ibx, $oid, $cb, $arg) = @_;
+ my $git = $ibx->git;
+ if (!defined($ibx->{topdir}) && $GCF2C) {
if (!$GCF2C->{wbuf}) {
$oid .= " $git->{git_dir}\n";
return $GCF2C->gcf2_async(\$oid, $cb, $arg); # true
diff --git a/lib/PublicInbox/GzipFilter.pm b/lib/PublicInbox/GzipFilter.pm
index 48ed11a5..334d6581 100644
--- a/lib/PublicInbox/GzipFilter.pm
+++ b/lib/PublicInbox/GzipFilter.pm
@@ -180,8 +180,7 @@ sub async_blob_cb { # git->cat_async callback
sub smsg_blob {
my ($self, $smsg) = @_;
- git_async_cat($self->{ibx}->git, $smsg->{blob},
- \&async_blob_cb, $self);
+ ibx_async_cat($self->{ibx}, $smsg->{blob}, \&async_blob_cb, $self);
}
1;
diff --git a/lib/PublicInbox/HTTPD.pm b/lib/PublicInbox/HTTPD.pm
index b193c9ae..fb683f74 100644
--- a/lib/PublicInbox/HTTPD.pm
+++ b/lib/PublicInbox/HTTPD.pm
@@ -37,7 +37,7 @@ sub new {
# XXX unstable API!, only GitHTTPBackend needs
# this to limit git-http-backend(1) parallelism.
# We also check for the truthiness of this to
- # detect when to use git_async_cat for slow blobs
+ # detect when to use async paths for slow blobs
'pi-httpd.async' => \&pi_httpd_async
);
bless {
diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index af8ce72b..9402aa41 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -612,7 +612,7 @@ sub fetch_run_ops {
$self->msg_more(")\r\n");
}
-sub fetch_blob_cb { # called by git->cat_async via git_async_cat
+sub fetch_blob_cb { # called by git->cat_async via ibx_async_cat
my ($bref, $oid, $type, $size, $fetch_arg) = @_;
my ($self, undef, $msgs, $range_info, $ops, $partial) = @$fetch_arg;
my $ibx = $self->{ibx} or return $self->close; # client disconnected
@@ -627,8 +627,8 @@ sub fetch_blob_cb { # called by git->cat_async via git_async_cat
}
my $pre;
if (!$self->{wbuf} && (my $nxt = $msgs->[0])) {
- $pre = git_async_prefetch($ibx->git, $nxt->{blob},
- \&fetch_blob_cb, $fetch_arg);
+ $pre = ibx_async_prefetch($ibx, $nxt->{blob},
+ \&fetch_blob_cb, $fetch_arg);
}
fetch_run_ops($self, $smsg, $bref, $ops, $partial);
$pre ? $self->zflush : requeue_once($self);
@@ -760,7 +760,7 @@ sub fetch_blob { # long_response
}
}
uo2m_extend($self, $msgs->[-1]->{num});
- git_async_cat($self->{ibx}->git, $msgs->[0]->{blob},
+ ibx_async_cat($self->{ibx}, $msgs->[0]->{blob},
\&fetch_blob_cb, \@_);
}
@@ -1228,7 +1228,7 @@ sub long_step {
} elsif ($more) { # $self->{wbuf}:
$self->update_idle_time;
- # control passed to git_async_cat if $more == \undef
+ # control passed to ibx_async_cat if $more == \undef
requeue_once($self) if !ref($more);
} else { # all done!
delete $self->{long_cb};
diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index f7d99913..9df47133 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -515,7 +515,7 @@ found:
$smsg->{nntp_code} = $code;
set_art($self, $art);
# this dereferences to `undef'
- ${git_async_cat($ibx->git, $smsg->{blob}, \&blob_cb, $smsg)};
+ ${ibx_async_cat($ibx, $smsg->{blob}, \&blob_cb, $smsg)};
}
}
@@ -549,7 +549,7 @@ sub msg_hdr_write ($$) {
$smsg->{nntp}->msg_more($$hdr);
}
-sub blob_cb { # called by git->cat_async via git_async_cat
+sub blob_cb { # called by git->cat_async via ibx_async_cat
my ($bref, $oid, $type, $size, $smsg) = @_;
my $self = $smsg->{nntp};
my $code = $smsg->{nntp_code};
diff --git a/lib/PublicInbox/SolverGit.pm b/lib/PublicInbox/SolverGit.pm
index 92106e75..b0cd0f2c 100644
--- a/lib/PublicInbox/SolverGit.pm
+++ b/lib/PublicInbox/SolverGit.pm
@@ -593,8 +593,7 @@ sub resolve_patch ($$) {
if (my $msgs = $want->{try_smsgs}) {
my $smsg = shift @$msgs;
if ($self->{psgi_env}->{'pi-httpd.async'}) {
- return git_async_cat($want->{cur_ibx}->git,
- $smsg->{blob},
+ return ibx_async_cat($want->{cur_ibx}, $smsg->{blob},
\&extract_diff_async,
[$self, $want, $smsg]);
} else {
* [PATCH 37/43] www: update internal docs
@ 2020-07-05 23:27 65% ` Eric Wong
From: Eric Wong @ 2020-07-05 23:27 UTC (permalink / raw)
To: meta
We no longer favor getline+close for streaming PSGI responses
when using public-inbox-httpd. We still support it for other
PSGI servers, though.
---
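For readers unfamiliar with the getline+close convention we still
support for generic PSGI servers, a minimal sketch of the "pull"
model follows.  Hypothetical::ChunkBody and drain_body are
illustrative names only; a real server would write to a client
socket and only pull the next buffer once the previous one is sent.

```perl
use strict;
use warnings;

package Hypothetical::ChunkBody;

sub new {
	my ($class, @chunks) = @_;
	bless { chunks => [ @chunks ], closed => 0 }, $class;
}

# return the next buffer, or undef once the body is exhausted
sub getline { shift @{$_[0]->{chunks}} }

# called exactly once by the server when it is done with the body
sub close { $_[0]->{closed} = 1 }

package main;

# what a generic PSGI server does with such a body: pull one buffer
# at a time, hand it to the writer, and close at the end
sub drain_body {
	my ($body, $write) = @_;
	while (defined(my $buf = $body->getline)) {
		$write->($buf);
	}
	$body->close;
}

my $body = Hypothetical::ChunkBody->new('one ', 'two ', 'three');
my $out = '';
drain_body($body, sub { $out .= $_[0] });
print "$out\n";
```

The server decides when to call ->getline, so the application never
generates more data than the server is ready to send.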
Documentation/technical/ds.txt | 4 ++--
lib/PublicInbox/GetlineBody.pm | 4 +---
lib/PublicInbox/GzipFilter.pm | 17 +++++++++++++----
lib/PublicInbox/HTTPD.pm | 5 ++---
lib/PublicInbox/Mbox.pm | 8 ++------
lib/PublicInbox/View.pm | 2 +-
lib/PublicInbox/WwwAtomStream.pm | 6 ++----
lib/PublicInbox/WwwStream.pm | 7 +++----
8 files changed, 26 insertions(+), 27 deletions(-)
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
index cbd06cfb4..a0793ca23 100644
--- a/Documentation/technical/ds.txt
+++ b/Documentation/technical/ds.txt
@@ -64,8 +64,8 @@ Augmented features:
* ->requeue support. An optimization of the AddTimer(0, ...) idiom
for immediately dispatching code at the next event loop iteration.
public-inbox uses this for fairly generating large responses
- iteratively (see PublicInbox::NNTP::long_response or the use of
- ->getline callbacks for generating gigantic gzipped mboxes).
+ iteratively (see PublicInbox::NNTP::long_response or git_async_cat
+ for blob retrievals).
New features
diff --git a/lib/PublicInbox/GetlineBody.pm b/lib/PublicInbox/GetlineBody.pm
index 6becaaf5f..988bc63f4 100644
--- a/lib/PublicInbox/GetlineBody.pm
+++ b/lib/PublicInbox/GetlineBody.pm
@@ -5,9 +5,7 @@
# end callback when the object goes out-of-scope.
# This depends on rpipe being _blocking_ on getline.
#
-# public-inbox-httpd favors "getline" response bodies to take a
-# "pull"-based approach to feeding slow clients (as opposed to a
-# more common "push" model)
+# This is only used by generic PSGI servers and not public-inbox-httpd
package PublicInbox::GetlineBody;
use strict;
use warnings;
diff --git a/lib/PublicInbox/GzipFilter.pm b/lib/PublicInbox/GzipFilter.pm
index 6380f50e9..d72ad3c88 100644
--- a/lib/PublicInbox/GzipFilter.pm
+++ b/lib/PublicInbox/GzipFilter.pm
@@ -1,7 +1,16 @@
# Copyright (C) 2020 all contributors <meta@public-inbox.org>
# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-
-# Qspawn filter
+#
+# In public-inbox <=1.5.0, public-inbox-httpd favored "getline"
+# response bodies to take a "pull"-based approach to feeding
+# slow clients (as opposed to a more common "push" model).
+#
+# In newer versions, public-inbox-httpd supports a backpressure-aware
+# pull/push model which also accounts for slow git blob storage.
+# {async_next} callbacks only run when the DS {wbuf} is drained
+# {async_eml} callbacks only run when a blob arrives from git.
+#
+# We continue to support getline+close for generic PSGI servers.
package PublicInbox::GzipFilter;
use strict;
use parent qw(Exporter);
@@ -14,12 +23,12 @@ our @EXPORT_OK = qw(gzf_maybe);
my %OPT = (-WindowBits => 15 + 16, -AppendOutput => 1);
my @GZIP_HDRS = qw(Vary Accept-Encoding Content-Encoding gzip);
-sub new { bless {}, shift }
+sub new { bless {}, shift } # qspawn filter
# for Qspawn if using $env->{'pi-httpd.async'}
sub attach {
my ($self, $http_out) = @_;
- $self->{http_out} = $http_out;
+ $self->{http_out} = $http_out; # PublicInbox::HTTP::{Chunked,Identity}
$self
}
diff --git a/lib/PublicInbox/HTTPD.pm b/lib/PublicInbox/HTTPD.pm
index 331939699..a9f55ff61 100644
--- a/lib/PublicInbox/HTTPD.pm
+++ b/lib/PublicInbox/HTTPD.pm
@@ -36,9 +36,8 @@ sub new {
# XXX unstable API!, only GitHTTPBackend needs
# this to limit git-http-backend(1) parallelism.
- # The rest of our PSGI code is generic, relying
- # on "pull" model using "getline" to prevent
- # over-buffering.
+ # We also check for the truthiness of this to
+ # detect when to use git_async_cat for slow blobs
'pi-httpd.async' => \&pi_httpd_async
);
bless {
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index abdf43c93..8726b9f64 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -1,12 +1,8 @@
# Copyright (C) 2015-2020 all contributors <meta@public-inbox.org>
# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-# Streaming (via getline) interface for formatting messages as an mboxrd.
-# Used by the PSGI web interface.
-#
-# public-inbox-httpd favors "getline" response bodies to take a
-# "pull"-based approach to feeding slow clients (as opposed to a
-# more common "push" model)
+# Streaming interface for mboxrd HTTP responses
+# See PublicInbox::GzipFilter for details.
package PublicInbox::Mbox;
use strict;
use parent 'PublicInbox::GzipFilter';
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 895e4f278..60dad6bac 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -415,7 +415,7 @@ sub stream_thread ($$) {
PublicInbox::WwwStream::aresponse($ctx, 200, \&stream_thread_i);
}
-# /$INBOX/$MESSAGE_ID/t/
+# /$INBOX/$MSGID/t/ and /$INBOX/$MSGID/T/
sub thread_html {
my ($ctx) = @_;
my $mid = $ctx->{mid};
diff --git a/lib/PublicInbox/WwwAtomStream.pm b/lib/PublicInbox/WwwAtomStream.pm
index 073df1dfa..3b5b133a5 100644
--- a/lib/PublicInbox/WwwAtomStream.pm
+++ b/lib/PublicInbox/WwwAtomStream.pm
@@ -1,10 +1,8 @@
# Copyright (C) 2016-2020 all contributors <meta@public-inbox.org>
# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
#
-# Atom body stream for which yields getline+close methods
-# public-inbox-httpd favors "getline" response bodies to take a
-# "pull"-based approach to feeding slow clients (as opposed to a
-# more common "push" model)
+# Atom body stream for HTTP responses
+# See PublicInbox::GzipFilter for details.
package PublicInbox::WwwAtomStream;
use strict;
use parent 'PublicInbox::GzipFilter';
diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index 7d257a191..23b03f0e8 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -1,11 +1,10 @@
# Copyright (C) 2016-2020 all contributors <meta@public-inbox.org>
# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
#
-# HTML body stream for which yields getline+close methods
+# HTML body stream for which yields getline+close methods for
+# generic PSGI servers and callbacks for public-inbox-httpd.
#
-# public-inbox-httpd favors "getline" response bodies to take a
-# "pull"-based approach to feeding slow clients (as opposed to a
-# more common "push" model)
+# See PublicInbox::GzipFilter parent class for more info.
package PublicInbox::WwwStream;
use strict;
use parent qw(Exporter PublicInbox::GzipFilter);
* [PATCH] doc: technical/ds.txt: describe PublicInbox::DS divergences
@ 2020-01-10 20:35 63% Eric Wong
From: Eric Wong @ 2020-01-10 20:35 UTC (permalink / raw)
To: meta
Danga::Socket 1.62 was released a few months back and
the maintainer indicated it would be the last release.
We've diverged significantly in incompatible ways...
While most of this should've already been documented in
commit messages, putting it all into one document could
make it easier to digest.
It's also a strange design for anybody used to conventional
event loops. Maybe this is an unconventional project :P
---
Documentation/technical/ds.txt | 112 +++++++++++++++++++++++++++++++++
MANIFEST | 1 +
lib/PublicInbox/DS.pm | 16 ++---
3 files changed, 121 insertions(+), 8 deletions(-)
create mode 100644 Documentation/technical/ds.txt
diff --git a/Documentation/technical/ds.txt b/Documentation/technical/ds.txt
new file mode 100644
index 00000000..cbd06cfb
--- /dev/null
+++ b/Documentation/technical/ds.txt
@@ -0,0 +1,112 @@
+PublicInbox::DS - event loop and async I/O base class
+
+Our PublicInbox::DS event loop which powers public-inbox-nntpd
+and public-inbox-httpd diverges significantly from the
+unmaintained Danga::Socket package we forked from. In fact,
+it's probably different from most other event loops out there.
+
+Most notably:
+
+* There is one and only one callback: ->event_step. Unlike other
+ event loops, there are no separate callbacks for read, write,
+ error or hangup events. In fact, we never care which kevent
+ filter or poll/epoll event flag (e.g. POLLIN/POLLOUT/POLLHUP)
+ triggers a call.
+
+ The lack of read/write callback distinction is driven by the
+ fact TLS libraries (e.g. OpenSSL via IO::Socket::SSL) may
+ declare SSL_WANT_READ on SSL_write(), and SSL_WANT_WRITE on
+ SSL_read(). So we end up having to let each user object decide
+ whether it wants to make read or write calls depending on its
+ internal state, completely independent of the event loop.
+
+ Error and hangup (POLLERR and POLLHUP) callbacks are redundant and
+ only triggered in rare cases. They're redundant because the
+ result of every read and write call in ->event_step must be
+ checked, anyways. At best, callbacks for POLLHUP and POLLERR can
+ save one syscall per socket lifetime and are not worth the extra
+ code they impose.
+
+ Reducing the user-supplied code down to a single callback allows
+ subclasses to keep their logic self-contained. The combination
+ of this change and one-shot wakeups (see below) for bidirectional
+ data flows makes asynchronous code easier to reason about.
+
+Other divergences:
+
+* ->write buffering uses temporary files whereas Danga::Socket used
+ the heap. The rationale for this is the kernel already provides
+ ample (and configurable) space for socket buffers. Modern kernels
+ also cache FS operations aggressively, so systems with ample RAM
+ are unlikely to notice degradation, while small systems are less
+ likely to suffer unpredictable heap fragmentation, swap and OOM
+ penalties.
+
+ In the future, we may introduce sendfile and mmap+SSL_write to
+ reduce data copies, and use FALLOC_FL_PUNCH_HOLE on Linux to
+ release space after the buffer is partially cleared.
+
+Augmented features:
+
+* obj->write(CODEREF) passes the object itself to the CODEREF.
+ Being able to enqueue subroutine calls is a powerful feature in
+ Danga::Socket for keeping linear logic in an asynchronous environment.
+ Unfortunately, each subroutine takes several kilobytes of memory.
+ One small change to Danga::Socket is to pass the receiver object
+ (aka "$self") to the CODEREF. $self can store any necessary
+ state it needs for a normal (named) subroutine. This allows us to
+ put the same sub into multiple queues without paying a large
+ memory penalty for each one.
+
+ This idea is also more easily ported to C or other languages which
+ lack anonymous subroutines (aka "closures").
+
+* ->requeue support. An optimization of the AddTimer(0, ...) idiom
+ for immediately dispatching code at the next event loop iteration.
+ public-inbox uses this for fairly generating large responses
+ iteratively (see PublicInbox::NNTP::long_response or the use of
+ ->getline callbacks for generating gigantic gzipped mboxes).
+
+New features:
+
+* One-shot wakeups allowed via EPOLLONESHOT or EV_DISPATCH. These
+ flags allow us to simplify code in ->event_step callbacks for
+ bidirectional sockets (NNTP and HTTP). Instead of merely reacting
+ to events, control is handed over at ->event_step in one-shot scenarios.
+ The event_step caller (NNTP || HTTP) then becomes proactive in declaring
+ which (if any) events it's interested in for the next loop iteration.
+
+* Edge-triggering available via EPOLLET or EV_CLEAR. These reduce wakeups
+ for unidirectional classes (e.g. PublicInbox::Listener sockets,
+ and pipes via PublicInbox::HTTPD::Async).
+
+* IO::Socket::SSL support (for NNTPS, STARTTLS+NNTP, HTTPS)
+
+* dwaitpid (waitpid wrapper) support for reaping dead children
+
+* reliable signal wakeups are supported via signalfd on Linux,
+ EVFILT_SIGNAL on *BSDs via IO::KQueue.
+
+Removed features:
+
+* Many fields removed or moved to subclasses, so the underlying
+ hash is smaller and suitable for FDs other than stream sockets.
+ Some fields we enforce (e.g. wbuf, wbuf_off) are autovivified
+ on an as-needed basis to save memory when they're not needed.
+
+* TCP_CORK support removed; instead we use MSG_MORE on non-TLS sockets
+ and we may use vectored I/O support via GnuTLS in the future
+ for TLS sockets.
+
+* per-FD PLCMap (post-loop callback) removed; we now have ->requeue
+ support where no extra hash lookups or assignments are necessary.
+
+* read push backs removed. Some subclasses use a read buffer ({rbuf})
+ but they control it, not this event loop.
+
+* Profiling and debug logging removed. Perl and OS-specific tracers
+ and profilers are sufficient.
+
+* ->AddOtherFds support removed, everything watched is a subclass of
+ PublicInbox::DS, but we've slimmed down the fields to eliminate
+ the memory penalty for objects.
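The ->write(CODEREF) and ->requeue points described in ds.txt can be
illustrated with a toy model. This is only a sketch under stated
assumptions: ToyDS, run_loop and next_chunk are hypothetical names made
up for illustration, there is no real I/O or epoll/kqueue here, and none
of this is the actual PublicInbox::DS code. It only shows how passing
the receiver ($self) to a queued CODEREF lets one named sub serve many
objects, and how requeueing lets large responses be generated one step
at a time.

```perl
#!/usr/bin/perl
# Toy model only: ToyDS is hypothetical and is NOT PublicInbox::DS.
use strict;
use warnings;

package ToyDS;

my @requeued; # objects to dispatch on the next loop iteration

sub new {
	my ($class, %state) = @_;
	bless { %state, wbuf => [], out => [] }, $class;
}

# Queue either a string or a CODEREF (loosely modeled on ->write).
sub write {
	my ($self, $item) = @_;
	push @{$self->{wbuf}}, $item;
}

# Ask to be dispatched at the next loop iteration.
sub requeue { push @requeued, $_[0] }

# The one and only callback: handle a single queued item per call,
# so other objects get a turn before this one continues.
sub event_step {
	my ($self) = @_;
	my $item = shift(@{$self->{wbuf}}) // return;
	if (ref($item) eq 'CODE') {
		$item->($self); # receiver is passed in, no closure needed
	} else {
		push @{$self->{out}}, $item; # stand-in for a socket write
	}
	$self->requeue if @{$self->{wbuf}};
}

# Minimal stand-in for the event loop: run until nothing is requeued.
sub run_loop {
	while (my @todo = splice(@requeued)) {
		$_->event_step for @todo;
	}
}

package main;

# One named sub shared by every object; per-object state is in $self.
sub next_chunk {
	my ($self) = @_;
	my $n = $self->{n}++;
	return if $n >= $self->{max};
	$self->write("$self->{name}:$n");
	$self->write(\&next_chunk); # come back later for the next chunk
}

my $x = ToyDS->new(name => 'a', n => 0, max => 2);
my $y = ToyDS->new(name => 'b', n => 0, max => 3);
for my $obj ($x, $y) {
	$obj->write(\&next_chunk);
	$obj->requeue;
}
ToyDS::run_loop();
print join(' ', @{$_->{out}}), "\n" for ($x, $y);
# prints "a:0 a:1" and then "b:0 b:1 b:2"
```

Both objects queue the same \&next_chunk reference, so no anonymous sub
is allocated per object, and each object advances by one item per loop
iteration instead of monopolizing the loop.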
diff --git a/MANIFEST b/MANIFEST
index 914015ad..3736c777 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -34,6 +34,7 @@ Documentation/public-inbox-watch.pod
Documentation/public-inbox-xcpdb.pod
Documentation/public-inbox.cgi.pod
Documentation/standards.perl
+Documentation/technical/ds.txt
Documentation/txt2pre
HACKING
INSTALL
diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 09dc3992..058b1358 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -3,15 +3,15 @@
#
# This license differs from the rest of public-inbox
#
-# This is a fork of the (for now) unmaintained Danga::Socket 1.61.
-# Unused features will be removed, and updates will be made to take
-# advantage of newer kernels.
+# This is a fork of the unmaintained Danga::Socket (1.61) with
+# significant changes. See Documentation/technical/ds.txt in our
+# source for details.
#
-# API changes to diverge from Danga::Socket will happen to better
-# accomodate new features and improve scalability. Do not expect
-# this to be a stable API like Danga::Socket.
-# Bugs encountered (and likely fixed) are reported to
-# bug-Danga-Socket@rt.cpan.org and visible at:
+# Do not expect this to be a stable API like Danga::Socket,
+# but it will evolve to suit our needs and to take advantage of
+# newer Linux and *BSD features.
+# Bugs encountered were reported to bug-Danga-Socket@rt.cpan.org,
+# fixed in Danga::Socket 1.62 and visible at:
# https://rt.cpan.org/Public/Dist/Display.html?Name=Danga-Socket
package PublicInbox::DS;
use strict;
2020-01-10 20:35 63% [PATCH] doc: technical/ds.txt: describe PublicInbox::DS divergences Eric Wong
2020-07-05 23:27 [PATCH 00/43] www: async git cat-file w/ -httpd Eric Wong
2020-07-05 23:27 65% ` [PATCH 37/43] www: update internal docs Eric Wong
2021-06-24 5:50 65% [PATCH] favor git(1) rather than libgit2 for ExtSearch Eric Wong
2021-10-16 1:00 [PATCH 00/16] some yak-shaving and annoyance fixes Eric Wong
2021-10-16 1:01 85% ` [PATCH 12/12] httpd/async: switch to level-triggered epoll Eric Wong
2023-01-17 7:18 [PATCH 00/12] improve process reaping Eric Wong
2023-01-17 7:19 83% ` [PATCH 12/12] ds: drop dwaitpid, switch to waitpid(-1) Eric Wong
2023-03-09 19:28 [PATCH 0/6] various doc updates Eric Wong
2023-03-09 19:28 99% ` [PATCH 2/6] doc: technical/ds: update blurb to note more daemons Eric Wong
2023-08-28 10:42 [PATCH 1/5] ci/profiles.sh: fix case matching logic Štěpán Němec
2023-08-28 10:42 68% ` [PATCH 5/5] Fix some typos/grammar/errors in docs and comments Štěpán Němec
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git