Internal data structures of public-inbox
This is a guide for hackers new to our code base. Do not
consider our internal data structures stable for external
consumers; this document should be updated when internals
change. I recommend reading this document from the source tree,
with the source code easily accessible if you need examples.
This mainly documents in-memory data structures. If you're
interested in the stable on-filesystem formats, see the
public-inbox-config(5), public-inbox-v1-format(5) and
public-inbox-v2-format(5) manpages.
Common abbreviations for objects of each class, as used outside
of their own packages, are documented. `$self' is the common
variable name when used within their own package.
PublicInbox::Config
-------------------
PublicInbox::Config is the root class which loads a
public-inbox-config file and instantiates PublicInbox::Inbox,
PublicInbox::WWW, PublicInbox::NNTPD, and other top-level
classes.
Outside of tests, this is typically a singleton.
Per-message classes
-------------------
* PublicInbox::MIME - Email::MIME subclass
Common abbreviation: $mime
Used by: PublicInbox::WWW, PublicInbox::SearchIdx
A representation of an entire email, multipart or not. It's
a subclass of Email::MIME to work around bugs in old
Email::MIME versions. An option to use libgmime or libmailutils
may be supported in the future for performance and memory use.
This can be a memory hog with big messages and giant
attachments, so our PublicInbox::WWW interface only keeps
one object of this class in memory at-a-time.
In other words, this is the "meat" of the message, whereas
$smsg (below) is just the "skeleton".
Our PublicInbox::V2Writable class may have two objects of this
type in memory at-a-time for deduplication.
* PublicInbox::Smsg - small message skeleton
Used by: PublicInbox::{NNTP,WWW,SearchIdx}
Common abbreviation: $smsg
Represents headers shown in NNTP overview and PSGI message
summaries (thread skeleton).
This is loaded from either the overview DB (over.sqlite3) or
the Xapian DB (docdata.glass), though the Xapian docdata
won't hold NNTP-only fields (Cc:/To:).
There may be hundreds or thousands of these objects in memory
at-a-time, so fields are pruned if unneeded.
* PublicInbox::SearchThread::Msg - subclass of Smsg
Common abbreviation: $cont or $node
Used by: PublicInbox::WWW
The structure we use for a non-recursive[1] variant of
JWZ's algorithm: <https://www.jwz.org/doc/threading.html>.
Nowadays, this is a re-blessed $smsg with additional fields.
As with $smsg objects, there may be hundreds or thousands
of these objects in memory at-a-time.
We also do not use a linked-list for storing children as JWZ
describes, but instead a Perl hashref for {children} which
becomes an arrayref upon sorting.
[1] https://rt.cpan.org/Ticket/Display.html?id=116727
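The {children} handling and the non-recursive traversal described
above can be sketched as follows. This is a Python analogy rather
than the Perl in PublicInbox::SearchThread; the `Node' class, its
field names, and the `sort_thread' and `walk' helpers are all
hypothetical and only illustrate the idea of a mapping that becomes
an ordered list once sorted, walked with an explicit stack.

```python
class Node:
    """Hypothetical thread node; field names are illustrative only."""

    def __init__(self, mid, ts=0):
        self.mid = mid
        self.ts = ts          # timestamp used as the sort key
        self.children = {}    # keyed by Message-ID until sorted

    def add_child(self, child):
        self.children[child.mid] = child


def sort_thread(root):
    """Replace each children dict with a sorted list, without recursion."""
    stack = [root]
    while stack:
        node = stack.pop()
        ordered = sorted(node.children.values(), key=lambda c: c.ts)
        node.children = ordered       # the dict becomes a list here
        stack.extend(ordered)


def walk(root):
    """Depth-first walk using an explicit stack; yields (depth, mid)."""
    stack = [(0, root)]
    while stack:
        depth, node = stack.pop()
        yield depth, node.mid
        # push in reverse so the earliest child is visited first
        for child in reversed(node.children):
            stack.append((depth + 1, child))


root = Node("root", 1)
b = Node("b", 3)
a = Node("a", 2)
root.add_child(b)
root.add_child(a)
a.add_child(Node("a1", 4))
sort_thread(root)
order = list(walk(root))
```

Keying unsorted children by Message-ID makes duplicate insertion
harmless, and the explicit stack avoids deep call chains on very
long threads.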
Per-inbox classes
-----------------
* PublicInbox::Inbox - represents a single public-inbox
Common abbreviation: $ibx
Used everywhere
This represents a "publicinbox" section in the config
file, see public-inbox-config(5) for details.
* PublicInbox::Git - represents a single git repository
Common abbreviation: $git, $ibx->git
Used everywhere.
Each configured "publicinbox" or "coderepo" has one of these.
* PublicInbox::Msgmap - msgmap.sqlite3 read-write interface
Common abbreviation: $mm, $ibx->mm
Used everywhere if SQLite is available.
Each indexed inbox has one of these, see
public-inbox-v1-format(5) and public-inbox-v2-format(5)
manpages for details.
* PublicInbox::Over - over.sqlite3 read-only interface
Common abbreviation: $over, $ibx->over
Used everywhere if SQLite is available.
Each indexed inbox has one of these, see
public-inbox-v1-format(5) and public-inbox-v2-format(5)
manpages for details.
* PublicInbox::Search - Xapian read-only interface
Common abbreviation: $srch, $ibx->search
Used everywhere if Search::Xapian (or Xapian.pm) is available.
Each indexed inbox has one of these, see
public-inbox-v1-format(5) and public-inbox-v2-format(5)
manpages for details.
PublicInbox::WWW
----------------
The main PSGI web interface. It is built from several other
packages.
PublicInbox::SolverGit
----------------------
This is instantiated from the $INBOX/$BLOB_OID/s/ WWW endpoint
and represents the stages and states for "solving" a blob by
searching for and applying patches. See the code and comments
in PublicInbox/SolverGit.pm.
PublicInbox::Qspawn
-------------------
This is instantiated from various WWW endpoints and represents
the stages and states for running and managing subprocesses
in a way which won't exceed configured process limits defined
via "publicinboxlimiter.*" directives in public-inbox-config(5).
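The idea of staying within a process limit can be sketched with a
small queueing limiter. This is a Python analogy under stated
assumptions, not the PublicInbox::Qspawn implementation: the
`Limiter' class and its `start' and `finish' methods are
hypothetical, and plain callables stand in for spawned commands.

```python
from collections import deque


class Limiter:
    """Hypothetical limiter: at most `max_running` jobs run at once."""

    def __init__(self, max_running):
        self.max_running = max_running
        self.running = 0
        self.queue = deque()   # jobs waiting for a free slot

    def start(self, job):
        """Run `job` now if a slot is free, otherwise queue it."""
        if self.running < self.max_running:
            self.running += 1
            job()
        else:
            self.queue.append(job)

    def finish(self):
        """Call when a running job exits; starts the next queued job."""
        self.running -= 1
        if self.queue:
            self.start(self.queue.popleft())


started = []
lim = Limiter(max_running=2)
for name in ("a", "b", "c"):
    lim.start(lambda n=name: started.append(n))
# "c" waits because both slots are taken
waiting_before = len(lim.queue)
lim.finish()   # one job exits, so "c" may start
```

In a real server, `finish' would be driven by child process exit
notifications rather than called by hand.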
ad-hoc structures shared across packages
----------------------------------------
* $ctx - PublicInbox::WWW app request context
This holds the PSGI $env as well as any internal variables
used by various modules of PublicInbox::WWW.
As with the PSGI $env, there is one per-active WWW
request+response cycle. It does not exist for idle HTTP
clients.
daemon classes
--------------
* PublicInbox::NNTP - an NNTP client socket
Common abbreviation: $nntp
Used by: PublicInbox::DS, public-inbox-nntpd
Unlike PublicInbox::HTTP, all of the logic for serving NNTP
clients is here, including what would be
in $ctx on the HTTP or WWW side.
There may be thousands of these since we support thousands of
NNTP clients.
* PublicInbox::HTTP - an HTTP client socket
Common abbreviation: $http
Used by: PublicInbox::DS, public-inbox-httpd
Unlike PublicInbox::NNTP, this class has no knowledge of any of
the email or git-specific parts of public-inbox, only PSGI.
However, it supports APIs and behaviors (e.g. streaming large
responses) which PublicInbox::WWW may take advantage of.
There may be thousands of these since we support thousands of
HTTP clients.
* PublicInbox::Listener - a SOCK_STREAM listen socket (TCP or Unix)
Used by: PublicInbox::DS, public-inbox-httpd, public-inbox-nntpd
Common abbreviation: @listeners in PublicInbox::Daemon
This class calls non-blocking accept(2) or accept4(2) on a
listen socket to create new PublicInbox::HTTP and
PublicInbox::NNTP instances.
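A non-blocking accept loop of the kind described for the listener
can be sketched as below. This is a Python analogy, not the code in
PublicInbox::Listener: `accept_ready' and the `wrap' factory are
hypothetical names, and the loopback connection exists only to make
the sketch self-contained.

```python
import selectors
import socket


def accept_ready(listener, wrap):
    """Accept every pending connection without blocking.

    `wrap` is a hypothetical factory standing in for a per-client class.
    """
    clients = []
    while True:
        try:
            conn, _addr = listener.accept()
        except BlockingIOError:
            break              # nothing left in the accept queue
        conn.setblocking(False)
        clients.append(wrap(conn))
    return clients


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
listener.listen(16)
listener.setblocking(False)

peer = socket.create_connection(listener.getsockname())

# wait until the listener is readable, i.e. a connection is pending
sel = selectors.DefaultSelector()
sel.register(listener, selectors.EVENT_READ)
sel.select(timeout=5)

clients = accept_ready(listener, wrap=lambda sock: {"sock": sock})
n_clients = len(clients)

for c in clients:
    c["sock"].close()
peer.close()
sel.close()
listener.close()
```

Because the listen socket is non-blocking, an empty accept queue
raises an error instead of stalling the event loop, so one worker
can service many listeners and clients.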
* PublicInbox::HTTPD
Common abbreviation: $httpd
Represents an HTTP daemon which creates PublicInbox::HTTP
wrappers around client sockets accepted from
PublicInbox::Listener.
Since the SERVER_NAME and SERVER_PORT PSGI variables need to be
exposed for HTTP/1.0 requests when Host: headers are missing,
this is per-Listener socket.
* PublicInbox::HTTPD::Async
Common abbreviation: $async
Used for implementing an asynchronous "push" interface for
slow, expensive responses which may require spawning
git-http-backend(1), git-apply(1) or other commands.
This will also be used for dealing with future asynchronous
operations such as HTTP reverse proxying and slow storage
retrieval operations.
* PublicInbox::NNTPD
Common abbreviation: $nntpd
Represents an NNTP daemon which creates PublicInbox::NNTP
wrappers around client sockets accepted from
PublicInbox::Listener.
This is currently a singleton, but it is associated with a
given PublicInbox::Config which may be instantiated more than
once in the future.
* PublicInbox::ParentPipe
Per-worker process class to detect shutdown of master process.
This is not used if using -W0 to disable worker processes
in public-inbox-httpd or public-inbox-nntpd.
This is a per-worker singleton.
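One common way for a worker to notice that its master has gone away
is to watch the read end of a pipe whose write end only the master
holds. The sketch below assumes that technique and is a Python
analogy only; see PublicInbox/ParentPipe.pm for what the code
actually does. The `parent_alive' helper is a hypothetical name, and
closing the write end in the same process merely simulates the
master exiting.

```python
import os


def parent_alive(read_fd):
    """Return False once the write end of the pipe has been closed."""
    try:
        data = os.read(read_fd, 1)
    except BlockingIOError:
        return True        # nothing to read: the writer is still open
    return data != b""     # an empty read means EOF, so the parent is gone


# the master would keep `w` open and never write to it;
# each worker would watch `r` from its event loop
r, w = os.pipe()
os.set_blocking(r, False)

alive_before = parent_alive(r)
os.close(w)                # simulate the master process exiting
alive_after = parent_alive(r)
os.close(r)
```

The kernel closes the write end when the master dies for any
reason, so workers need no signal or polling of process IDs.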