Date | Commit message (Collapse) |
|
Currently only for git-http-backend use, this allows limiting
the number of spawned processes per-inbox or by group, if there
are multiple large inboxes amidst a sea of small ones.
For example, a "big" repo limiter could be used for big inboxes:
which would be shared between multiple repos:
[limiter "big"]
max = 4
[publicinbox "git"]
address = git@vger.kernel.org
mainrepo = /path/to/git.git
; shared limiter with giant:
httpbackendmax = big
[publicinbox "giant"]
address = giant@project.org
mainrepo = /path/to/giant.git
; shared limiter with git:
httpbackendmax = big
; This is a tiny inbox, use the default limiter with 32 slots:
[publicinbox "meta"]
address = meta@public-inbox.org
mainrepo = /path/to/meta.git
|
|
And bump the default limit to 32 so we match git-daemon
behavior. This shall allow us to configure different levels
of concurrency for different repositories and prevent clones
of giant repos from stalling service to small repos.
|
|
We now generate all of our HTML using WwwStream which
forces us to have consistent headers and footers in
the HTML itself.
This also makes the search-capable vs search-less installs
go to the new.html endpoint to maintain consistency
(in case an admin decides to enable Xapian).
|
|
We should not fail tests when this is not available.
|
|
They're uncommon, fortunately, but we make no attempt to
handle nested comments (which would open us up to things
like CVE-2015-7686) or use the comment in place of a
missing name.
|
|
This avoids breaking clients on graceful shutdown since
NNTP responses should usually be quick.
|
|
Lighter and ever-so-slightly faster!
Most importantly, this won't do non-obvious stuff behind our
backs like trying to parse a POST request body for a query
string param.
|
|
This is lighter and we can work further towards eliminating
our Plack::Request dependency entirely.
|
|
It seems common for address entries to end up as:
"foo@example" <foo@example>
Avoid needlessly displaying the domain name in that case.
|
|
Probably better than bloating our own API with configurable
warning streams and such...
|
|
This encapsulates an entire PSGI response array, hopefully
making it easier to generate responses and avoid typos when
setting the Content-Type.
|
|
Angle brackets around the --in-reply-to= arg for git send-email
has been optional since git v1.5.3.2, so strip them and make the
command-line argument easier-to-type.
|
|
Address::names is sufficient to handle what from_name did.
|
|
We may remove from_name in the future.
...And disallow quotes in email addresses.
Technically I believe they're allowed, but they're definitely
uncommon and unlikely to show up in legitimate mail.
|
|
Mailing lists I watch and mirror may not have the best spam
filtering, and an extra layer should not hurt.
|
|
And improve documentation for existing dependencies, too.
|
|
This should hopefully make it easier to try other anti-spam
systems (or none at all) in the future.
|
|
Inboxes are normally created by Config, but having the
population logic in Inbox should make it easier to mock
for testing.
|
|
Favor Inbox objects as our primary source of truth to simplify
our code. This increases our coupling with PSGI to make it
easier to write tests in the future.
A lot of this code was originally designed to be usable
standalone without PSGI or CGI at all; but that might increase
development effort.
|
|
Only mark seen messages as spam, otherwise it could be
too aggressive and cause problems or over training.
We wouldn't want a wayward FIFO ruining our day, either :)
|
|
We can support spam removal by watching a special "spam"
Maildir, too. We can run public-inbox-learn as a separate
step, and that command will be improved to support
auto-learning, too.
|
|
This should be portable despite the intended use of this
directory being non-portable.
|
|
While we only want to stop our daemons and gracefully destroy
subprocesses, it is common for 'Ctrl-C' from a terminal to kill
the entire pgroup.
Killing an entire pgroup nukes subprocesses like git-upload-pack
breaks graceful shutdown on long clones. Make a best effort to
ensure git-upload-pack processes are not broken when somebody
signals an entire process group.
Followup-to: commit 37bf2db81bbbe114d7fc5a00e30d3d5a6fa74de5
("doc: systemd examples should only kill one process")
|
|
This will allow us to commonalize HTML generation in the future
and is the start of moving existing HTML generation to a "pull"
streaming model (from the existing "push" one).
Using the getline/close pull model is superior to the existing
$fh->write streaming as it allows us to throttle response
generation based on backpressure from slow clients.
|
|
We no longer depend on it for the core code, and tests
are optional for users. Hopefully this makes this
easier-to-install.
|
|
It should be possible to serve the contents of a public-inbox
over NNTP but not HTTP.
|
|
This removes the Email::Filter dependency as well as the
signature-breaking scrubber code. We now prefer to
reject unacceptable messages and grudgingly (and blindly)
mirror messages we're not the primary endpoint for.
|
|
This is transactional and hopefully safer in case we hit SIGSEGV
or SIGKILL during processing, as the tmp/ copy will remain on
the FS even if DESTROY/END handlers are not called.
|
|
This filter API should be independent of Email::Filter and
hopefully less intrusive to long running processes.
|
|
Email::Filter doesn't offer any functionality we need, here;
and our dependency on Email::Filter will gradually be removed
since it (and Email::LocalDelivery) seem abandoned and we
can have more-fine-grained control by rolling our own Maildir
delivery which can work transactionally.
|
|
Remove mbox tests since mbox is unreliable due to raciness
and incompatible implementations. We will drop support for
mbox emergency destinations, soon.
|
|
Totally unnecessary...
|
|
Since ssoma is optional, here, IPC::Run shall also be optional.
(And it may be removed entirely in the future).
|
|
Or whatever the appropriate Perl terminology, is...
And we will need to do something appropriate for other
encodings, too. I still barely understand Perl Unicode
despite attempting to understand the docs over the years..
|
|
We need to ensure we show the message body ASAP since
the thread generation via Xapian could take a while
and maybe even raise an exception or crash.
|
|
Oops :x Add an additional test for live data for any
unprintable characters, too, since this could be a dangerous
source of HTML injection.
|
|
This should reduce link following for replies and improve
visibility. This should also reduce cache overhead/footprint
for crawlers.
|
|
Plack::Request is unnecessary overhead for this given the
strictness of git-http-backend. Furthermore, having to make
commit 311c2adc8c63 ("avoid Plack::Request parsing body")
to avoid tempfiles should not have been necessary.
|
|
Oops, we totally forgot to automate testing for this :x
|
|
Most of its functionality is in the PublicInbox::Inbox class.
While we're at it, we no longer auto-create newsgroup names
based on the inbox name, since newsgroup names probably deserve
some thought when it comes to hierarchy.
|
|
We don't serve things like robots.txt, favicon.ico, or
.well-known/ endpoints ourselves, but ensure we can be
used with Plack::App::Cascade for others.
|
|
Oops, added a test to prevent regressions while we're at it.
|
|
git has stricter requirements for ident names (no '<>')
which Email::Address allows.
Even in 1.908, Email::Address also has an incomplete fix for
CVE-2015-7686 with a DoS-able regexp for comments. Since we
don't care for or need all the RFC compliance of Email::Address,
avoiding it entirely may be preferable.
Email::Address will still be installed as a requirement for
Email::MIME, but it is only used by the
Email::MIME::header_str_set which we do not use
|
|
Having an excessive amount of git-pack-objects processes is
dangerous to the health of the server. Queue up process spawning
for long-running responses and serve them sequentially, instead.
|
|
Since PSGI does not require Transfer-Encoding: chunked or
Content-Length, we cannot expect random apps we host to chunk
their responses.
Thus, to improve interoperability, chunk at the HTTP layer like
other PSGI servers do. I'm chosing a more syscall-intensive method
(via multiple send(...MSG_MORE) for now to reduce copy + packet
overhead.
|
|
Followup-to: commit 24e0219f364ed402f9136227756e0f196dc651aa
("remove GIT_DIR env usage in favor of --git-dir")
|
|
We need to ensure $? is set properly for users.
|
|
This hopefully makes the intent of the code clearer, too.
The the HTTP use of the numeric reference for getline
caused problems in Git.pm, already.
|
|
Having a file start with '.' or '-' can be confusing
and for users, so do not allow it.
|
|
We shall ensure links continue working for this.
|