Date | Commit message |
|
I didn't wait until September to do it, this year!
|
|
There's a bunch of leftover "require" and "use" statements we no
longer need and can get rid of, along with some excessive
imports via "use".
IO::Handle usage isn't always obvious, so add comments
describing why a package loads it. Along the same lines,
document the tmpdir support as the reason we depend on
File::Temp 0.19, even though every Perl 5.10.1+ user has it.
While we're at it, favor "use" over "require", since it gives
us extra compile-time checking.
|
|
Remove redundant "r" functions for generating short error
responses. These responses will no longer be cached by clients,
which is probably a good thing since most errors ought to be
transient, anyway. This also fixes error responses for our
cgit wrapper when static files are missing.
|
|
The ref() call could be hitting memory leaks on Perl 5.16.x.
It's been 3 years (2016-12-25) since 292ca34140489da2
("githttpbackend: simplify compatibility code") back when
this project was barely known and probably nobody used
examples/public-inbox.psgi...
|
|
We can save callers the trouble of {-hold} and {-dev_null}
refs as well as the trouble of calling fileno().
|
|
Make it easier to share code between our GitHTTPBackend and Cgit
packages, for now, and possibly other packages in the future.
We can avoid inline_object and anonymous subs at the same
time, reducing per-request memory overhead.
|
|
Callers can now supply an arg to parse_hdr, eliminating the
need for closures to capture local variables.
|
|
Naming $start_cb consistently helps avoid confusing new readers,
and some comments will help with understanding the flow.
|
|
REMOTE_HOST is not set by us (it is the reverse DNS name of
REMOTE_ADDR), and there are few better ways to kill HTTP server
performance than to use standard name resolution APIs like
getnameinfo(3).
|
|
Although we always unlink temporary files, give them a
meaningful name so that we can still make sense
of the pre-unlink name when using lsof(8) or similar
tools on Linux.
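As an illustrative sketch (in Python rather than the project's Perl, and with a hypothetical prefix), the pattern is: create the file with a descriptive name, then unlink it while keeping the descriptor open:

```python
import os
import tempfile

# Hypothetical sketch: give the temp file a descriptive prefix so
# lsof(8) shows a meaningful "(deleted)" name, then unlink it right
# away; the open descriptor keeps the file alive.
def descriptive_tmpfile(prefix="git-http-backend-"):
    fd, path = tempfile.mkstemp(prefix=prefix)
    os.unlink(path)  # name is gone from the filesystem, fd still works
    return fd, path

fd, path = descriptive_tmpfile()
os.write(fd, b"still writable after unlink")
os.close(fd)
```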
|
|
|
|
It may make sense to use PerlIO::mmap or PerlIO::scalar for
DS write buffering with IO::Socket::SSL or similar (since we can't
use MSG_MORE), which means we need to go through userspace
buffering in the common case while still remaining easily
compatible with slow clients.
And it also simplifies GitHTTPBackend slightly.
Maybe it can make sense for HTTP input buffering, too...
|
|
We mainly support git-upload-pack, and maybe somebody uses
git-receive-pack with this. Perhaps other (experimental)
command names are acceptable. But it's unlikely anybody will
want Unicode command names for git services.
|
|
Non-ASCII digits would be interpreted as zeroes when converted
to integers. While we're at it, ensure the Status: code is an
ASCII digit, too; though I would not expect git-http-backend(1)
or cgit(1) to start spewing non-ASCII digits at us.
|
|
These modules are unmaintained upstream at the moment, but I'll
be able to help with the intended maintainer once/if CPAN
ownership is transferred. OTOH, we've been waiting for that
transfer for several years, now...
Changes I intend to make:
* EPOLLEXCLUSIVE for Linux
* remove unused fields wasting memory
* kqueue bugfixes e.g. https://rt.cpan.org/Ticket/Display.html?id=116615
* accept4 support
And some lower priority experiments:
* switch to EV_ONESHOT / EPOLLONESHOT (incompatible changes)
* nginx-style buffering to tmpfile instead of string array
* sendfile off tmpfile buffers
* io_uring maybe?
|
|
We can reduce the configuration needed to run cgit by reusing
the static file handling logic of the dumb git HTTP protocol.
I hate logos and icons, so don't expect public-inbox.org or
80x24.org to ever have those to waste users' bandwidth with :P
But I expect other users to find this useful.
|
|
Reads from git-http-backend(1) could fail or hit EOF prematurely,
so we must be ready for that case.
Furthermore, cgit (and possibly other CGI) uses LF instead
of CRLF, so support those programs, too.
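A minimal sketch of such tolerant parsing (Python; the function name is hypothetical): accept either CRLF or bare LF as the header terminator:

```python
# Sketch of a CGI header parser tolerant of both CRLF and bare LF
# (as cgit emits); returns None while the header block is incomplete.
def parse_cgi_headers(buf):
    for sep in (b"\r\n\r\n", b"\n\n"):
        end = buf.find(sep)
        if end >= 0:
            head, body = buf[:end], buf[end + len(sep):]
            break
    else:
        return None  # incomplete: caller must read more (or hit EOF)
    headers = []
    for line in head.split(b"\n"):
        line = line.rstrip(b"\r")  # tolerate either line ending
        if not line:
            continue
        k, _, v = line.partition(b":")
        headers.append((k.strip().decode(), v.strip().decode()))
    return headers, body

crlf = b"Status: 200 OK\r\nContent-Type: text/plain\r\n\r\nhi"
lf = b"Status: 200 OK\nContent-Type: text/plain\n\nhi"
assert parse_cgi_headers(crlf) == parse_cgi_headers(lf)
```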
|
|
This will be useful for other CGI wrappers we make.
This also fixes a bug with some PSGI servers which did not
present a real IO::Handle in the psgi.input env field.
|
|
This will be useful for reproducibility when mirroring
coderepos and generating diffs.
|
|
Was: ("repobrowse: port patch generation over to qspawn")
We'll be using it for githttpbackend and maybe other things.
|
|
Hopefully this helps people familiarize themselves with
the source code.
|
|
We must detect EOF when reading a POST body with standard PSGI servers.
This does not affect deployments using the standard public-inbox-httpd;
but most smaller inboxes should be able to get away using a generic
PSGI server.
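The required loop shape, sketched in Python against a generic file-like input (names are illustrative): treat an empty read as EOF instead of trusting Content-Length alone:

```python
import io

# Sketch: drain a PSGI/WSGI-style input stream; an empty read means
# EOF and must terminate the loop rather than spin forever.
def read_body(stream, content_length):
    chunks = []
    remaining = content_length
    while remaining > 0:
        buf = stream.read(min(8192, remaining))
        if not buf:  # EOF before Content-Length was satisfied
            break
        chunks.append(buf)
        remaining -= len(buf)
    return b"".join(chunks)

assert read_body(io.BytesIO(b"hello"), 5) == b"hello"
assert read_body(io.BytesIO(b"hel"), 5) == b"hel"  # premature EOF handled
```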
|
|
Using update-copyrights from gnulib.
While we're at it, use the SPDX identifier for AGPL-3.0+ to
ease mechanical processing.
|
|
Fewer returns improve readability, and the diffstat agrees.
|
|
Fewer conditionals means there are fewer code paths to test
and makes things easier to read.
|
|
Use a more meaningful variable name for the Qspawn
object, since this module is the reference for its
use.
|
|
Notes for future developers (myself included) since we
can't assume people can read my mind.
|
|
We do not need to import IO::File into the main programs
since Perl 5.8+ supports literal "undef" for generating
anonymous temporary file handles.
|
|
This was sloppy code; all calls need to be checked
for failure.
|
|
Currently only for git-http-backend use, this allows limiting
the number of spawned processes per-inbox or by group, if there
are multiple large inboxes amidst a sea of small ones.
For example, a "big" limiter could be defined for big inboxes
and shared between multiple repos:
[limiter "big"]
max = 4
[publicinbox "git"]
address = git@vger.kernel.org
mainrepo = /path/to/git.git
; shared limiter with giant:
httpbackendmax = big
[publicinbox "giant"]
address = giant@project.org
mainrepo = /path/to/giant.git
; shared limiter with git:
httpbackendmax = big
; This is a tiny inbox, use the default limiter with 32 slots:
[publicinbox "meta"]
address = meta@public-inbox.org
mainrepo = /path/to/meta.git
|
|
And bump the default limit to 32 so we match git-daemon
behavior. This shall allow us to configure different levels
of concurrency for different repositories and prevent clones
of giant repos from stalling service to small repos.
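A rough sketch of the idea (Python; class and method names are hypothetical): spawn while under the cap, queue the overflow, and start a queued command whenever a running one finishes:

```python
import subprocess
import sys

# Hypothetical limiter: cap concurrent subprocesses per group and
# queue the overflow, mirroring the configurable limits described above.
class Limiter:
    def __init__(self, maximum=32):  # 32 matches git-daemon's default
        self.maximum = maximum
        self.running = 0
        self.queue = []

    def start(self, argv):
        if self.running < self.maximum:
            self.running += 1
            return subprocess.Popen(argv)
        self.queue.append(argv)  # over the cap: defer
        return None

    def finish(self):  # caller invokes this when a child exits
        self.running -= 1
        if self.queue:
            self.running += 1
            return subprocess.Popen(self.queue.pop(0))
        return None

lim = Limiter(maximum=1)
first = lim.start([sys.executable, "-c", "pass"])
second = lim.start([sys.executable, "-c", "pass"])
assert first is not None and second is None  # second was queued
first.wait()
lim.finish().wait()  # queued command runs once a slot frees up
```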
|
|
Hopefully this can reduce memory overhead for people that
use one-shot CGI.
|
|
No need to keep an extra array around for this.
|
|
This will allow cache proxies such as Varnish to avoid
caching data sent by us.
|
|
This means we can still show non-git users a somewhat browseable
URL with a link to the README.html file while allowing git users
to type less when cloning.
All of the following are supported:
git clone https://public-inbox.org/ public-inbox
git clone https://public-inbox.org/public-inbox
git clone https://public-inbox.org/public-inbox.git
torsocks git clone http://ou63pmih66umazou.onion/public-inbox
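One way to sketch that mapping (Python; the helper is hypothetical): trim trailing slashes and an optional ".git" suffix so the path variants collapse to one repository name:

```python
# Hypothetical normalizer: the clone URL variants above all resolve
# to the same repository name once slashes and ".git" are trimmed.
def normalize_repo_path(path_info):
    name = path_info.strip("/")
    if name.endswith(".git"):
        name = name[:-len(".git")]
    return name

for p in ("/public-inbox", "/public-inbox/", "/public-inbox.git"):
    assert normalize_repo_path(p) == "public-inbox"
```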
|
|
No point in forcing users to pass a hashref/object to
get a single git directory.
|
|
Apparently git-http-backend exits with a non-zero
status on shallow clones (due to git-upload-pack),
so there is a to-be-fixed bug in git.git
http://mid.gmane.org/20160621112303.GA21973@dcvr.yhbt.net
http://mid.gmane.org/20160621121041.GA29156@sigill.intra.peff.net
|
|
Plack::Request is unnecessary overhead for this given the
strictness of git-http-backend. Furthermore, having to make
commit 311c2adc8c63 ("avoid Plack::Request parsing body")
to avoid tempfiles should not have been necessary.
|
|
The generic PSGI code needs to avoid resource leaks if
smart cloning is disabled (due to resource constraints).
|
|
This makes more sense as it keeps management of rpipe
nice and neat.
|
|
We need to avoid circular references in the generic PSGI layer;
do it by abusing DESTROY.
|
|
Having an excessive amount of git-pack-objects processes is
dangerous to the health of the server. Queue up process spawning
for long-running responses and serve them sequentially, instead.
|
|
We will have clients dropping connections during long clone
and fetch operations; so do not retain references holding
backend processes once we detect a client has dropped.
|
|
Only check query parameters since there's no useful body
in there.
|
|
This bit is still being redone to support gigantic repos.
|
|
This simplifies the code somewhat; but it could probably
still be made simpler. It will need to support command
queueing so expensive processes can be deferred.
|
|
We can rely entirely on getline + close callbacks
and be compatible with 100% of PSGI servers.
|
|
We will figure out a different way to avoid overloading...
|
|
Mostly stolen from git upstream, these should prevent any caches
such as Varnish or Squid from acting improperly.
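For reference, these are the anti-caching headers git's own http-backend emits (hdr_nocache() in upstream http-backend.c), expressed here as a PSGI/WSGI-style header list with an illustrative helper:

```python
# Anti-caching headers as sent by git-http-backend (hdr_nocache in
# git's http-backend.c), expressed as a PSGI/WSGI-style header list.
NO_CACHE_HEADERS = [
    ("Expires", "Fri, 01 Jan 1980 00:00:00 GMT"),
    ("Pragma", "no-cache"),
    ("Cache-Control", "no-cache, max-age=0, must-revalidate"),
]

def add_no_cache(headers):
    # append to a response's existing header list (illustrative helper)
    return headers + NO_CACHE_HEADERS

resp = add_no_cache([("Content-Type", "application/x-git-upload-pack-result")])
assert ("Pragma", "no-cache") in resp
```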
|
|
We can maintain the client HTTP connection if the process exited
with failure as long as we terminated our own response properly.
|