public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2018-03-27	githttpbackend: avoid infinite loop on generic PSGI servers
	We must detect EOF when reading a POST body with standard PSGI servers. This does not affect deployments using the standard public-inbox-httpd; but most smaller inboxes should be able to get away using a generic PSGI server.
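The failure mode described above can be illustrated with a minimal, language-neutral sketch in Python (the project itself is Perl, and this is not its code). The function name `read_post_body` and its parameters are hypothetical; the point is only that a read loop driven by the declared length must also stop when the stream returns no data.

```python
import io

def read_post_body(stream, content_length, chunk_size=8192):
    """Read exactly content_length bytes, failing fast on early EOF."""
    chunks = []
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            # EOF before the declared length: without this check the
            # loop would spin forever on a truncated body.
            raise EOFError(f"body truncated, {remaining} bytes missing")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

body = read_post_body(io.BytesIO(b"hello"), 5)
```

Whether a short body should raise or simply end the read is a design choice; raising makes the truncation visible to the caller.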
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2016-12-25	githttpbackend: minor cleanups to improve readability
	Fewer returns improves readability and the diffstat agrees.
2016-12-25	githttpbackend: simplify compatibility code
	Fewer conditionals means there are fewer code paths to test and makes things easier to read.
2016-12-25	githttpbackend: minor readability improvement
	Use a more meaningful variable name for the Qspawn object, since this module is the reference for its use.
2016-12-22	doc: various comments on async handling
	Notes for future developers (myself included) since we can't assume people can read my mind.
2016-11-26	avoid IO::File for anonymous temporary files
	We do not need to import IO::File into the main programs since Perl 5.8+ supports literal "undef" for generating anonymous temporary file handles.
2016-11-26	githttpbackend: error checking for input handling
	This was sloppy code, all calls need to be checked for failure.
2016-07-09	www: add configurable limiters
	Currently only for git-http-backend use, this allows limiting the number of spawned processes per-inbox or by group, if there are multiple large inboxes amidst a sea of small ones.
	For example, a "big" repo limiter could be used for big inboxes and shared between multiple repos:
	[limiter "big"]
		max = 4
	[publicinbox "git"]
		address = git@vger.kernel.org
		mainrepo = /path/to/git.git
		; shared limiter with giant:
		httpbackendmax = big
	[publicinbox "giant"]
		address = giant@project.org
		mainrepo = /path/to/giant.git
		; shared limiter with git:
		httpbackendmax = big
	; This is a tiny inbox, use the default limiter with 32 slots:
	[publicinbox "meta"]
		address = meta@public-inbox.org
		mainrepo = /path/to/meta.git
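A minimal sketch, in Python rather than the project's Perl, of how the sharing in this example might be modeled. The function `build_limiters` and the dictionary layout are hypothetical. The sketch also assumes that inboxes without an explicit limiter all share a single default limiter, which the commit message does not state.

```python
DEFAULT_MAX = 32  # default slot count named in the commit message

def build_limiters(limiter_max, inbox_limiter):
    """Map each inbox to a limiter record.

    Inboxes that name the same limiter get the same record object,
    so they draw from one shared pool of slots.
    """
    shared = {name: {"name": name, "max": limit, "running": 0}
              for name, limit in limiter_max.items()}
    # Assumption: one default limiter shared by all unconfigured inboxes.
    default = {"name": None, "max": DEFAULT_MAX, "running": 0}
    return {inbox: shared[lim] if lim is not None else default
            for inbox, lim in inbox_limiter.items()}

limiters = build_limiters(
    {"big": 4},
    {"git": "big", "giant": "big", "meta": None},
)
```

Here `limiters["git"]` and `limiters["giant"]` are the same object, so a clone of either repository consumes one of the four "big" slots.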
2016-07-09	qspawn: allow configurable limiters
	And bump the default limit to 32 so we match git-daemon behavior. This shall allow us to configure different levels of concurrency for different repositories and prevent clones of giant repos from stalling service to small repos.
2016-07-09	cleanup some unnecessary use/requires
	Hopefully this can reduce memory overhead for people that use one-shot CGI.
2016-07-07	githttpbackend: avoid intermediate array creation from stat
	No need to keep an extra array around for this.
2016-07-03	githttpbackend: match Content-Type of git-http-backend(1)
	This will allow cache proxies such as Varnish to avoid caching data sent by us.
2016-07-01	git: allow cloning from the URL root, too
	This means we can still show non-git users a somewhat browseable URL with a link to the README.html file while allowing git users to type less when cloning. All of the following are supported:
		git clone https://public-inbox.org/ public-inbox
		git clone https://public-inbox.org/public-inbox
		git clone https://public-inbox.org/public-inbox.git
		torsocks git clone http://ou63pmih66umazou.onion/public-inbox
2016-07-01	githttpbackend: allow git to be a regular scalar string
	No point in forcing users to pass a hashref/object to get a single git directory.
2016-06-24	githttpbackend: shallow clone workaround
	Apparently git-http-backend exits with a non-zero status on shallow clones (due to git-upload-pack), so there is a to-be-fixed bug in git.git:
		http://mid.gmane.org/20160621112303.GA21973@dcvr.yhbt.net
		http://mid.gmane.org/20160621121041.GA29156@sigill.intra.peff.net
2016-05-30	git-http-backend: remove dependency on Plack::Request
	Plack::Request is unnecessary overhead for this given the strictness of git-http-backend. Furthermore, having to make commit 311c2adc8c63 ("avoid Plack::Request parsing body") to avoid tempfiles should not have been necessary.
2016-05-27	git-http-backend: close pipe for generic PSGI on errors
	The generic PSGI code needs to avoid resource leaks if smart cloning is disabled (due to resource constraints).
2016-05-27	git-http-backend: move real close to GetlineBody
	This makes more sense as it keeps management of rpipe nice and neat.
2016-05-27	git-http-backend: fix aborts for generic PSGI clone
	We need to avoid circular references in the generic PSGI layer, do it by abusing DESTROY.
2016-05-24	git-http-backend: use qspawn to limit running processes
	Having an excessive amount of git-pack-objects processes is dangerous to the health of the server. Queue up process spawning for long-running responses and serve them sequentially, instead.
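The queueing idea can be sketched as follows. This is an illustrative Python model, not the project's qspawn code; the class `SpawnLimiter` and its methods are hypothetical names, and jobs are plain callables standing in for process spawns.

```python
from collections import deque

class SpawnLimiter:
    """Run at most max_running jobs at once; queue the rest in order."""

    def __init__(self, max_running):
        self.max_running = max_running
        self.running = 0
        self.queue = deque()

    def start(self, job):
        """Run job now if a slot is free, otherwise queue it."""
        if self.running < self.max_running:
            self.running += 1
            job()
        else:
            self.queue.append(job)

    def finish(self):
        """Call when a job's process exits; starts the next queued job."""
        self.running -= 1
        if self.queue:
            self.running += 1
            self.queue.popleft()()

started = []
limiter = SpawnLimiter(max_running=1)
limiter.start(lambda: started.append("clone-a"))
limiter.start(lambda: started.append("clone-b"))  # queued, slot is taken
limiter.finish()                                  # clone-a exits, clone-b runs
```

With a limit of one, the second job only starts once the first reports completion, which is the sequential behavior the commit message describes.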
2016-05-23	git-http-backend: refactor to support cleanup
	We will have clients dropping connections during long clone and fetch operations; so do not retain references holding backend processes once we detect a client has dropped.
2016-05-23	git-http-backend: avoid Plack::Request parsing body
	Only check query parameters since there's no useful body in there.
2016-05-23	git-http-backend: cleanup the vestigial process limiter code
	This bit is still being redone to support gigantic repos.
2016-05-22	git-http-backend: switch to async_pass
	This simplifies the code somewhat; but it could probably still be made simpler. It will still need to support command queueing so expensive processes can be queued up.
2016-05-22	git-http-backend: simplify dumb serving
	We can rely entirely on getline + close callbacks and be compatible with 100% of PSGI servers.
2016-05-22	git-http-backend: remove process limit
	We will figure out a different way to avoid overloading...
2016-05-15	git-http-backend: set cache headers
	Mostly stolen from git upstream, these should prevent any caches such as varnish or squid from acting improperly.
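For orientation, here is a Python sketch of the two header sets that git's own http-backend is understood to emit: one that forbids caching and one that allows caching for a year. The header values are reproduced from memory of git's `http-backend.c` and should be checked against the git version in use; the function names are hypothetical, and nothing here is taken from public-inbox itself.

```python
from email.utils import formatdate

ONE_YEAR = 31536000  # seconds

def no_cache_headers():
    """Headers for responses that must never be served from a cache."""
    return [
        ("Expires", "Fri, 01 Jan 1980 00:00:00 GMT"),
        ("Pragma", "no-cache"),
        ("Cache-Control", "no-cache, max-age=0, must-revalidate"),
    ]

def cache_forever_headers(now):
    """Headers for immutable content; now is a Unix timestamp."""
    return [
        ("Date", formatdate(now, usegmt=True)),
        ("Expires", formatdate(now + ONE_YEAR, usegmt=True)),
        ("Cache-Control", "public, max-age=31536000"),
    ]

forever = dict(cache_forever_headers(0))
```

The split reflects the kind of content served: files addressed by object or pack hash never change and can be cached indefinitely, while ref listings change with every update.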
2016-05-12	git-http-backend: do not drop connection on successful finish
	We can maintain the client HTTP connection if the process exited with failure as long as we terminated our own response properly.
2016-05-03	git-http-backend: reduce memory use for clone/fetch
	When serving large static files or large packs, we may call Danga::Socket::write directly to queue up callbacks to resume reading and defer firing them until the socket is writable. This prevents us from scheduling writes or buffering until we know the socket is writable and prevents needless buffering by Danga::Socket when faced with slow clients. For smart clones, this comes at the cost of throttling the output of "git pack-objects" to the speed of the client connection. This is probably not ideal, but is the behavior of the standard git-daemon, too; and is preferable to running the httpd out-of-memory. Buffering to the filesystem may be an option in the future...
2016-05-01	git-http-backend: use real lseek for Content-Range
	Since we use sysread, we must use sysseek for symmetry although PerlIO may be doing a real lseek with "seek", anyways. Fixes: 310819ea86ac ("git-http-backend: favor sysread for regular files")
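The symmetry argument can be illustrated in Python, where `os.lseek` and `os.read` play the role of Perl's `sysseek` and `sysread`: both operate directly on the file descriptor, so no userspace buffer can get out of step with the file offset. The function `read_range` is a hypothetical name used for this sketch only.

```python
import os
import tempfile

def read_range(path, first, last, chunk_size=65536):
    """Yield bytes first..last (inclusive) using unbuffered syscalls only."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, first, os.SEEK_SET)  # real lseek(2), no stdio buffer
        remaining = last - first + 1
        while remaining > 0:
            buf = os.read(fd, min(chunk_size, remaining))
            if not buf:
                break  # file is shorter than the requested range
            remaining -= len(buf)
            yield buf
    finally:
        os.close(fd)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
part = b"".join(read_range(tmp.name, 2, 5))
os.unlink(tmp.name)
```

Mixing a buffered seek with unbuffered reads (or the reverse) risks reading from a stale position, which is why the two calls are kept at the same layer.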
2016-04-29	http: improve error handling for aborted responses
	We need to abort connections properly if a response is prematurely truncated. This includes problems with serving static files, since a clumsy admin or broken FS could return truncated responses and inadvertently leave a client waiting (since the client saw "Content-Length" in the header and expected a certain length).
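A small Python sketch of the check this implies: count the bytes actually produced and treat a shortfall against the advertised Content-Length as a reason to abort the connection. The generator `stream_exact` is hypothetical, and using `ConnectionAbortedError` as the signal is a choice made for this example.

```python
def stream_exact(chunks, content_length):
    """Yield chunks, raising if the total differs from content_length."""
    sent = 0
    for chunk in chunks:
        sent += len(chunk)
        yield chunk
    if sent != content_length:
        # The client was promised content_length bytes; the caller should
        # close the connection instead of keeping it alive.
        raise ConnectionAbortedError(f"sent {sent} of {content_length} bytes")

complete = list(stream_exact([b"ab", b"c"], 3))
```

Closing the socket lets the client detect the truncation immediately rather than waiting for bytes that will never arrive.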
2016-04-29	git-http-backend: check EINTR as well as EAGAIN
	The blocking PSGI server may cause EINTR to be hit, here.
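In Python terms, the pattern looks roughly like the sketch below: both error codes mean "nothing went wrong, try again later" rather than a fatal read failure. The function `try_read` is a hypothetical name. Note that Python 3.5 and later retry most EINTR cases automatically, so the EINTR branch is shown mainly to mirror the Perl logic.

```python
import errno
import os

def try_read(fd, size=65536):
    """Return bytes read, b"" at EOF, or None if the caller should retry."""
    try:
        return os.read(fd, size)
    except OSError as e:
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK, errno.EINTR):
            return None  # not fatal: wait for readability and retry
        raise

r, w = os.pipe()
os.set_blocking(r, False)
first = try_read(r)    # empty non-blocking pipe: EAGAIN
os.write(w, b"data")
second = try_read(r)
os.close(w)
third = try_read(r)    # writer closed: EOF
os.close(r)
```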
2016-04-28	githttpbackend: clamp to one smart HTTP request at-a-time
	Server admins may not be able to afford to have too many git-pack-objects processes running at once. Since PSGI HTTP servers should already be configured to use multiple processes for other requests; limit concurrency of smart backends to one; and fall back to dumb responses if we're already generating a pack.
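The clamp-and-fall-back decision can be sketched in a few lines of Python. The class `SmartSlot` and the strings it returns are hypothetical; the sketch only models the choice between the two response modes within a single process, not the serving itself.

```python
class SmartSlot:
    """Allow one smart HTTP request at a time within this process."""

    def __init__(self):
        self.busy = False

    def choose(self):
        """Take the slot and return "smart" if free, else return "dumb"."""
        if self.busy:
            return "dumb"
        self.busy = True
        return "smart"

    def release(self):
        """Call once the smart response (and its pack generation) is done."""
        self.busy = False

slot = SmartSlot()
modes = [slot.choose(), slot.choose()]
slot.release()
modes.append(slot.choose())
```

The second caller is not made to wait: it is served immediately by the cheaper dumb protocol, which keeps the process responsive while a pack is being generated.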
2016-04-28	githttpbackend: fall back to dumb if smart HTTP is off
	Using http.getanyfile still keeps the http-backend process alive, so it's better to break out of that process and handle serving entirely within the HTTP server.
2016-04-25	githttpbackend: require IO::File explicitly
	This is used all over the place, but may not be in the future, so ensure we explicitly load it ourselves.
2016-03-05	git-http-backend: favor sysread for regular files
	We do not need line buffering, here; so favor sysread to bypass extra copies which may be done by normal read.
2016-03-01	httpd: document pi-httpd.async as totally unstable
	We'll have to use it some more before deciding it is a public interface. I do hope for it to be a usable public interface one day for other users.
2016-02-29	git-http-backend: fixes for mod_perl
	Apache2 mod_perl does not give us a real file handle, so we must translate that before giving that to git-http-backend(1). Also, parse the Status: header correctly for errors since we failed to set %ENV properly before the previous fix for SpawnPP.
2016-02-29	git-http-backend: stricter parsing of CRLF
	It is not needed as we know git uses CRLF termination.
2016-02-27	git: use built-in spawn implementation for vfork
	This should reduce overhead of spawning git processes from our long-running httpd and nntpd servers.
2016-02-26	git-http-backend: extract input_to_file function
	This will allow us to more easily read and test later.
2016-02-25	git-http-backend: avoid multi-arg print statements
	Even with output buffering disabled via IO::Handle::autoflush, writes are not atomic unless it is a single argument passed to "print". Multiple arguments to "print" will show up as multiple calls to write(2) instead of a single, atomic writev(2).
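The same principle carries over to other languages: assemble the pieces in memory first, then hand the kernel one buffer. A Python sketch follows, with `write_record` as a hypothetical helper; the commit itself concerns Perl's `print`.

```python
import os

def write_record(fd, *parts):
    """Join parts first so the kernel sees one write(2), not one per part."""
    return os.write(fd, b"".join(parts))

r, w = os.pipe()
n = write_record(w, b"Status: 200 OK\r\n", b"\r\n", b"body")
os.close(w)
received = os.read(r, 1024)
os.close(r)
```

A single write of a small buffer to a pipe is delivered as a unit, whereas separate writes can interleave with output from other writers sharing the same descriptor.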
2016-02-25	git-http-backend: start async API for streaming
	git-http-backend may take a while, ensure we can process other requests while waiting on it. We currently do this via Danga::Socket in public-inbox-httpd; but avoid exposing this internal implementation detail to the PSGI interface and instead only expose a callback via: $env->{'pi-httpd.async'}
2016-02-25	git-http-backend: start refactoring to use callback
	Designing for asynchronous, non-blocking operations makes adapting for synchronous, blocking operation easy. Going the other way around is not easy, so do it now and allow us to be more easily adapted for non-blocking use in the next commit...
2016-02-25	use pipe for git-http-backend output
	This allows us to stream the output to the client without buffering everything up-front. Next, we'll let Danga::Socket (or AE in the future) wait for readability.
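As a rough illustration of streaming through a pipe, the Python sketch below yields a child's output chunk by chunk instead of collecting it all first. The generator `stream_command` is a hypothetical name, and the child here is a trivial Python one-liner standing in for git-http-backend.

```python
import subprocess
import sys

def stream_command(argv, chunk_size=8192):
    """Yield the child's stdout in chunks as they become available."""
    # bufsize=0 gives an unbuffered pipe object, so read() returns as soon
    # as any data is available instead of waiting for a full chunk.
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, bufsize=0)
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:
                break  # EOF: the child closed its end of the pipe
            yield chunk
    finally:
        proc.stdout.close()
        proc.wait()

output = b"".join(stream_command([sys.executable, "-c", "print('hi')"]))
```

A blocking read like this still ties up the caller while waiting; an event loop watching the pipe for readability removes that limitation.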
2016-02-25	remove direct CGI.pm support
	Relying on Plack::Handler::CGI is much easier for long-term maintenance and development. Nowadays, we even include our own httpd implementation to facilitate easier deployment with PSGI/Plack.
2016-02-07	support smart HTTP cloning
	This requires POST and (small file) upload support from the PSGI/Plack web server. CGI.pm is currently not supported with this feature. We'll serve everything git can handle by default for performance in the general case. To avoid introducing cognitive overhead for sysadmins managing existing HTTP backends, we do not introduce new configuration directives. Thus, setting http.uploadpack=false in the relevant git config file for each public-inbox (ssoma) git repo will disable smart HTTP for CPU/memory-constrained systems. Technically we could support http.receivepack to allow posting messages to a public-inbox over HTTP(S), but that breaks the public-inbox model of encouraging users to Cc: everyone. Again, we encourage users to Cc: everyone to reduce the chance of a public-inbox becoming a centralized point of failure/censorship.