Date Commit message
2019-01-31 doc: remove completed TODO items
2019-01-31 doc/config: document "replyto" configuration knob
I hate it, but it's necessary to support some mirrors.
2019-01-31 doc/config: user documentation for limiters
I've relied on this feature to keep the VPS behind https://public-inbox.org/git/ from OOM-ing since 2016, so document it to ensure others can make use of low-end servers like I do. More limiters may become configurable for viewvcs and solver functionality (or we continue using the default one).
2019-01-31 config: tiny cleanup to use _array() sub
2019-01-31 qspawn: documentation updates
This will become critical for future changes to display git commits, diffs, and trees. Use "qspawn.wcb" instead of "qspawn.response" to enhance readability.
2019-01-31 inbox: drop psgi.url_scheme requirement from base_url
This will make it easier to make command-line tools from SolverGit.
2019-01-31 viewvcs: support streaming large blobs
Forking off git-cat-file here for streaming large blobs is reasonably efficient, at least no worse than using git-http-backend for serving clones. So let our limiter framework deal with it. git itself isn't great for large files, and AFAIK there are no stable/widely-available mechanisms for reading smaller chunks of giant blobs in git itself. Tested with some giant GPU headers in the Linux kernel.
2019-01-31 solvergit: allow shorter-than-necessary OIDs from user
We can rely on git to disambiguate here, because shorter OIDs can sometimes be unambiguous even if we only resolved the longer one.
2019-01-31 solvergit: allow searching on longer-than-needed OIDs
public-inbox can only index the abbreviated object_ids in emails, not the full or even longer-than-necessary object_ids. So retry failed object_ids if they're longer than 7 hex characters.
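A minimal sketch of the retry idea, written in Python rather than the project's Perl. The function name, the `indexed` set, and the one-character-at-a-time truncation are all assumptions made for illustration; the commit message only says that failed object_ids longer than 7 hex characters are retried.

```python
import re

# From the commit message: only OIDs longer than 7 hex characters are retried.
MIN_ABBREV = 7


def find_indexed_oid(oid, indexed):
    """Return the longest prefix of `oid` found in `indexed`, or None.

    `indexed` is a hypothetical stand-in for the search index: a set of
    abbreviated object IDs as they appeared in emails.
    """
    if not re.fullmatch(r"[0-9a-f]+", oid):
        return None  # not a hex object ID at all
    candidate = oid
    while True:
        if candidate in indexed:
            return candidate
        if len(candidate) <= MIN_ABBREV:
            return None  # nothing shorter is worth retrying
        # Assumption: drop one trailing character per retry.
        candidate = candidate[:-1]
```

With an index holding only `96f1c7f`, a lookup for `96f1c7fabc` fails three times before the 7-character prefix matches.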
2019-01-31 inbox: perform cleanup of Git objects for coderepos
Otherwise, long-running but idle git processes may keep unlinked packs around indefinitely and waste disk space.
2019-01-30 t/config.t: test PublicInbox::Git sharing between inboxes
We need to ensure we don't introduce unnecessary processes and memory usage for mapping multiple inboxes to the same code repos.
2019-01-30 doc/config: document coderepo and css bits
New features ought to be documented.
2019-01-30 solvergit: don't confuse Xapian with ".." in filenames
Xapian will interpret ".." as ranges, even in quoted phrases. So break up words on ".." since punctuation (AFAIK) is not searchable, anyways.
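A rough Python illustration of breaking a filename up on "..", so that no range operator reaches the query parser. The helper name is hypothetical, and treating any run of two or more dots as a separator is an assumption; the actual Perl code may split differently.

```python
import re


def filename_terms(path):
    """Split `path` on runs of two or more dots and drop empty pieces."""
    return [part for part in re.split(r"\.{2,}", path) if part]
```

Single dots are left alone, so `foo.c` stays one term while `a..b.c` becomes `a` and `b.c`.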
2019-01-30 git: use "git rev-parse --git-path"
Using git worktrees was causing t/solver_git.t to fail on me.
2019-01-30 Merge remote-tracking branch 'origin/viewvcs' into master
* origin/viewvcs: (66 commits)
  solvergit: deal with alternative diff prefixes
  solvergit: extract mode from diff headers properly
  solvergit: avoid "Wide character" warnings
  solvergit: do not show full path names to "git apply"
  css/216dark: add comments and tweak highlight colors
  viewvcs: avoid segfault with highlight.pm at shutdown
  solvergit: do not solve blobs twice
  t/check-www-inbox: disable history
  t/check-www-inbox: don't follow mboxes
  t/check-www-inbox: replace IPC::Run with PublicInbox::Spawn
  hval: add src_escape for highlight post-processing
  viewvcs: wire up syntax-highlighting for blobs
  hlmod: disable enclosing <pre> tag
  t/hl_mod: extra check to ensure we escape HTML
  wwwhighlight: read_in_full returns undef on errors
  solver: crank up max patches to 9999
  viewvcs: do not show final error message twice
  qspawn: decode $? for user-friendliness
  solver: reduce "git apply" invocations
  solver: hold patches in temporary directory
  ...
2019-01-30 view: remove unused _msg_date sub
Not needed since commit 956abe9ad5f13a0d1755262be412d6a54fda72e9 ("view: depend on SearchMsg for Message-ID").
2019-01-30 httpd: a few comments about some fields we set
Removing 'psgix.input.buffered' could be a possibility in the future.
2019-01-30 solvergit: deal with alternative diff prefixes
At least, without extra directory levels, since git-diff supports --src-prefix and --dst-prefix, and /git/6aa8857a11/s/ uses it...
2019-01-30 solvergit: extract mode from diff headers properly
grep() won't set $1, so use "=~", instead.
2019-01-30 solvergit: avoid "Wide character" warnings
Just quiet Perl down, since we don't know or care about the encoding of the patch we hand off to git-apply.
2019-01-30 solvergit: do not show full path names to "git apply"
"git apply" will warn about whitespace with the full path of the patch, which will expose the $TMPDIR environment to users over HTTP(S). This change breaks compatibility with git pre-1.8.5, again; but that was released in late-2013; so hopefully everybody is on newer versions.
2019-01-29 css/216dark: add comments and tweak highlight colors
Overkill, but "highlight" supports single-line comments (slc) independently of multi-line comments (com); we'll use the same color for both. We'll also use #0f0 instead of #0ff for "kwb" (keyword class "b") since blue shades are prevalent in <a> links and comments, while green was unused.
2019-01-29 viewvcs: avoid segfault with highlight.pm at shutdown
Proper ordering of destruction seems required to avoid segfaults at shutdown.
2019-01-29 solvergit: do not solve blobs twice
In some cases, a file may ping-pong between blob IDs in the same message when reverts occur. So break out of this early. This doesn't account for different abbreviations, but the limited variations of abbreviations should alleviate the problem.
2019-01-29 t/check-www-inbox: disable history
WWW::Mechanize keeps an infinitely large stack, which was leading to OOM errors on my system.
2019-01-29 t/check-www-inbox: don't follow mboxes
They can be extremely large with no limit, so can lead to OOM errors.
2019-01-29 t/check-www-inbox: replace IPC::Run with PublicInbox::Spawn
Because WWW::Mechanize uses a truckload of memory, fork needs to prepare all that memory for CoW, which ends up bailing with ENOMEM.
2019-01-29 mid: filter out 'y', 'n', and email addresses from references()
Looking at git@vger history, several emails had broken References and In-Reply-To headers that used <y>, <n>, and email addresses as Message-IDs. This was causing too many unrelated messages to be linked together in the same thread.
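A small Python sketch of this kind of filtering, not the actual Perl implementation. The function name is hypothetical, and so is the assumption that the email addresses to reject are supplied by the caller (for example, collected from the message's own address headers).

```python
def filter_references(mids, addresses):
    """Drop bogus Message-IDs ('y', 'n', known email addresses) from `mids`.

    `mids` are Message-IDs with the angle brackets already stripped;
    `addresses` is an assumed, caller-supplied collection of email addresses.
    """
    junk = {"y", "n"}
    known = {addr.lower() for addr in addresses}
    return [
        mid for mid in mids
        if mid not in junk and mid.lower() not in known
    ]
```

Legitimate Message-IDs keep their original order, so threading on the remaining references is unaffected.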
2019-01-28 hval: add src_escape for highlight post-processing
We need to post-process "highlight" output to ensure it doesn't contain odd bytes which cause "wide character" warnings or require odd glyphs in source form.
2019-01-27 viewvcs: wire up syntax-highlighting for blobs
And update 216dark.css to match a color scheme I'm used to; which is fairly minimal and doesn't use all the classes "highlight" provides.
2019-01-27 hlmod: disable enclosing <pre> tag
We already have a <pre> tag in ViewVCS, and nesting <pre> inside the pre-existing <pre> overrides the "white-space:pre" we use to align line numbers.
2019-01-27 t/hl_mod: extra check to ensure we escape HTML
Otherwise, it's open season on our users :<
2019-01-27 wwwhighlight: read_in_full returns undef on errors
The return value of "print" is not undef for Perl IO::Handle.
2019-01-27 solver: crank up max patches to 9999
Might as well, since the only constraint is filesystem space for temporary files for public-inbox-httpd users. -httpd can fairly share work across clients with our use of psgi_qx; and there's a recent patch series in git@vger with 64 patches in sequence.
2019-01-27 viewvcs: do not show final error message twice
SolverGit::ERR already writes the exception to the debug log before calling {user_cb}, so there's no need for viewvcs to append it.
2019-01-27 qspawn: decode $? for user-friendliness
The raw value of $? isn't very useful, generally.
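For illustration, a Python sketch of decoding a 16-bit wait status laid out the way Perl's $? is documented: exit code in the high byte, signal number in the low 7 bits, and a core-dump flag in bit 0x80. The function name and message wording are hypothetical and are not what qspawn prints.

```python
def describe_wait_status(status):
    """Turn a raw wait status (as in Perl's $?) into a readable string."""
    if status == 0:
        return "success"
    sig = status & 0x7F  # low 7 bits: terminating signal, if any
    if sig:
        core = " (core dumped)" if status & 0x80 else ""
        return f"killed by signal {sig}{core}"
    return f"exit status {status >> 8}"  # high byte: exit code
```

A raw value of 256 therefore reads as exit status 1, and 139 as signal 11 with a core dump.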
2019-01-27 solver: reduce "git apply" invocations
"git apply" is capable of applying multiple patches in one invocation, so give it multiple patches on the command-line now that we no longer rely on anonymous file handles to hold patches. This cuts down a 64-patch series on git@vger from ~1s to ~800ms with vfork spawn enabled using Inline::C.
2019-01-27 solver: hold patches in temporary directory
We can avoid bumping up RLIMIT_NOFILE too much by storing patches in a temporary directory. And we can share this top-level directory with our temporary git repository. Since we no longer rely on a working-tree for git, we are free to rearrange the layout and avoid relying on the ".git" convention and relying on "git -C" for chdir. This may also ease porting public-inbox to older systems where git does not support "-C" for chdir.
2019-01-26 solver: rewrite to use Qspawn->psgi_qx and pi-httpd.async
The psgi_qx routine in the now-abandoned "repobrowse" branch allows us to break down blob-solving at each process execution point. It reuses the Qspawn facility for git-http-backend(1), allowing us to limit parallel subprocesses independently of Perl worker count. This is actually 2-3% slower than fully-synchronous execution; but it is fair to other clients as it won't monopolize the server for hundreds of milliseconds (or even seconds) at a time.
2019-01-26 view: swap CRLF for LF in HTML output
It makes no difference to browsers aside from saving a few bytes; and this means we won't have to worry about extra '%0D' showing up in links to solver.
2019-01-26 t/qspawn.t: psgi_qx stderr test
2019-01-22 qspawn: implement psgi_qx
This new asynchronous API will allow us to take advantage of non-blocking I/O from even small commands, as those may still need to wait for slow operations.
2019-01-22 httpd/async: stop running command if client disconnects
If an HTTP client disconnects while we're piping the output of a process to them, break the pipe of the process to reclaim resources as soon as possible.
2019-01-22 qspawn|httpd/async: improve and fix out-of-date comments
2019-01-22 qspawn|getlinebody: support streaming filters
This is intended for wrapping "git show" and "git diff" processes in the future and to prevent them from monopolizing callers. This will allow us to better handle backpressure from gigantic commits.
2019-01-22 qspawn: implement psgi_return and use it for githttpbackend
Was: ("repobrowse: port patch generation over to qspawn") We'll be using it for githttpbackend and maybe other things.
2019-01-22 httpd/async: remove needless sysread wrapper
We don't appear to be using it anywhere.
2019-01-21 t/check-www-inbox: trap SIGINT for File::Temp destruction
Otherwise, temporary GDBM files don't get unlinked when I SIGINT the process.
2019-01-21 hval: split out escape sequences to a separate table
We'll want to handle those escape sequences independently, since "highlight" already does HTML escaping.
2019-01-21 highlight: initial wrapper and PSGI service
I'll probably expose the PSGI service for cgit; but it could be useful to others as well.